Impala nested inline view produces incorrect result when referencing the same column implicitly

This article explains how to work around the Impala bug IMPALA-2643. To see what the issue is, have a look at the test case below: We expect the two returned numbers to be the same, with the value “2”. This happens when a nested inline view references the same column name implicitly. …

SELECT * query triggered Map Only job under CDH5.5.1, but not under CDH5.3.x

This article explains why a map-only job was launched when running a simple “SELECT * FROM <table>” query in CDH5.5.x, while the same query did not need any MapReduce task in CDH5.3.x. After upgrading to CDH5.5.1, while running a “SELECT * FROM <table>” query from Beeline, a map-only job was …

Avro Data Types

This article explains the data types supported by Avro. Avro currently supports the following primitive types: More details can be found on Avro’s Apache documentation page. If you intend to create a column with TINYINT or SMALLINT for an Avro table, you will get an Undefined name: “TINYINT” error in CDH5.4.x, …

Impala query memory estimates are wrong for a SELECT query with LIMIT clause

This article explains the workarounds to bypass the Impala memory estimation issue when running a simple “SELECT * FROM table LIMIT 10” query. A request memory estimate error is thrown if the Impala query has a “LIMIT” clause. The same query works properly if the “LIMIT” clause is …

Sqoop Hive Import Failed After Upgrading to CDH5.4.x or CDH5.5.x

This article explains the root cause of the Sqoop Hive Import failure after upgrading to CDH5.4.x or CDH5.5.x and the solution to fix the issue. After upgrading to CDH5.5.x from CDH5.3.x, Sqoop Hive Import got the following error: After upgrading to CDH5.4.x from CDH5.3.x, Sqoop Hive Import got the following error: …