Month: September 2016

How to confirm Dynamic Partition Pruning works in Impala

This article explains how to confirm that Impala’s new Dynamic Partition Pruning feature is effective in CDH5.7.x. Dynamic Partition Pruning was introduced in CDH5.7.x / Impala 2.5: information about the partitions is collected at run time, and Impala prunes unnecessary partitions in ways that were impractical …


How to import BLOB data into HBase directly using Sqoop

Recently I was dealing with an issue where I was not able to import BLOB data correctly into HBase from an Oracle database. All other columns were imported successfully; however, the BLOB column failed to appear in the HBase table. My test table has three columns: ID:int, DATA_S:VARCHAR2 and DATA_B:BLOB. The following …


How to redirect parquet’s log message into STDERR rather than STDOUT

This article explains the steps needed to redirect Parquet’s log messages from STDOUT to STDERR, so that the Hive output is not polluted should the user want to capture the query result on the command line. In Parquet’s code base, logging information is written directly to STDOUT, …


Beeline options need to be placed before “-e” option

Recently I needed to deal with an issue where users tried to specify “--incremental=true” as a beeline command line option, because beeline failed with an OutOfMemory error when fetching results from HiveServer2. This option should help with the OOM problem; however, it did not in this particular case. …


My new Snowflake Blog is now live. I will no longer be updating this blog, but will continue with new content in the Snowflake world!