set hive.exec.compress.output=false;

2. Create an external table that points at the raw data:
CREATE EXTERNAL TABLE IF NOT EXISTS `external_table` (
  column1 int,
  column2 string,
  time int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 'hdfs://$hadoopHost/$hadoopDirectory';

Replace $hadoopHost and $hadoopDirectory accordingly.

3. Create a native Hive table to hold the data you want:
CREATE TABLE IF NOT EXISTS `tmp_table` (
  column1 int,
  column2 string,
  time int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

4. Copy the data that matches your condition into the new table:
INSERT OVERWRITE TABLE `tmp_table`
SELECT column1, column2, time
FROM external_table
WHERE time >= 111111 AND column2 = 'some-column-value';

5. Log in to the Hadoop namenode and check the data:
hadoop fs -ls /user/hive/warehouse/default/tmp_table

Assuming you use the "default" database, you will see a file named "000000_0". Since it is a plain text file, you can view it with any editor, such as vim. You can then copy the file to your local disk and move it around as you wish.
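To pull that result file out of HDFS, a minimal sketch (the destination path `/tmp/tmp_table.tsv` is illustrative; the source path assumes the "default" database layout shown above — adjust both to your setup):

```shell
# Copy the result file from HDFS to the local filesystem
hadoop fs -get /user/hive/warehouse/default/tmp_table/000000_0 /tmp/tmp_table.tsv
```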
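As an aside, steps 4 and 5 can often be collapsed: since Hive 0.11, INSERT OVERWRITE LOCAL DIRECTORY accepts a ROW FORMAT clause and can write the filtered rows straight to a directory on the client machine, skipping the intermediate table. A sketch, assuming the same filter as step 4 (the output directory `/tmp/tmp_table_export` is illustrative):

```sql
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/tmp_table_export'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT column1, column2, time
FROM external_table
WHERE time >= 111111 AND column2 = 'some-column-value';
```

Note that OVERWRITE will clobber any existing contents of the target directory, so point it at a dedicated export path.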