We know that Hive will create a partition with value “__HIVE_DEFAULT_PARTITION__” when running in dynamic partition mode and the value for the partition key is “null” value. However, depending on on the partition column type, you might not be able to drop those partitions due to restrictions in the Hive code. I don’t know whether this is a bug or not, but I have found a workaround. So what’s the scenario? Consider we have the following table setup:
hive> desc test;
OK
col1                    string                                      
p1                      int                                         
                 
# Partition Information          
# col_name              data_type               comment             
                 
p1                      int                                         
Time taken: 0.188 seconds, Fetched: 7 row(s)

hive> show partitions test;                                                                                                
OK
p1=100
p1=__HIVE_DEFAULT_PARTITION__
Time taken: 0.164 seconds, Fetched: 2 row(s)
If you try to drop the default partition, you will get the following error:
hive> ALTER TABLE test DROP partition (p1 = '__HIVE_DEFAULT_PARTITION__');
FAILED: SemanticException Unexpected unknown partitions for (p1 = null)

hive> ALTER TABLE test DROP partition (p1 = __HIVE_DEFAULT_PARTITION__);
FAILED: ParseException line 1:64 missing \' at ')' near ''

hive> ALTER TABLE test DROP partition (p1 = null);
NoViableAltException([email protected][233:1: constant : ( Number | dateLiteral | timestampLiteral | StringLiteral | stringLiteralSequence | BigintLiteral | SmallintLiteral | TinyintLiteral $
 DecimalLiteral | charSetStringLiteral | booleanValue );])
        at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
        at org.antlr.runtime.DFA.predict(DFA.java:116)
        at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.constant(HiveParser_IdentifiersParser.java:5945)
        at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.dropPartitionVal(HiveParser_IdentifiersParser.java:10735)
        at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.dropPartitionSpec(HiveParser_IdentifiersParser.java:10580)
        at org.apache.hadoop.hive.ql.parse.HiveParser.dropPartitionSpec(HiveParser.java:44526)
        at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatementSuffixDropPartitions(HiveParser.java:11195)
        at org.apache.hadoop.hive.ql.parse.HiveParser.alterTableStatementSuffix(HiveParser.java:7748)
        at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatement(HiveParser.java:6960)
        at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2409)
        at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1586)
        at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1062)
        at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199)
        at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:394)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:306)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1111)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1159)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1048)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1038)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
FAILED: ParseException line 1:38 cannot recognize input near 'null' ')' '' in constant
In order to by pass this problem, we need to make use of the fact that the same query will work if the partition column’s type is “string”. So perform the following as the workaround:
-- update the column to be "string"
ALTER TABLE test PARTITION COLUMN (p1 string);

-- remove the default partition
ALTER TABLE test DROP PARTITION (p1 = '__HIVE_DEFAULT_PARTITION__');

-- then revert the column back to "int" type
ALTER TABLE test PARTITION COLUMN (p1 int);
Of course, to be absolutely safe, a back up of existing table should be performed prior to applying this workaround, either use “distcp” to copy to another cluster, or simply create a same structured table and then use “INSERT OVERWRITE … SELECT * FROM …” query to copy to another table. After applying the workaround, compare the two tables and if all good, the newly created table can be dropped.

5 Comments

Leave a Reply

Your email address will not be published. Required fields are marked *