CREATE TABLE test (a int) PARTITIONED BY (p string) STORED AS TEXTFILE;

Then I issued a dynamic partitioning query:
INSERT OVERWRITE TABLE test PARTITION (p) SELECT COUNT(1), 'p1' FROM sample_table;

Then, in another terminal, I checked the locking status:
SHOW LOCKS test;

And the result is:
[email protected] SHARED

Finally in another terminal, I issued the same update again:
INSERT OVERWRITE TABLE test PARTITION (p) SELECT COUNT(1), 'p1' FROM sample_table;

Both INSERT OVERWRITE statements ran concurrently, which is not correct: the first statement holds only a SHARED lock on the table, so the second one is not blocked. This can easily lead to data corruption if you are not careful. I have confirmed this behaviour under CDH5.3 and the latest CDH5.4, and I have filed a JIRA issue for the engineering team to fix. There is no simple workaround at this stage, so just be mindful of this issue until the next update comes.
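For anyone who wants to look more closely at the locks involved, the statements below are a minimal sketch, not a fix. It reuses the test and sample_table tables from above and assumes that p='p1' is the partition of interest. Hive's locking documentation describes an exclusive lock on the target partition for a static-partition insert. Whether the CDH versions mentioned here behave that way is an assumption that should be checked with SHOW LOCKS while the statement runs.

```sql
-- Show additional detail about the locks currently held on the table.
SHOW LOCKS test EXTENDED;

-- For comparison only: the same insert written with a static partition,
-- so the target partition is known when the query is compiled.
INSERT OVERWRITE TABLE test PARTITION (p='p1') SELECT COUNT(1) FROM sample_table;

-- While the statement above is running, check from another terminal
-- which lock, if any, is held on that specific partition.
SHOW LOCKS test PARTITION (p='p1');
```

Comparing the two SHOW LOCKS results should indicate whether the missing exclusive lock is specific to the dynamic-partition form of the statement.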