Beeline options need to be placed before the “-e” option

Recently I dealt with an issue where users passed “--incremental=true” as a Beeline command-line option, because Beeline was failing with an OutOfMemory error while fetching results from HiveServer2. This option should help with the OOM problem, but it did not in this particular case. The command was run as below:
beeline --hiveconf mapred.job.queue.name=queue_name --silent=true \
-u 'jdbc:hive2://:10000/default;principal=hive/@' \
--outputformat=csv2 --silent=true -e 'select * from table_name' \
--incremental=true > output.csv
It failed with the following error:
org.apache.thrift.TException: Error in calling method FetchResults
    at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1271)
    at com.sun.proxy.$Proxy8.FetchResults(Unknown Source)
    at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:363)
    at org.apache.hive.beeline.BufferedRows.<init>(BufferedRows.java:42)
    at org.apache.hive.beeline.BeeLine.print(BeeLine.java:1756)
    at org.apache.hive.beeline.Commands.execute(Commands.java:826)
    at org.apache.hive.beeline.Commands.sql(Commands.java:670)
    at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:974)
    at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:716)
    at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:753)
    at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:480)
    at org.apache.hive.beeline.BeeLine.main(BeeLine.java:463)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.lang.Double.valueOf(Double.java:521)
    at org.apache.hive.service.cli.thrift.TDoubleColumn$TDoubleColumnStandardScheme.read(TDoubleColumn.java:454)
    at org.apache.hive.service.cli.thrift.TDoubleColumn$TDoubleColumnStandardScheme.read(TDoubleColumn.java:433)
    at org.apache.hive.service.cli.thrift.TDoubleColumn.read(TDoubleColumn.java:367)
    at org.apache.hive.service.cli.thrift.TColumn.standardSchemeReadValue(TColumn.java:318)
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:224)
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213)
    at org.apache.thrift.TUnion.read(TUnion.java:138)
    at org.apache.hive.service.cli.thrift.TRowSet$TRowSetStandardScheme.read(TRowSet.java:573)
    at org.apache.hive.service.cli.thrift.TRowSet$TRowSetStandardScheme.read(TRowSet.java:525)
    at org.apache.hive.service.cli.thrift.TRowSet.read(TRowSet.java:451)
    at org.apache.hive.service.cli.thrift.TFetchResultsResp$TFetchResultsRespStandardScheme.read(TFetchResultsResp.java:518)
    at org.apache.hive.service.cli.thrift.TFetchResultsResp$TFetchResultsRespStandardScheme.read(TFetchResultsResp.java:486)
    at org.apache.hive.service.cli.thrift.TFetchResultsResp.read(TFetchResultsResp.java:408)
    at org.apache.hive.service.cli.thrift.TCLIService$FetchResults_result$FetchResults_resultStandardScheme.read(TCLIService.java:13171)
    at org.apache.hive.service.cli.thrift.TCLIService$FetchResults_result$FetchResults_resultStandardScheme.read(TCLIService.java:13156)
    at org.apache.hive.service.cli.thrift.TCLIService$FetchResults_result.read(TCLIService.java:13103)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_FetchResults(TCLIService.java:501)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.FetchResults(TCLIService.java:488)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1263)
    at com.sun.proxy.$Proxy8.FetchResults(Unknown Source)
    at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:363)
    at org.apache.hive.beeline.BufferedRows.<init>(BufferedRows.java:42)
    at org.apache.hive.beeline.BeeLine.print(BeeLine.java:1756)
    at org.apache.hive.beeline.Commands.execute(Commands.java:826)
    at org.apache.hive.beeline.Commands.sql(Commands.java:670)
    at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:974)
    at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:716)
Error: org.apache.thrift.TApplicationException: CloseOperation failed: out of sequence response (state=08S01,code=0)
Error: Error while cleaning up the server resources (state=,code=0)
From this stack trace, we can see that the class “BufferedRows” was used. If “--incremental=true” had taken effect, Beeline would have used the class “IncrementalRows” instead. This confirmed that the “--incremental=true” option was not applied. After further experimentation, I figured out that “--incremental=true” needs to go before the “-e” option for it to take effect. Running the command as below resolved the issue:
beeline --hiveconf mapred.job.queue.name=queue_name --silent=true \
-u 'jdbc:hive2://:10000/default;principal=hive/@' \
--outputformat=csv2 --silent=true --incremental=true \
-e 'select * from table_name' > output.csv
I did not look into the details of why, but this should help anyone who runs into a similar issue.
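
Because a misplaced option was simply not applied in the case above, a small pre-flight check in a wrapper script could catch the mistake before the query runs. The following is only a minimal sketch and not part of Beeline: `check_beeline_arg_order` is a hypothetical helper name, and it assumes that any argument starting with `--` that follows `-e <query>` is a misplaced option.

```shell
# Hypothetical helper (not shipped with Beeline).
# Returns 1 if any "--" option appears after "-e <query>" in the argument list,
# and 0 otherwise.
check_beeline_arg_order() {
  local seen_e=0 skip_next=0 arg
  for arg in "$@"; do
    if [ "$skip_next" -eq 1 ]; then
      # This argument is the query string that belongs to -e; do not inspect it.
      skip_next=0
      continue
    fi
    if [ "$arg" = "-e" ]; then
      seen_e=1
      skip_next=1
      continue
    fi
    # Assumption: anything starting with "--" after -e is a misplaced option.
    if [ "$seen_e" -eq 1 ] && [ "${arg#--}" != "$arg" ]; then
      echo "option placed after -e may not be applied: $arg" >&2
      return 1
    fi
  done
  return 0
}
```

A wrapper script could call it with the same arguments it is about to pass on, for example `check_beeline_arg_order "$@" && beeline "$@"`.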
