CREATE TABLE test_multi (a string, b string) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES ( "field.delim"="#|", "collection.delim"=":", "mapkey.delim"="@" );Query all columns is OK:
select * from test_multi; +---------------+---------------+ | test_multi.a | test_multi.b | +---------------+---------------+ | eric | test more | +---------------+---------------+ 1 row selected (1.58 seconds)However, query a single column will get the following error in HS2:
select a from test_multi; Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe not found at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:334) at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:352) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:126) ... 22 more Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe not found at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1953) at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:304) ... 24 moreThis happens in CDh5.3.3, which ships Hive 0.13, and I am not sure whether it also applies to CDh5.4.x and CDh5.5.x. This is caused by class org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe was not loaded to Hive’s UDF list when HiveServer2 starts up. We need to copy the JAR file that contains this class to Hive’s AUX directory. Steps as follows: 1) Locate the AUX directory for HiveServer2, if you don’t have one create one and update Hive’s configuration through Cloudera Manager. If you don’t use Cloudera Manager, simply create a directory on HiveServer2 host, in my case is /hive/jars. 2) Create a symlink to file /opt/cloudera/parcels/CDH/lib/hive/lib/hive-contrib.jar (or if not using Cloudera Manager, /usr/lib/hive/lib/hive-contrib.jar) from within /hive/jars
ln -s /opt/cloudera/parcels/CDH/lib/hive/lib/hive-contrib.jar /hive/jars/hive-contrib.jar3) If you don’t use Cloudera Manager, add the following:
to hive-site.xml for HiveServer2. If you use Cloudera Manager, simply go to step 4. 4) Restart HiveServer2. This should be able to remove the error we saw earlier in the post and get Hive query working. Hope this helps.
Try this :https://github.com/sanjivsingh/FixedLengthAndDelimitedSerde
– Supported both fixed length and delimited data serialization/deserialization
– It support both case in single definition.
i have error like that. should i configure in all host in cluster?
Thanks for visiting my site. You just need to set the config change for HiveServer2, and you don’t need to update all hosts in the cluster.
Hope that helps.
Other hosts do have hive-site.xml file under /etc/hive/conf directory, but they would be just the gateways for Hive service, meaning that they are just Hive client. The AUX JARs are only applicable to HiveServer2 service, not the clients. So you only need to update it on the HiveServer2 host(s).
Hope that makes sense.