Enabling Snappy Compression Support in Hadoop 2.4 under CentOS 6.3


After Hadoop is installed manually from the binary package on CentOS, Snappy compression is not supported by default, and a few extra steps are required to make Snappy work in Hadoop. They are straightforward, but not obvious if you have not done them before.

First, if you are using a 64-bit version of CentOS, you need to replace the default native Hadoop library that ships with Hadoop, because it is compiled for 32-bit only. You can try to download it from here, and then put it under the “$HADOOP_HOME/lib/native” directory. If there is a symlink, you can just replace the symlink with the actual file. If it still doesn’t work, you might need to compile the library yourself on your machine, which is out of scope for this post; you can follow the instructions on this site.

Second, you need to install the native Snappy library for your operating system (CentOS 6.3 in my case):
$ sudo yum install snappy snappy-devel
This creates a file called libsnappy.so under the /usr/lib64 directory. We need to create a link to this file under “$HADOOP_HOME/lib/native”:
$ sudo ln -s /usr/lib64/libsnappy.so $HADOOP_HOME/lib/native/libsnappy.so
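Both native-library steps can be sanity-checked from the shell before touching any configuration. The sketch below is only an illustration: `elf_class` and `link_ok` are hypothetical helper names, `libhadoop.so` is assumed to be the file name of the native Hadoop library in your lib/native directory, and /usr/local/hadoop is used as a fallback when $HADOOP_HOME is not set.

```shell
# Hypothetical helper: print 32 or 64 depending on the ELF class of a library.
# Byte 5 of an ELF header (offset 4, EI_CLASS) is 1 for 32-bit and 2 for 64-bit.
elf_class() {
    case "$(od -An -tu1 -j4 -N1 "$1" | tr -d '[:space:]')" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown ;;
    esac
}

# Hypothetical helper: succeed only if the path is a symlink whose target exists.
link_ok() {
    [ -L "$1" ] && [ -e "$1" ]
}

# Assumed file names; adjust to what is actually in your lib/native directory.
native_dir="${HADOOP_HOME:-/usr/local/hadoop}/lib/native"
if [ -e "$native_dir/libhadoop.so" ]; then
    echo "libhadoop.so is $(elf_class "$native_dir/libhadoop.so")-bit"
fi
if link_ok "$native_dir/libsnappy.so"; then
    echo "libsnappy.so link resolves"
else
    echo "libsnappy.so link is missing or dangling"
fi
```

On a 64-bit CentOS machine you would want the first check to report 64-bit; a 32-bit result suggests the stock library is still in place.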
Then update three configuration files: $HADOOP_HOME/etc/hadoop/core-site.xml
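Assuming the change to core-site.xml registers Hadoop's Snappy codec class, org.apache.hadoop.io.compress.SnappyCodec (typically through the io.compression.codecs property), a quick grep can confirm that the edit was saved. This is a rough sketch; `has_snappy_codec` is a hypothetical helper name, and the check only looks for the class name, not for a well-formed property.

```shell
# Hypothetical helper: succeed if a config file mentions Hadoop's Snappy codec class.
has_snappy_codec() {
    grep -q 'org\.apache\.hadoop\.io\.compress\.SnappyCodec' "$1" 2>/dev/null
}

# Assumed location; /usr/local/hadoop is a fallback when HADOOP_HOME is unset.
conf="${HADOOP_HOME:-/usr/local/hadoop}/etc/hadoop/core-site.xml"
if has_snappy_codec "$conf"; then
    echo "SnappyCodec is referenced in $conf"
else
    echo "SnappyCodec is not referenced in $conf"
fi
```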





Finally, add the following line to $HADOOP_HOME/etc/hadoop/hadoop-env.sh to tell Hadoop exactly where to load the native library from:
export JAVA_LIBRARY_PATH="/usr/local/hadoop/lib/native"
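The line above hard-codes /usr/local/hadoop, which presumably is the $HADOOP_HOME used in this post. If your installation lives elsewhere, a variant along these lines might be easier to carry between machines; it assumes $HADOOP_HOME is already set in the environment that sources hadoop-env.sh and falls back to the path above otherwise.

```shell
# Assumption: HADOOP_HOME may or may not be set when hadoop-env.sh is sourced,
# so fall back to the path used in this post.
export JAVA_LIBRARY_PATH="${HADOOP_HOME:-/usr/local/hadoop}/lib/native"
```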
That’s it. Just restart HDFS and YARN by running:
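With the stock scripts that Hadoop 2.x ships under sbin/, a restart could be sketched as follows. `restart_hadoop` is a hypothetical helper name, and the stop/start order is an assumption rather than something this post specifies.

```shell
# Hypothetical helper: restart YARN and HDFS using the scripts under sbin/.
# Takes the Hadoop install directory as an optional argument
# and stops at the first script that fails.
restart_hadoop() {
    home="${1:-$HADOOP_HOME}"
    # Stop YARN before HDFS, then bring HDFS back before YARN.
    for script in stop-yarn.sh stop-dfs.sh start-dfs.sh start-yarn.sh; do
        "$home/sbin/$script" || return 1
    done
}
```

It would be called as `restart_hadoop /usr/local/hadoop`, or with no argument when $HADOOP_HOME is set. Once the daemons are back, `hadoop checknative -a`, where your release provides it, reports whether the native snappy library was found.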
Now you should be able to create Hive tables with Snappy compression.


