To compile and install Hadoop’s LZO compression library on CentOS, follow the steps below:
Download the Hadoop LZO source from Kevin’s Hadoop LZO Project.
If you are using an Ant version older than 1.7, also download the latest Ant binary package from Apache Ant; otherwise you will get the following error when compiling:
BUILD FAILED
/root/kevinweil-hadoop-lzo-4c5a227/build.xml:510: Class org.apache.tools.ant.taskdefs.ConditionTask doesn't support the nested "typefound" element.
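To find out whether you are affected before starting the build, you can compare the installed Ant version against 1.7. The following is only a minimal sketch: the helper names version_lt and ant_version_from are made up for illustration, and it assumes a GNU sort that supports -V (older CentOS releases may not have it) and the usual "... version X.Y.Z compiled on ..." output of ant -version.

```shell
# Return success (0) if version $1 is strictly older than version $2.
# Assumes GNU sort with -V (version sort) is available.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Extract the dotted version number from `ant -version` style output.
ant_version_from() {
    printf '%s\n' "$1" | sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p'
}

if command -v ant >/dev/null 2>&1; then
    installed="$(ant_version_from "$(ant -version 2>&1)")"
    if [ -z "$installed" ]; then
        echo "could not parse the output of ant -version"
    elif version_lt "$installed" 1.7; then
        echo "Ant $installed is older than 1.7; download a newer binary package"
    else
        echo "Ant $installed is new enough"
    fi
else
    echo "ant not found on PATH"
fi
```

If the script reports an older version, continue with the separately downloaded Ant as described in the steps that follow.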
Install lzo-devel and lzop:
yum install lzo-devel.x86_64
yum install lzop.x86_64
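Before moving on, it may be worth confirming that the LZO development headers actually landed on the system, since the native build needs them. This is a rough sketch only: has_lzo_headers is a hypothetical helper, and the /usr/include default and the lzo/lzo1x.h layout are my assumptions about what the lzo-devel package installs.

```shell
# Return success if the LZO development header exists under the given include dir.
# The default /usr/include and the lzo/lzo1x.h path are assumptions about lzo-devel.
has_lzo_headers() {
    include_dir="${1:-/usr/include}"
    [ -f "$include_dir/lzo/lzo1x.h" ]
}

if has_lzo_headers; then
    echo "LZO headers found"
else
    echo "LZO headers missing; check that lzo-devel installed correctly"
fi
```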
Unzip Apache Ant and Hadoop LZO Compression Library to somewhere you have access to.
unzip kevinweil-hadoop-lzo-4c5a227.zip
cd kevinweil-hadoop-lzo-4c5a227
Do the following if your ant version is less than 1.7:
export ANT_HOME=
Then run the compile-native and tar targets with the ant binary you want to use:
ant compile-native tar
In my case, that is:
~/apache-ant-1.8.2/bin/ant compile-native tar
Carefully check the compiler output. You may see errors like this:
[exec] checking for strerror_r... yes
[exec] checking whether strerror_r returns char *... yes
[exec] checking for mkdir... yes
[exec] checking for uname... yes
[exec] checking for memset... yes
[exec] checking for JNI_GetCreatedJavaVMs in -ljvm... no
[exec] checking jni.h usability...
[exec] configure: error: Native java headers not found. Is $JAVA_HOME set correctly?
[exec] no
[exec] checking jni.h presence... no
[exec] checking for jni.h... no
BUILD FAILED
/..../kevinweil-hadoop-lzo-4c5a227/build.xml:247: exec returned: 1
If so, you will need to find your Java installation path and add a “JAVA_HOME” setting to build.xml on line 247 (in my case the path is “/usr/lib/jvm/java/”),
then run the build again with the same ant binary as before:
ant compile-native tar
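If you are not sure where your JDK lives, one way to find it is to look for the directory that contains include/jni.h, since that is the header the configure step complains about. The following is a minimal sketch: find_jdk_home is a hypothetical helper, and the candidate directories other than /usr/lib/jvm/java are assumptions that you should adjust for your system. It only prints a candidate path for the JAVA_HOME setting; it does not edit build.xml.

```shell
# Print the first candidate directory that contains include/jni.h.
# Returns non-zero if none of the candidates qualifies.
find_jdk_home() {
    for dir in "$@"; do
        [ -n "$dir" ] || continue          # skip empty candidates
        if [ -f "$dir/include/jni.h" ]; then
            printf '%s\n' "${dir%/}"       # strip a trailing slash, if any
            return 0
        fi
    done
    return 1
}

# Candidate locations are assumptions; /usr/lib/jvm/java is what worked in my case.
if jdk="$(find_jdk_home "${JAVA_HOME:-}" /usr/lib/jvm/java /usr/lib/jvm/*)"; then
    echo "candidate JAVA_HOME: $jdk"
else
    echo "no jni.h found; a full JDK (not only a JRE) may be missing"
fi
```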
If everything goes well, you should get a “BUILD SUCCESSFUL” message at the end of the compile process.
Now do
ls -al build
in the current directory and you will see the following files generated:
drwxr-xr-x 9 root root    4096 Oct 19 17:21 .
drwxr-xr-x 6 root root    4096 Oct 19 17:21 ..
drwxr-xr-x 4 root root    4096 Oct 19 17:21 classes
drwxr-xr-x 3 root root    4096 Oct 19 17:21 docs
drwxr-xr-x 6 root root    4096 Oct 19 17:21 hadoop-lzo-0.4.14
-rw-r--r-- 1 root root   62239 Oct 19 17:21 hadoop-lzo-0.4.14.jar
-rw-r--r-- 1 root root 1824851 Oct 19 17:21 hadoop-lzo-0.4.14.tar.gz
drwxr-xr-x 5 root root    4096 Oct 19 16:59 ivy
drwxr-xr-x 3 root root    4096 Oct 19 17:12 native
drwxr-xr-x 2 root root    4096 Oct 19 17:12 src
drwxr-xr-x 3 root root    4096 Oct 19 17:12 test
The most important one is hadoop-lzo-0.4.14.jar, which can be copied to Hadoop’s library directory and is then ready to be used.
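The copy step could look roughly like the sketch below. install_lzo_jar is a hypothetical helper, and the location of Hadoop’s library directory (shown here as $HADOOP_HOME/lib) is an assumption that depends on how Hadoop was installed on your machine.

```shell
# Copy the built jar into a Hadoop library directory, checking both paths first.
install_lzo_jar() {
    jar="$1"
    dest="$2"
    [ -f "$jar" ] || { echo "jar not found: $jar" >&2; return 1; }
    [ -d "$dest" ] || { echo "library directory not found: $dest" >&2; return 1; }
    cp "$jar" "$dest/"
}

# Example call from the source directory (HADOOP_HOME/lib is an assumed location):
# install_lzo_jar build/hadoop-lzo-0.4.14.jar "$HADOOP_HOME/lib"
```

The checks are there so that a typo in either path fails loudly instead of silently copying nothing.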
Hi, from what repo are you getting the lzo rpms? I get a “no package … available”. Thanks
Hi Antonio, it’s been a long time since I installed it on the virtual machine, which is no longer available. I believe I did not modify the repositories on CentOS, so I guess I used whatever the system defaulted to.