Compile Hadoop LZO Compression Library on CentOS

To compile and install Hadoop’s LZO compression library on CentOS, follow the steps below:

Download the Hadoop LZO source from Kevin’s Hadoop LZO Project.

If your Ant version is older than 1.7, download the latest Ant binary package from Apache Ant; otherwise you will get the following error when compiling:

BUILD FAILED
/root/kevinweil-hadoop-lzo-4c5a227/build.xml:510: Class org.apache.tools.ant.taskdefs.ConditionTask doesn't support the nested "typefound" element.
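Before starting, it can help to confirm which Ant version is on the PATH. The following is a minimal sketch rather than part of the original steps: the helper name ant_is_too_old is hypothetical, and the sed pattern assumes that the output of ant -version contains the word "version" followed by a dotted number.

```shell
# Hypothetical helper: succeeds (exit 0) when a dotted version such as
# "1.6.5" is older than 1.7, the minimum this build needs.
ant_is_too_old() {
    local version="$1"
    local major="${version%%.*}"
    local rest="${version#*.}"
    local minor="${rest%%.*}"
    [ "$major" -lt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -lt 7 ]; }
}

if command -v ant >/dev/null 2>&1; then
    # Assumes the output contains "version <number>", e.g. "... version 1.8.2 ...".
    installed="$(ant -version 2>/dev/null | sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p' | head -n 1)"
    if [ -z "$installed" ]; then
        echo "Could not parse the Ant version"
    elif ant_is_too_old "$installed"; then
        echo "Ant $installed is older than 1.7; download a newer binary package"
    else
        echo "Ant $installed should be new enough"
    fi
else
    echo "ant not found on PATH"
fi
```

The comparison only looks at the major and minor numbers, which is enough to separate 1.6.x from 1.7 and later.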

Install lzo-devel and lzop:

yum install lzo-devel.x86_64
yum install lzop.x86_64
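After the packages are installed, a quick check that the LZO header is in place can save a failed build later. This is a sketch under assumptions: the helper name has_lzo_headers is hypothetical, and /usr/include/lzo/lzo1x.h is where I would expect lzo-devel to put the header, so adjust the path if your system differs.

```shell
# Hypothetical helper: succeeds when the LZO header is present under the
# given include directory. The default /usr/include is an assumption.
has_lzo_headers() {
    local include_dir="${1:-/usr/include}"
    [ -f "$include_dir/lzo/lzo1x.h" ]
}

# Ask the package manager as well, when rpm is available.
if command -v rpm >/dev/null 2>&1; then
    rpm -q lzo-devel lzop || true
fi

if has_lzo_headers; then
    echo "LZO headers found"
else
    echo "LZO headers missing; check that lzo-devel installed correctly"
fi
```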

Unzip Apache Ant and the Hadoop LZO source to a directory you have access to:

unzip kevinweil-hadoop-lzo-4c5a227.zip
cd kevinweil-hadoop-lzo-4c5a227

If your Ant version is older than 1.7, point ANT_HOME at the Ant directory you just unzipped:

export ANT_HOME=

Then run Ant with the compile-native and tar targets:

 compile-native tar

In my case, the full command is:

~/apache-ant-1.8.2/bin/ant compile-native tar
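To avoid typing the full path each time, the choice of Ant launcher can be wrapped in a small function. This is only a sketch: the helper name ant_bin is hypothetical, and it assumes ANT_HOME points at an unpacked Ant binary distribution, which keeps its launcher in bin/ant.

```shell
# Hypothetical helper: print the Ant launcher to use. Prefers $ANT_HOME/bin/ant
# when ANT_HOME is set, otherwise falls back to whatever "ant" is on the PATH.
ant_bin() {
    if [ -n "${ANT_HOME:-}" ]; then
        echo "$ANT_HOME/bin/ant"
    else
        echo "ant"
    fi
}

# Show the command rather than running it, so the snippet is safe to paste.
echo "Build command: $(ant_bin) compile-native tar"
```

With ANT_HOME set to the unzipped apache-ant-1.8.2 directory, this prints the same command as the one used above.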

Check the build output carefully. If you see errors like this:

     [exec] checking for strerror_r... yes
     [exec] checking whether strerror_r returns char *... yes
     [exec] checking for mkdir... yes
     [exec] checking for uname... yes
     [exec] checking for memset... yes
     [exec] checking for JNI_GetCreatedJavaVMs in -ljvm... no
     [exec] checking jni.h usability... 
     [exec] configure: error: Native java headers not found. Is $JAVA_HOME set correctly?
     [exec] no
     [exec] checking jni.h presence... no
     [exec] checking for jni.h... no

BUILD FAILED
/..../kevinweil-hadoop-lzo-4c5a227/build.xml:247: exec returned: 1

Then you need to find your Java installation path and add a “JAVA_HOME” setting to the exec task on line 247 of build.xml. In my case, the path is “/usr/lib/jvm/java/”.
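One way to find a suitable Java path is to look for the directory that actually contains include/jni.h, since that is the header the configure step complains about. The sketch below is an assumption-laden helper, not the author's method: has_jni_header and guess_java_home are hypothetical names, and readlink -f assumes GNU coreutils, which CentOS ships.

```shell
# Hypothetical helper: succeeds when DIR contains the JNI header that the
# configure step looks for.
has_jni_header() {
    [ -f "$1/include/jni.h" ]
}

# Hypothetical helper: guess a JDK home from the location of javac,
# following symlinks (readlink -f is the GNU coreutils form).
guess_java_home() {
    local javac_path
    javac_path="$(command -v javac)" || return 1
    dirname "$(dirname "$(readlink -f "$javac_path")")"
}

# Prefer an existing JAVA_HOME, otherwise fall back to the guess.
candidate="${JAVA_HOME:-$(guess_java_home || true)}"
if [ -n "$candidate" ] && has_jni_header "$candidate"; then
    echo "Usable JAVA_HOME: $candidate"
else
    echo "No jni.h found under '${candidate:-<unset>}'; a JDK (not just a JRE) is needed"
fi
```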


Then re-run the build:

 compile-native tar

If everything goes well, you should get a “BUILD SUCCESSFUL” message at the end of the compile process.
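Because the output is long, it can be easier to save it to a file and search for the lines that matter. This is a minimal sketch: check_build_log is a hypothetical helper, and build.log is an assumed file name for output you have captured yourself (for example by piping the build command through tee).

```shell
# Hypothetical helper: classify a saved build log.
# Prints "configure-error", "failed", "success", or "unknown".
check_build_log() {
    local log_file="$1"
    if grep -q 'configure: error' "$log_file"; then
        echo "configure-error"
    elif grep -q 'BUILD FAILED' "$log_file"; then
        echo "failed"
    elif grep -q 'BUILD SUCCESSFUL' "$log_file"; then
        echo "success"
    else
        echo "unknown"
    fi
}

# build.log is an assumed name; nothing happens if the file is absent.
if [ -f build.log ]; then
    echo "Build result: $(check_build_log build.log)"
fi
```

The configure error is checked first because, as in the output shown earlier, it is what causes the later BUILD FAILED line.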

Now run

ls -al build

in the current directory and you will see the following files generated:

drwxr-xr-x 9 root root    4096 Oct 19 17:21 .
drwxr-xr-x 6 root root    4096 Oct 19 17:21 ..
drwxr-xr-x 4 root root    4096 Oct 19 17:21 classes
drwxr-xr-x 3 root root    4096 Oct 19 17:21 docs
drwxr-xr-x 6 root root    4096 Oct 19 17:21 hadoop-lzo-0.4.14
-rw-r--r-- 1 root root   62239 Oct 19 17:21 hadoop-lzo-0.4.14.jar
-rw-r--r-- 1 root root 1824851 Oct 19 17:21 hadoop-lzo-0.4.14.tar.gz
drwxr-xr-x 5 root root    4096 Oct 19 16:59 ivy
drwxr-xr-x 3 root root    4096 Oct 19 17:12 native
drwxr-xr-x 2 root root    4096 Oct 19 17:12 src
drwxr-xr-x 3 root root    4096 Oct 19 17:12 test

The most important one is hadoop-lzo-0.4.14.jar, which can be copied to Hadoop’s library directory and is then ready to be used.
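The copy step might look like the sketch below. The helper name install_lzo_jar is hypothetical, and using $HADOOP_HOME/lib as the library directory is an assumption about your installation; substitute whatever directory your Hadoop setup loads jars from.

```shell
# Hypothetical helper: copy the built jar into a Hadoop library directory,
# refusing to proceed when either path is missing.
install_lzo_jar() {
    local jar_path="$1"
    local lib_dir="$2"
    [ -f "$jar_path" ] || { echo "jar not found: $jar_path" >&2; return 1; }
    [ -d "$lib_dir" ] || { echo "not a directory: $lib_dir" >&2; return 1; }
    cp "$jar_path" "$lib_dir/"
}

# $HADOOP_HOME/lib is an assumption; adjust to your installation's layout.
if [ -n "${HADOOP_HOME:-}" ] && [ -f build/hadoop-lzo-0.4.14.jar ]; then
    install_lzo_jar build/hadoop-lzo-0.4.14.jar "$HADOOP_HOME/lib"
fi
```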

