Convert PDF to Text on Linux

If you need to read a PDF file from a terminal and no GUI is available, pdftotext comes in handy. This utility exports a PDF file to plain text, which you can then view in any text editor. It can also export only certain pages of the file. pdftotext should be installed by default on Ubuntu Hardy. If it is not on your machine, you can install it easily through the poppler-utils package:
$ sudo apt-get install poppler-utils
Now you are ready to convert PDF files. To convert myfile.pdf to myfile.txt:
$ pdftotext myfile.pdf myfile.txt
You can also omit the last parameter:
$ pdftotext myfile.pdf
In that case pdftotext names the output file myfile.txt by default.
To convert from page 2 onwards:
$ pdftotext -f 2 myfile.pdf
To convert only up to page 3:
$ pdftotext -l 3 myfile.pdf
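The -f and -l options can also be given together to pull out a page range. Here is a minimal sketch of a small wrapper for that. The function name pdf_pages and the output naming scheme are my own assumptions for illustration; they are not part of pdftotext or poppler-utils.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with poppler-utils):
# extract pages FIRST..LAST of a PDF into <name>-pFIRST-LAST.txt.
pdf_pages() {
    pdf=$1
    first=$2
    last=$3
    # Strip a trailing ".pdf" and build the output name (assumed scheme).
    out="${pdf%.pdf}-p${first}-${last}.txt"
    # -f sets the first page to convert, -l the last page.
    pdftotext -f "$first" -l "$last" "$pdf" "$out"
}
```

For example, pdf_pages myfile.pdf 2 3 would run pdftotext -f 2 -l 3 myfile.pdf myfile-p2-3.txt.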
To set the end-of-line format to unix, dos, or mac:
$ pdftotext -eol unix myfile.pdf
To see the full list of options for pdftotext:
$ man pdftotext
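If all you want is to read the PDF in the terminal, you may not need an output file at all: when the text file argument is a single dash, pdftotext writes to standard output, so the text can be piped into a pager. A rough sketch follows. The function name pdf_less is hypothetical, and falling back to less when PAGER is unset is an assumption about your setup.

```shell
#!/bin/sh
# Hypothetical helper: page through a PDF's text without creating a .txt file.
# Passing "-" as the output file makes pdftotext write to standard output.
pdf_less() {
    # Use the pager named in $PAGER if set, otherwise less (assumed installed).
    pdftotext "$1" - | ${PAGER:-less}
}
```

Calling pdf_less myfile.pdf is then equivalent to running pdftotext myfile.pdf - | less.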
