Pdftk

From Freephile Wiki

PDF Toolkit, or pdftk for short, is a great free software command-line program for manipulating documents in the Portable Document Format (PDF). To help regular users while also supporting the author and his free software work, PDF Labs now also offers graphical desktop versions. PDFtk Free will merge and split PDFs. PDFtk Pro will do other processing and costs a mere $3.99.

Manual

https://www.pdflabs.com/docs/pdftk-man-page/

Examples

http://www.pdflabs.com/docs/pdftk-cli-examples/

Discard the cover page of a PDF:

pdftk wCover.pdf cat 2-end output NoCover.pdf

Collating two-sided documents

Two-sided document? No problem. Scan the original face side up first (odd pages), then flip it over and scan the second side (even pages). Astute readers will recognize that the second document is in reverse order compared to the first. pdftk can not only merge the two documents but also reverse the second one during collation so that the pages end up in order.

pdftk A=my.odd.pdf B=my.even.pdf shuffle A Bend-1 output my.full.pdf

In our example, we specify the document handles 'A' and 'B' to make the files easier to refer to. The operator "shuffle" acts like "cat" but collates the documents, like shuffling a deck of cards. A handle can also carry a page range, and a reversed range reads the pages backwards: "Bend-1" means document 'B' read from its "end" to page 1.
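To see why the reversed range puts the pages back in order, the collation can be simulated without any PDF files. This is only a sketch of the page ordering, not of pdftk itself: shuffle_order is a hypothetical helper, and it assumes the first handle holds the odd pages in scan order and the second handle holds the even pages as they come off the flipped stack (last page first).

```shell
#!/usr/bin/env bash
# Simulate the page order of "shuffle A Bend-1" for a document of n sheets.
# shuffle_order is a hypothetical helper for illustration; it does not call pdftk.
shuffle_order() {
  local n=$1
  local -a a_pages=() b_pages=() out=()
  local i
  # A: fronts scanned in order -> 1 3 5 ...
  for ((i = 1; i <= n; i++)); do a_pages+=("$((2 * i - 1))"); done
  # B: backs scanned after flipping the stack -> 2n ... 4 2
  for ((i = n; i >= 1; i--)); do b_pages+=("$((2 * i))"); done
  # shuffle: alternate one page from A (front to back)
  # with one page from B read from its end to its first page ("Bend-1")
  for ((i = 0; i < n; i++)); do
    out+=("${a_pages[i]}" "${b_pages[n - 1 - i]}")
  done
  echo "${out[*]}"
}

shuffle_order 3   # prints: 1 2 3 4 5 6
```

Reading B forwards instead would pair page 1 with page 6, which is why the reversed range is needed.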

Discard blank pages

If you have a scan that added blank pages (every even page), and you want to get rid of those, you would ask pdftk to 'cat' pages 1-end (but only the odd ones) and 'output' that to the file of your choice.

pdftk ~/Desktop/DOC033115.pdf cat 1-endodd output ~/Desktop/ProofOfLearning.pdf
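Before discarding half of a scan, it can help to check how many pages the file has and how many will survive. The sketch below assumes the "NumberOfPages:" line that pdftk's dump_data operation reports; page_count and odd_pages_kept are hypothetical helper names, and scan.pdf in the usage comment is a placeholder.

```shell
# Extract the page count from `pdftk <file> dump_data` output read on stdin.
page_count() {
  awk '/^NumberOfPages:/ { print $2 }'
}

# Number of pages that an odd-only range such as 1-endodd keeps out of n pages.
odd_pages_kept() {
  echo $(( ($1 + 1) / 2 ))
}

# Usage with a real file (requires pdftk; scan.pdf is a placeholder name):
#   n=$(pdftk scan.pdf dump_data | page_count)
#   echo "keeping $(odd_pages_kept "$n") of $n pages"
```

If the kept count is not what you expect, the blank pages are probably not on every even page and the range needs adjusting by hand.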

Cleaning up Bank Statements

This is a long one, because Jack Henry sucks.

# first download the statements, manually inserting a month digit for each one
# then rename the whole batch to something more intelligent
rename 's/E-Statement_/2014_ifs_/' E-Statement*
# turn on extended globbing (optional: the ?? pattern below is plain globbing and works without it)
shopt -s extglob
# now you should be able to see the files you want
ls -l 2014_ifs_??.pdf
# use pdftk to combine them
pdftk 2014_ifs_??.pdf cat output 2014_ifs_summary.pdf
# and then keep only the pages you want
pdftk 2014_ifs_summary.pdf cat 1 3 5 7 9 14 15 17 18 20 22 24 26 28 output 2014_ifs_condensed.pdf
# and delete the monthly statement clutter
rm 2014_ifs_??.pdf
# turn extended globbing back off
shopt -u extglob
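The rm step above is unconditional, so a failed pdftk run could leave you with neither the monthly files nor a condensed PDF. One possible guard, sketched here with a hypothetical helper named cleanup_monthly, deletes the monthly files only when the condensed output exists and is non-empty.

```shell
# Delete the monthly files only if the condensed PDF exists and is non-empty.
# cleanup_monthly is a hypothetical helper; usage: cleanup_monthly <condensed.pdf> <monthly files...>
cleanup_monthly() {
  condensed=$1
  shift
  if [ -s "$condensed" ]; then
    rm -- "$@"
  else
    echo "refusing to delete: $condensed is missing or empty" >&2
    return 1
  fi
}

# Usage, with the file names from the steps above:
#   cleanup_monthly 2014_ifs_condensed.pdf 2014_ifs_??.pdf
```

The ?? pattern matches exactly two characters, so neither the summary nor the condensed file is caught by the glob.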