Team OS : Your Only Destination To Custom OS !!


Tips & Tricks [Unix/Linux] Manipulating PDFs

mobi0001

This thread is just to share a couple of excellent open source applications for Unix/Linux systems (they probably have Windows packages too) for arranging, combining, and bursting PDF files.

We all know how these portable document files are, and how heavy they can get. Unlike PDF24, which is only for Windows, PDF Arranger is a great tool for people like myself who prefer a GUI over the CLI (though, I admit, the CLI is more intuitive than the GUI). Trust me, the GUI has its niche and is easier on the eyes than the CLI.

For the most part, PDF documents are designed to be read, not altered or modified (a holdover from the format's early days). They are useful for sharing information in a consistent format, but manipulating the contents of a PDF document can range from moderately to very difficult. Many applications which display PDFs do not provide the necessary tools for editing or rearranging the pages of these documents.

First, let's look at a desktop application for managing PDFs.

PDF Arranger can generally be found in the repositories of modern versions of most distributions.
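For anyone unsure how to pull it from the repositories, here is a rough sketch that only prints a suggested install command. It assumes the package is named pdfarranger in your distribution's repository, which may not hold everywhere, and the function name suggest_install is made up for this post.

```shell
#!/bin/sh
# Print (do not run) an install command for whichever package manager is found.
# Assumption: the package is called "pdfarranger" in the distribution's repository.
suggest_install() {
    if command -v apt >/dev/null 2>&1; then
        echo "sudo apt install pdfarranger"
    elif command -v dnf >/dev/null 2>&1; then
        echo "sudo dnf install pdfarranger"
    elif command -v pacman >/dev/null 2>&1; then
        echo "sudo pacman -S pdfarranger"
    else
        echo "No known package manager found; check the project page for other options."
    fi
}

suggest_install
```

If the package is missing from your repository, the project page in the links at the end of this post lists other ways to install it.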

The application features a pleasantly simple interface. At the top of the window is a menu bar where we can choose to import existing documents and perform simple manipulations on documents or pages we have selected. Below this, the bulk of the window displays the document pool. Any PDF we import into PDF Arranger is divided into individual pages. These pages are displayed, in order, in the pool. If we import multiple documents, their pages will all be shown in the pool.

We can then use the mouse to drag and drop a page into a new position in the pool. We can also use the Shift and arrow keys to highlight groups of pages to manipulate. Once a page (or multiple pages) has been highlighted, we can choose an action to perform on it. We can delete pages from the pool, rotate them, or crop their edges. Then we can re-arrange them into the order we like.

Once we have re-arranged and manipulated the pages into the form we want, we can either select a range of pages to export into a new PDF, or we can export the entire pool into one new, big PDF.

PDF Arranger -- Manipulating a single PDF page

**********************************************************************************************************************************************************************************

There are also command line tools for working with PDFs. My favourite is a tool called PDFtk. Now, to be accurate, there is a desktop version of PDFtk, but I prefer PDF Arranger's desktop interface while I like the flexibility and scriptability of the command line version of PDFtk.

The PDFtk program is run with a series of parameters. Typically we start by passing PDFtk a list of documents we want to manipulate. Then we provide it with an action command, indicating what kind of operation we are performing. Then we provide the keyword output followed by the name of the file we are going to create. This output file contains our changes.

In its simplest form, PDFtk does not need to be given any action command. We can, technically, give it an input file to work on, the keyword output, and the name of a new file to create. This effectively makes a new copy of our PDF. This can be useful either for testing purposes or to try to repair any damaged meta-data in the original document. Here is what a PDFtk command looks like when we want to just make a clean copy of the original file:
pdftk original.pdf output new-file.pdf
While PDFtk supports a lot of action commands and options, I want to focus on five. These are called:
  • cat - merge together a series of pages from one or more documents
  • shuffle - collate multiple pages, usually from multiple files
  • burst - expand one PDF document into multiple, one-page documents
  • rotate - turn pages on their sides
  • unpack_files - extract files embedded in a PDF and save them in a directory.
Let's look at a few examples of these action commands being used. This first example uses the cat action command. This allows us to either insert pages of documents into one big document, or possibly remove a series of pages. Here we append one PDF to another one, making one long document:
pdftk original-one.pdf original-two.pdf cat output new-long-file.pdf
We can also specify a range of pages to collect and place in the output file. For instance, here we take the first 5 pages, and then every page from page 20 until the end of the document. All of these end up in one final PDF with the original pages 6 through 19 removed.
pdftk original-file.pdf cat 1-5 20-end output new-file.pdf
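The range arithmetic for cutting pages out of the middle of a document can be wrapped in a small shell helper. This is only a sketch: keep_ranges is a made-up name, not something PDFtk provides, and it assumes the last page being removed is not the final page of the document.

```shell
#!/bin/sh
# Hypothetical helper (not part of PDFtk): print the page ranges that keep
# everything except pages $1 through $2.
# Assumption: page $2 is not the last page of the document.
keep_ranges() {
    first=$1
    last=$2
    # Only emit a leading range if there are pages before the cut.
    if [ "$first" -gt 1 ]; then
        printf '1-%d ' "$((first - 1))"
    fi
    printf '%d-end\n' "$((last + 1))"
}

keep_ranges 6 19    # prints: 1-5 20-end

# Intended use, assuming pdftk is installed. The command substitution is left
# unquoted on purpose so the shell splits it into separate range arguments:
#   pdftk original-file.pdf cat $(keep_ranges 6 19) output new-file.pdf
```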
Here is one more example where we simply reverse the order of all the pages in a document, handy for when we fed pages the wrong way around into the scanner:
pdftk backward-file.pdf cat end-1 output proper-order.pdf
The shuffle command works in a similar way to cat, but it takes the first page of each specified file/range in parallel and places them in the output file. This effectively collates the original files. The next example effectively merges two documents, placing their pages together as if they were shuffled together like a deck of cards:
pdftk left-pages.pdf right-pages.pdf shuffle output new-book.pdf
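A common use for shuffle is collating a double-sided scan, where the odd pages went through the scanner first and the even pages came out back to front. Here is a small sketch of that case using PDFtk's input handles (A= and B=). The wrapper name collate_scan and the file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper: interleave odd pages with even pages that were
# scanned in reverse order.
collate_scan() {
    odd=$1
    even=$2
    out=$3
    # A= and B= give each input a handle; "Bend-1" means the pages of B
    # taken from the last page back to the first.
    pdftk A="$odd" B="$even" shuffle A Bend-1 output "$out"
}

# Example call (assumes pdftk is installed and both files exist):
#   collate_scan odd-pages.pdf even-pages.pdf whole-book.pdf
```

If the even pages were scanned in the normal order, plain "B" in place of "Bend-1" should be enough.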
The next example takes one PDF file and creates a new PDF for each page included in the original. When it is done we end up with files named pg_0001.pdf, pg_0002.pdf, pg_0003.pdf, etc:
pdftk original-file.pdf burst
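By default the one-page files land in the current directory. The burst command also accepts output followed by a printf-style name pattern, so the pages can be sent somewhere tidier. A minimal sketch, where burst_into and the page_%03d.pdf pattern are just illustrative choices:

```shell
#!/bin/sh
# Hypothetical wrapper: burst a PDF into one-page files inside a directory.
burst_into() {
    src=$1
    dir=$2
    # Create the target directory first; stop if that fails.
    mkdir -p "$dir" || return 1
    # %03d is filled in by PDFtk with the page number (001, 002, ...).
    pdftk "$src" burst output "$dir/page_%03d.pdf"
}

# Example call (assumes pdftk is installed):
#   burst_into original-file.pdf pages
```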
The rotate command is fairly straightforward. It turns a PDF's pages around, usually 90 degrees left or right. We can also tell PDFtk to rotate a page to an absolute position using the four compass directions: north, south, east, and west. The direction is written directly after the page range, with no space in between. For example, this command rotates every page 90 degrees to the right:
pdftk original-file.pdf rotate 1-endright output new-file.pdf
This next example turns all the pages in the document upside down. This is helpful if every page was scanned upside down and we want to correct it:
pdftk upside-down-file.pdf rotate 1-endsouth output fixed-file.pdf
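The rotate command only touches the pages we name and leaves the rest as they are, so fixing one stray sideways page does not need the whole range. A rough sketch, with rotate_pages being a made-up wrapper name:

```shell
#!/bin/sh
# Hypothetical wrapper: rotate only the given page or page range.
# Pages outside the range are passed through unchanged by PDFtk's rotate.
rotate_pages() {
    src=$1
    range=$2        # e.g. 3 or 2-4
    direction=$3    # north, south, east, west, left, right, or down
    out=$4
    # The direction is glued directly onto the range, e.g. "3east".
    pdftk "$src" rotate "${range}${direction}" output "$out"
}

# Example calls (assume pdftk is installed):
#   rotate_pages scan.pdf 3 east fixed.pdf      # page 3 only
#   rotate_pages scan.pdf 2-4 left fixed.pdf    # pages 2 through 4
```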
Finally, the unpack_files command extracts the files embedded in a PDF (its attachments) and places them in a directory. In this case we save the PDF's embedded files into a directory called target-directory:
pdftk original-file.pdf unpack_files output target-directory
The PDFtk software can do more operations, including compressing files and working with passwords. However, these are probably the most commonly used operations.
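For the curious, here is a short sketch of the compression and password options. The wrapper names and file names are made up for illustration. Note that passing a password as an argument leaves it in the shell history, so treat this as a demonstration rather than a recommendation.

```shell
#!/bin/sh
# Hypothetical wrappers around PDFtk's compress and password options.

# Re-save a PDF with its page streams compressed.
compress_pdf() {
    pdftk "$1" output "$2" compress
}

# Write a copy that needs the password in $3 to be opened.
protect_pdf() {
    pdftk "$1" output "$2" user_pw "$3"
}

# Write an unprotected copy of a PDF, given its password in $3.
unprotect_pdf() {
    pdftk "$1" input_pw "$3" output "$2"
}

# Example calls (assume pdftk is installed):
#   compress_pdf big.pdf smaller.pdf
#   protect_pdf report.pdf report-locked.pdf 'some password'
#   unprotect_pdf report-locked.pdf report-open.pdf 'some password'
```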

PDFtk in action (free version)

**********************************************************************************************************************************************************************************
Necessary Links:
1. github > jeromerobert/pdfarranger
2. pdflabs DOT com/tools/pdftk-the-pdf-toolkit/
 

PsyTom

Thank you for sharing this information, very interesting.
 

RedDove

:) Thanks for sharing that @mobi0001.
I bookmarked it so I can go over it in detail later. :)
 