Wanted: Script or workflow to compress PDFs

I am trying to cut down on paper by scanning many documents with a ScanSnap. To avoid a logjam, I am scanning without OCR and converting later. I am scanning at the “Better” setting to make the OCR more accurate. That’s 200 dpi, which is not a very high resolution, but I still have hundreds of PDF+Text files that are larger than they need to be. I am being stingy with space because I don’t want to exceed my 50GB Dropbox limit.

All I need is a workflow that will (1) open the selected document in Acrobat, (2) use Acrobat to reduce the file size, and (3) save the compressed file in the same location (replacing the original). Done manually, this process typically reduces a 300MB file to 12MB while keeping it readable on screen.
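
For anyone who would rather script the three steps above, here is a rough sketch in Python. It is not the Acrobat route: it swaps in Ghostscript's pdfwrite device as the compressor and assumes the gs command is installed and on the PATH. The function names (ghostscript_compress, compress_in_place) are made up for illustration, and the /ebook preset is only a guess at a reasonable starting point. The original is replaced only when the result is actually smaller.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def ghostscript_compress(src: Path, dst: Path) -> None:
    """Assumption: Ghostscript ('gs') is installed; this is not Acrobat's optimizer."""
    subprocess.run(
        [
            "gs",
            "-sDEVICE=pdfwrite",
            "-dPDFSETTINGS=/ebook",  # preset that downsamples images to about 150 dpi
            "-dNOPAUSE",
            "-dBATCH",
            "-dQUIET",
            f"-sOutputFile={dst}",
            str(src),
        ],
        check=True,  # raise if Ghostscript reports an error
    )


def compress_in_place(pdf, compress=ghostscript_compress) -> bool:
    """Compress one PDF and replace it only if the result is smaller.

    Returns True when the original file was replaced.
    """
    pdf = Path(pdf)
    # Write the temp file next to the original so the final swap stays on one volume.
    fd, tmp_name = tempfile.mkstemp(suffix=".pdf", dir=pdf.parent)
    os.close(fd)
    tmp = Path(tmp_name)
    try:
        compress(pdf, tmp)
        if 0 < tmp.stat().st_size < pdf.stat().st_size:
            os.replace(tmp, pdf)  # overwrite the original with the smaller file
            return True
        return False  # keep the original when nothing was gained
    finally:
        tmp.unlink(missing_ok=True)  # no-op if the temp file was already moved
```

Calling compress_in_place(Path("scan.pdf")) on a test copy first is a good idea, mainly to confirm that the OCR text layer and on-screen readability survive the chosen preset before running it over hundreds of files.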

I have gone over the Automator actions and AppleScript dictionaries, but I can’t find anything that will do any of these steps. Surely someone out there has had the same problem and has created a nifty solution – right? Anyone?

Did you try the “Compress Images in PDF Documents” Automator action in Mac OS X? Does it reduce the file size as desired?

Since you are using Acrobat, scan your documents to a folder, and use Acrobat’s command that OCRs multiple files.

  • In ScanSnap settings: Application > Scan to File, and set Save to a folder (say, “My Scans” on the desktop). Do your batch scans, which will be collected in “My Scans”.
  • In Acrobat: Document > OCR Text Recognition > Recognize Text in Multiple Files.
  • That command has an option to select a folder: select “My Scans”.
  • Set the rest of the OCR options (they will be retained the next time you do this), and run the command. One of the options is to save the OCR’d document into the same folder, overwriting the original, which is what you asked for.

Afterward, index or import the contents of “My Scans”.

Christian, I tried using “Get selected records” (Devon action) followed by “Compress images…” (PDF action), but nothing happens. I think I am not picking up the documents to pass to the PDF action.

Korm, I had not realized how many batch processing options are available in Acrobat Pro. It looks like there may be some good possibilities there.

Thanks.

The “Compress Images…” action requires files, so you could, for example, insert the “Export Records” action right before it.

I finally had the time to finish working on this. Long story short: I created a ColorSync Quartz Filter as described in http://meyerweb.com/eric/thoughts/2010/02/25/better-pdf-file-size-reduction-in-os-x/. The custom filter is essential to getting a good balance of compression and quality. (Bonus: No Acrobat required.)

Then I created a workflow:
(1) DTPO: Get Selected Records
(2) DTPO: Get Item from Records
(3) PDF: Apply Quartz Filter to PDF Documents.
The addition of Step 2 was what I needed to operate on the underlying files without having to export. I saved it to the DTPO scripts folder and it works like a charm.