Automatically Prune PDF Pages?

I would like to prune certain paginated PDF files automatically when they arrive in DT3. A script that discards all pages after page four would work best for me in most cases. Any ideas on the best method to accomplish this?

@pete31 recently wrote a script which splits PDFs at a certain text. If the script is no use to you as is, it would probably be possible to rework it to split after a certain number of pages. It would certainly be possible to rework it to prune rather than split (i.e. by dumping rather than saving the section after the split). See here for the original art work.

Thank you, @Blanc, for your response and reference to @pete31’s script. I’ll take a look to determine whether it will work for me.

I do this in Hazel using CPDF.

I think something like

cpdf in.pdf 1-4 -o out.pdf

would do what you need. Note that the out file name can be the same as the in file name, in which case the file is overwritten with the first four pages only.

Thank you @rolian! I use Hazel a lot, but had not gotten around to researching whether it offers a solution for me. I’ll give your suggestion a try this weekend after I return home from a short trip.

I think you can also run a shell script within a DT rule if you prefer not to use Hazel - I am just more familiar with Hazel in terms of processing downloaded files. In Hazel, you’d run the following shell script:

/Applications/cpdf $1 1-4 -o $1

I’m thinking going the DT rule route makes sense as Hazel is not used in this workflow.

You provided a link to the Coherent PDF Command Line Tool. I need to download and install this in order for your script to function…right?

Yes, and it looks an interesting library (based on Python, which I use all the time).

This might cause problems with file names containing blanks. To avoid that, enclose the parameters in double quotes (single quotes would stop the shell from expanding $1):
/Applications/cpdf "$1" 1-4 -o "$1"
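
For what it's worth, here is a slightly more defensive sketch of the same idea. It assumes cpdf lives at /Applications/cpdf as in the posts above. The CPDF variable and the prune_pdf function are names made up for illustration, not part of cpdf. The path is wrapped in double quotes so that blanks in the file name survive while $1 still expands, and the output goes to a temporary file first so that a failed run does not overwrite the original. I have not checked how cpdf reacts when a file has fewer than four pages, so treat that case as untested.

```shell
#!/bin/sh
# Sketch only: keep the first N pages of a PDF using cpdf.
# CPDF and prune_pdf are illustrative names, not part of cpdf itself.
# Assumption: cpdf is installed at /Applications/cpdf unless CPDF is set.
CPDF="${CPDF:-/Applications/cpdf}"

prune_pdf() {
    in="$1"              # path to the PDF (may contain blanks)
    last="${2:-4}"       # last page to keep, defaults to 4
    tmp="${in}.pruned.$$"

    # Write to a temporary file, then replace the original only on success.
    if "$CPDF" "$in" "1-$last" -o "$tmp"; then
        mv "$tmp" "$in"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

In Hazel or a DT rule that passes the file path as the first argument, the last line of the script would then simply be: prune_pdf "$1"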