Split PDF by page count or TOC

Hi,

I thought I'd share the script I use to split PDF books, either by number of pages or by table of contents level (e.g. chapters, sections etc.).

(*
	Please adjust the folder paths and the pdfsam path below to match your system.
	*)

property g : "/books/" -- the group path where the files will be imported into
set p to "/Volumes/Additonal-Data/downloads/hf/" -- the path to the temporary directory holding the split files. MUST EXIST PRIOR TO EXECUTING THE SCRIPT.
set pdfsam to "/Applications/pdfsam.app/Contents/Resources/Java/bin/"
set tagName to "books" -- assigns tags
set importAgain to true -- automatically imports the splits into DT
set importLocal to false -- imports the splits into the same group as the selected file

-- Don't change these
set a to POSIX file p

clearOutputDirectory(a, p)

display dialog "Please enter a number:" default answer "1"
set inputNumber to text returned of result
set theNumber to inputNumber as number

if theNumber > 3 then
	set cmd_type to "NSPLIT -n " & theNumber
else
	set cmd_type to "BLEVEL -bl " & theNumber
end if

tell application "DEVONthink Pro"
	try
		set this_selection to the selection
		if this_selection is {} then error "Please select some contents."
		
		repeat with this_item in this_selection
			my clearOutputDirectory(a, p)
			set filePath to get path of this_item
			
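			log message "Convert " & (quoted form of filePath)
			-- quote every path so spaces in folder names don't break the shell command
			set cmd to "cd " & quoted form of pdfsam & " && ./run-console.sh -f " & quoted form of filePath & " -o " & quoted form of p & " -s " & cmd_type & " split"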
			do shell script cmd
			
			if importAgain is true then
				my importFiles(a, g, this_item, tagName, p, importLocal)
				my clearOutputDirectory(a, p)
			end if
		end repeat
		log message "Splitting done!"
	on error errtext
		log message errtext
		display alert "Error occurred"
	end try
end tell

on importFiles(a, g, this_item, tagName, p, importLocal)
	tell application "DEVONthink Pro"
		set t to (get name of this_item)
		if importLocal is true then
			set theGroup to current group
		else
			set theGroup to create location (my newDest(g, t))
		end if
		
		tell application "Finder" to set filelist to name of items of folder a
		
		repeat with split_item in filelist
			import (p & split_item) to theGroup
		end repeat
		
		set exclude from classification of this_item to true
		set exclude from see also of this_item to true
		set exclude from search of this_item to true
		set tags of this_item to tagName
		
		move record this_item to theGroup
	end tell
end importFiles

on newDest(g, t)
	g & t
end newDest

on clearOutputDirectory(a, p)
	try
		tell application "Finder" to set filelist to name of items of folder a
		repeat with split_item in filelist
			set tmp_file to quoted form of (p & split_item)
			do shell script "rm " & tmp_file
		end repeat
	on error t
		display alert "Error occurred"
		display alert t
	end try
end clearOutputDirectory

You need pdfsam (http://www.pdfsam.org/). If you install it the regular way, it should end up in /Applications and should work without changing the variable pdfsam.
Create a folder somewhere on your system and point the variable ‘p’ to it. Keep the trailing slash, because the script appends the file names directly to that path.
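
If you want to check both prerequisites from Terminal before running the script, something like this rough sketch should do. The function name check_setup is made up for the example, and the paths in the usage line are just the ones from my script, so swap in your own.

```shell
#!/bin/sh
# Sketch only: check that pdfsam's console script is where the AppleScript
# expects it, and create the temporary folder if it is missing.
# check_setup is a hypothetical helper name, not part of pdfsam.
check_setup() {
    pdfsam_dir=$1   # folder that should contain run-console.sh
    out_dir=$2      # temporary folder for the split files (variable p)

    if [ ! -f "$pdfsam_dir/run-console.sh" ]; then
        echo "run-console.sh not found in $pdfsam_dir" >&2
        return 1
    fi

    # create the temporary folder (and any missing parents)
    mkdir -p "$out_dir" || return 1
    echo "ok"
}

# Usage, with the paths from the script above:
# check_setup "/Applications/pdfsam.app/Contents/Resources/Java/bin" "/Volumes/Additonal-Data/downloads/hf"
```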

When you run the script, entering a number lower than 4 splits the PDF at that bookmark level. 2 should be chapters, 3 sections. 1… not sure.
Enter N (> 3) and it splits the PDF into smaller PDFs of N pages each (except maybe the last one, which can have fewer).
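
To make the rule explicit, here is the same decision as a small shell sketch. It only builds the arguments that the script passes to run-console.sh after -s. The function name split_args is made up, and the input check is an addition of mine that the AppleScript itself does not do.

```shell
#!/bin/sh
# Sketch only: mirror the script's rule for choosing the pdfsam split mode.
# split_args is a hypothetical helper name.
split_args() {
    n=$1

    # Assumption: reject anything that is not a whole number
    # (the AppleScript does not validate its input).
    case $n in
        ''|*[!0-9]*)
            echo "not a positive whole number: $n" >&2
            return 1
            ;;
    esac
    if [ "$n" -eq 0 ]; then
        echo "number must be at least 1" >&2
        return 1
    fi

    if [ "$n" -gt 3 ]; then
        # more than 3: split into chunks of n pages
        echo "NSPLIT -n $n"
    else
        # 1 to 3: split at bookmark level n
        echo "BLEVEL -bl $n"
    fi
}

# Usage:
# split_args 2    prints the bookmark-level arguments
# split_args 50   prints the page-count arguments
```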

Setting importAgain to true automatically imports the split PDFs into DT, either into a pre-defined group (the path in g plus the name of the selected PDF) or into the same group as the selected PDF. The latter is controlled by importLocal.

hope it helps and more importantly, works :wink:

cheers,
bosie