Suggestion: Provide PDF/Images Compression/Shrink features in DEVONthink

Provide ability for DEVONthink to “shrink/compress” PDF and Image files as part of the import of PDFs or after the fact by calling a Smart Rule which might say something like “if image size > X, then compress to [some description of the degree of compression required]”

I do this manually with PDFPen for almost every PDF that I put into DEVONthink. I’ve tried to write some automation for it, but it has never been reliable. I know there are a few products out there that might do it better, but I’ve not pursued them. It just seems like a natural feature to put into DEVONthink, say in version 4.

The same feature for image files linked to Markdown (those in “Assets”) would be useful too.

Did you try with the sips command line utility? It’s supposedly working for JPEG and a lot of other formats (though not WebP) and could be called from a smart rule through do shell script or doShellScript.

Similarly, one could use ghostscript to compress PDFs. However, ghostscript is apparently no longer delivered with macOS and must therefore be installed with brew or something similar. Again, that could be done inside a smart rule with a script and do shell script/doShellScript.

OTOH: Image quality is only available for some formats, eg JPEG and WebP. And I doubt that a PDF not containing any image data can be compressed a lot (internally, that is – one can of course zip it).
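
As a rough sketch of what such a smart rule script would hand to doShellScript, the two command lines could be assembled like this. The helper names (shellQuote, sipsCommand, ghostscriptCommand) are made up for illustration, the "/ebook" preset is just one assumed choice among Ghostscript's PDFSETTINGS presets, and gs is assumed to be installed and on the PATH. Nothing is executed here; the functions only build strings.

```javascript
// Hypothetical helpers: they only build command strings, nothing is run.

// Wrap a value in single quotes for sh, escaping embedded single quotes,
// so that file names containing spaces or apostrophes survive.
function shellQuote(s) {
  return "'" + String(s).replace(/'/g, "'\\''") + "'";
}

// sips: convert an image to JPEG with the given quality (0-100).
function sipsCommand(input, output, quality) {
  return `sips -s format jpeg -s formatOptions ${quality} ` +
         `--out ${shellQuote(output)} ${shellQuote(input)}`;
}

// Ghostscript: rewrite a PDF with the pdfwrite device.
// The "/ebook" preset is an assumption; pick whatever suits your files.
function ghostscriptCommand(input, output, preset = "/ebook") {
  return [
    "gs",
    "-sDEVICE=pdfwrite",
    `-dPDFSETTINGS=${preset}`,
    "-dNOPAUSE",
    "-dBATCH",
    "-dQUIET",
    `-sOutputFile=${shellQuote(output)}`,
    shellQuote(input),
  ].join(" ");
}
```

In a JXA smart rule, the returned string would be passed to app.doShellScript(); in AppleScript, quoted form of does the same job as shellQuote.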


Well, no I haven’t tried the “sips” command line utility, but I have used ghostscript and others over the years.

The point of my “feed back feature-request” was to do just that. Sort of done with fiddling with the computer and now trying to be focused on the content of all the stuff in my DEVONthink databases and produce some useful stuff out.

Perhaps if it’s as simple as using “sips” the feature could be included in DEVONthink someday.

As far as I remember that’s even doable via Automator although nobody seems to remember/use this any longer :joy:

I came up with a script for image compression:

const fileSizeThreshold = 1000000; /* 1 MByte */

(() => {
  if (Application.currentApplication().name() !== "DEVONthink 3") {
    const app = Application("DEVONthink 3");  
    performsmartrule(app.selectedRecords());
  }
})()

function performsmartrule(records) {
  const app = Application.currentApplication();
  app.includeStandardAdditions = true;
  const DT = Application('DEVONthink 3');
  records.filter(r => r.type() === 'picture' && r.size() > fileSizeThreshold).forEach(r => {
    const filename = r.path();
    const targetGroup = r.locationGroup();
    const outname = r.nameWithoutExtension();
    const outpath = `/tmp/${outname}.jpg`
    const ratio = (fileSizeThreshold / r.size()) < 0.85 ? fileSizeThreshold/r.size() * 100.0 : 85;
    const command = `sips -s format jpeg -s formatOptions ${ratio} --out '${outpath}' '${filename}'`;
    app.doShellScript(command);
    DT.import(outpath, {to: targetGroup});
    app.doShellScript(`rm '${outpath}'`);
  })
}

How it works:
You can use it either as an external script in a smart rule, in Script Editor, or with osascript. In the latter two cases, it works on the records currently selected in DT. Using filter, it skips every record that is not an image or is not larger than the fileSizeThreshold constant (in bytes). For the remaining records, it calculates a JPEG quality setting from the ratio of the threshold to the file size, capped at 85 percent. Maybe that’s not the best idea… Then the sips command line is built, which saves the new image in the /tmp folder. Finally, this new image is imported into the same group as the original image and then deleted from /tmp. Quite straightforward stuff, but…
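
To make the quality calculation concrete, here it is pulled out of the script into a stand-alone function. The function name jpegQuality is mine, not part of the script; the arithmetic is the same: threshold divided by file size, expressed as a percentage, and never more than 85.

```javascript
const fileSizeThreshold = 1000000; // 1 MByte, as in the script

// Same arithmetic as the `ratio` constant in the script:
// threshold / size as a percentage, capped at 85.
function jpegQuality(sizeInBytes, threshold = fileSizeThreshold) {
  const fraction = threshold / sizeInBytes;
  return fraction < 0.85 ? fraction * 100 : 85;
}

// A file just over the threshold gets the cap,
// a file twice the threshold gets quality 50.
console.log(jpegQuality(1050000)); // 85
console.log(jpegQuality(2000000)); // 50
```

So the larger the file is relative to the threshold, the lower the quality that is passed to sips.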

…When I run that on a HEIC image with 1.2 MB, I get a JPG with 2.2 MB. That’s roughly what they say here: HEIC is half the size of JPG. Only if I run the code again on this JPG do I get an image smaller than the original at about 700 KB (with a JPG quality of about 45%, though)

I see two ways out of this

  • either ignore HEIF/HEIC images: they’re already small with a good quality. Converting them to JPG will reduce quality a lot for a noticeably smaller size.
  • or run the process recursively until the final image is smaller than the threshold.
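
Both options can be sketched in a few lines. This is only an outline under assumptions: isHeic decides by file extension alone, and compressOnce is a hypothetical callback standing in for the sips call plus a size lookup. The maxPasses guard is my addition so the loop cannot run forever on an image that never gets small enough.

```javascript
// Option 1: leave HEIF/HEIC files alone (decided by extension only).
function isHeic(path) {
  return /\.(heic|heif)$/i.test(path);
}

// Option 2: repeat the compression until the result is below the threshold.
// `compressOnce(size, quality)` is a hypothetical callback that compresses
// the image and returns the new size in bytes.
function compressUntilSmall(size, compressOnce, threshold = 1000000, maxPasses = 5) {
  let passes = 0;
  while (size > threshold && passes < maxPasses) {
    const fraction = threshold / size;
    const quality = fraction < 0.85 ? fraction * 100 : 85;
    size = compressOnce(size, quality);
    passes++;
  }
  return { size, passes };
}
```

With the HEIC example from above (1.2 MB, then 2.2 MB, then about 700 KB), the loop would stop after the second pass. Note that every pass re-encodes an already lossy JPEG, so quality degrades with each round.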

Conclusion: Automatic image compression in DT is feasible with onboard tools. But it’s not a failsafe procedure. And the script could profit from some more thought, like how to handle PNGs. Personally, I wouldn’t do anything about them because they’re mostly used for pictures with a limited number of colors, like screenshots etc. And they’re already fairly small – converting them to JPG might not make them a lot smaller but reduce quality.

Here’s an AppleScript to compress images using ImageMagick’s convert utility. Place this in the relevant subfolder of the scripts folder (script icon/Open Scripts Folder). I’ve placed my copy in ‘Contextual Menu’, which means it will appear in the scripts submenu when right clicking on an item, or selection of items.

Before using, make sure to install imagemagick with Homebrew by running these commands in the terminal:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"; brew install imagemagick

-- Check if DEVONthink is running
tell application "System Events"
	if not (exists process "DEVONthink 3") then
		display dialog "DEVONthink is not running. Please open DEVONthink and try again."
		return
	end if
end tell

-- Get the selected documents in DEVONthink
tell application id "DNtp"
	set selectedDocs to selection
	
	if (count of selectedDocs) < 1 then
		display dialog "Select one or more documents to compress."
		return
	end if
end tell

-- Define the compression quality (adjust as needed)
set compressionQuality to 70 -- Change this value to adjust compression quality (0-100)

-- Create a temporary folder in the user's home folder
set userTempFolder to (path to home folder as text) & "tempCompressedImages" -- or whatever you want
set unixUserTempFolder to POSIX path of userTempFolder -- this is crucial: AppleScript uses HFS style paths like usr:bin:local whereas the bash shell uses POSIX style, usr/bin/local
do shell script "mkdir -p " & quoted form of unixUserTempFolder

-- Loop through selected documents and compress each one
repeat with docToCompress in selectedDocs
	-- Get the file path of the original document in DEVONthink
	set originalPath to path of docToCompress
	set unixOriginalPath to POSIX path of originalPath
	
	-- Extract the file extension
	set AppleScript's text item delimiters to "."
	set textItems to text items of originalPath
	set originalExtension to last item of textItems
	set AppleScript's text item delimiters to ""
	
	-- Check if the file extension indicates an image (JPEG or TIFF) -- add more image formats as needed
	if originalExtension is in {"jpg", "jpeg", "tiff", "tif"} then
		-- Create a temporary file path for the compressed image
		set tempExportPath to unixUserTempFolder & "/" & (name of docToCompress) & ".jpeg" -- Specify the output format
		set unixTempExportPath to POSIX path of tempExportPath
		
		-- Copy the original file to the temporary location
		do shell script "cp " & quoted form of unixOriginalPath & " " & quoted form of unixTempExportPath
		
		-- Compress the image using the `convert` utility with the updated quality value
		try
			-- This is the actual conversion. If you've installed convert using Homebrew, the path should be correct, but double-check this
			do shell script "/opt/homebrew/bin/convert " & quoted form of unixTempExportPath & " -quality " & compressionQuality & " " & quoted form of unixTempExportPath
		on error errMsg
			display dialog "Error: " & errMsg
			return
		end try
		
		-- Copy the compressed file back over the original using the filesystem
		do shell script "cp " & quoted form of unixTempExportPath & " " & quoted form of unixOriginalPath
		
	end if
end repeat

-- This command will delete the temporary folder after processing. Comment it out in testing
do shell script "rm -rf " & quoted form of unixUserTempFolder

-- Synchronize DEVONthink to update changes
tell application id "DNtp" to synchronize

And here’s one to compress PDFs (using PDF Squeezer). You can install the command line tools in PDF Squeezer->Settings->Automation

-- Get the selected PDF document in DEVONthink
tell application id "DNtp"
	
	set selectedDocs to selection
	if (count of selectedDocs) is not 1 then
		display dialog "Select one PDF document to compress."
		return
	end if
	
	set docToCompress to item 1 of selectedDocs
	
	-- Check if the selected item is a PDF
	if (type of docToCompress is PDF document) then
		-- Get the path of the selected PDF document
		set inputPDFPath to path of docToCompress
		
		-- Construct the PDF Squeezer command
		set pdfSqueezerCommand to "/usr/local/bin/pdfs " & quoted form of inputPDFPath & " --replace"
		
		-- Execute the command
		do shell script pdfSqueezerCommand
		
		-- Display a dialog when compression is complete
		display dialog "PDF compression completed."
	else
		-- If the selected item is not a PDF, show an error message
		display dialog "Selected item is not a PDF document."
	end if
end tell

Theoretically, you could put this in a loop, but PDF compression is more resource-intensive than image compression and can take a few minutes per PDF, so I think compressing one at a time works better.


This script works great, thanks a bunch for sharing!

However, I’ve found that the resulting compressed PDFs are empty in the case of password-protected PDFs. Sadly, these are not detectable with the “Encryption is On” filter.

Unfortunately I’m a noob and thus not allowed to upload examples or share links.

Does anyone happen to have a solution?

Best regards,

Tom

Welcome @tom_van_kan
What PDFs do you have that are password-protected that would need to be shrunken? Just from my experience, such protected PDFs usually contain few pages and almost all text.

The program doing the shrinking in the script is pdfs, which is the command line interface for the third-party app “PDF Squeezer”. I recommend you look at the options for pdfs to see how to decrypt, compress, and re-encrypt, and adjust the script accordingly. Of course, you must know the encryption password to do this.

I am impressed with “PDF Squeezer” and find it a good companion to DEVONthink, but I drag and drop PDFs into it, compress them to the degree I want, and then save them back to DEVONthink. I’ve never tried encrypted files with the manual method, nor have I been able to get pdfs to work, as “PDF Squeezer” throws an unintelligible error message when I try to “authenticate” it.

Thank you kindly for your reply. In this case it’s a couple of manuals from hardware manufacturers, like for the Brother MFC-L8690CDW, which is 15+ MB in size

Thank you for your reply. I’m not so much looking to automate the decrypting of password-protected files (by PDF Squeezer); I’m looking for a way to exclude those files via DEVONthink rule filtering.
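
If the rule filter cannot tell these files apart, one possible workaround is to do the check inside the script before calling the compressor. The following is only a heuristic sketch, not a DEVONthink feature: encrypted PDFs carry an /Encrypt entry in their trailer (or cross-reference stream) dictionary, so the function looks for that key in the raw file data. The function name looksEncrypted is invented for this example, the code is plain JavaScript for Node.js (it uses Buffer), and a false positive or negative on unusual files is possible.

```javascript
// Heuristic: an encrypted PDF has an "/Encrypt" key in its trailer or
// cross-reference stream dictionary, followed by either an indirect
// reference ("5 0 R") or an inline dictionary ("<<").
// Accepts the raw file contents as a Buffer or a string.
function looksEncrypted(data) {
  const text = Buffer.isBuffer(data) ? data.toString("latin1") : String(data);
  return /\/Encrypt\s*(\d+\s+\d+\s+R|<<)/.test(text);
}
```

In Node.js, the contents would come from something like fs.readFileSync(path); files for which looksEncrypted returns true could then be skipped instead of being passed to pdfs.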