I have a couple of databases, and they all hold mainly PDFs.
Can anyone recommend a decent PDF compressor, and is there a workflow in which PDFs that are already in DT3 can be compressed (or do I need to take them out to compress them and then put them back in)?
…or is compression not recommended for OCR/search reasons?
You can run it from DT using “Open With…”. It does not act on the original document but instead makes a new document, which you can then add to DT3 using the share sheet and Group selector. Then you can move the original to the DT3 trash by the normal means.
Someday I’ll integrate it into an AppleScript (or something) and make a DEVONthink Smart Rule… A starter for that is Dr. Drang’s post “Reducing the size of large PDFs” on All this.
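The manual steps described above (add the compressed copy next to the original, then trash the original) could be scripted roughly as follows. This is only a sketch under stated assumptions: the compressed copy has already been written to `compressedPath` (a hypothetical path), exactly one original PDF is selected in DEVONthink 3, and the compression step itself is left out because it depends on the compressor you use.

```applescript
-- Sketch only: compressedPath is a hypothetical location of the already-compressed copy.
set compressedPath to "/Users/me/Desktop/compressed.pdf"

tell application id "com.devon-technologies.think3"
	set theSelection to the selection
	if theSelection is {} then error "Please select the original PDF first."
	set theOriginal to item 1 of theSelection
	
	-- Import the compressed copy into the same group as the original.
	set theGroup to location group of theOriginal
	set theNew to import compressedPath to theGroup
	
	-- Only trash the original if the import produced a record.
	if theNew is not missing value then
		move record theOriginal to (trash group of (database of theOriginal))
	end if
end tell
```

Nothing is deleted outright here: the original lands in the database's trash, so it can still be recovered if the compressed copy turns out to be unusable.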
I hope the PDF Squeezer developer has fixed the copy-and-paste issue with PDFs generated by the Brave browser. I had lots of squeezed PDF files from which I couldn’t copy and paste information. He said he needed time to fix it because Brave encodes fonts differently and Preview misinterprets them. I stopped using PDF Squeezer more than a year ago in case the same issue occurs with other browsers too.
Generating PDFs from Brave wasn’t the issue; text in those PDFs can be copied and pasted. The problem arises after they go through PDF Squeezer, when data in a number of the PDFs can no longer be copied and pasted.
I have used IrisCompressor, which came bundled with a scanner. It worked, though my recollection is that it over-compressed files, reducing the quality of images a bit too far. I stopped using it several years ago when DT’s PDF generator was improved. If I wanted to shrink a big file from years ago, I would probably start by applying DT’s OCR to it, to see whether that did anything useful. Sometimes it does.
I have a workflow that uses PDFpenPro. I’m not sure how to fit the compression part in, but I’m sure you could find something that works. The AppleScript is below. Hope it works out.
```
-- theFile must be supplied by the caller (for example a folder action or Hazel rule)
tell application "PDFpenPro"
	open theFile as alias
	tell document 1
		ocr
		-- wait until OCR has finished before saving
		repeat while performing ocr
			delay 30
		end repeat
		delay 1
		close with saving
	end tell
	quit
end tell
```
Please enclose code blocks in three backquotes like so
```
Code goes here
```
That makes sure that people can copy and paste it with the buttons that then become available.
I’ve used PDF Shrink with success for many years. It’s certainly not free, but it works well. I integrated it with a toolbar button in DT. AppleScript (which I’m sure could be improved) below.
I moved from PDFPen Pro to PDF Expert a while back. PDF Expert is unfortunately not scriptable, but it has a “Reduce File Size…” command that seems to do a decent job.
```
tell application id "com.devon-technologies.think3"
	try
		set this_selection to the selection
		if this_selection is {} then error "Please select one or more PDFs to shrink."
		repeat with this_item in this_selection
			if the type of this_item is equal to PDF document then
				try
					set this_image to the image of this_item
					tell application "PDF Shrink"
						set this_file to process this_image using "DEVONthink medium quality"
					end tell
					synchronize record this_item
				end try
			end if
		end repeat
	on error error_message number error_number
		if the error_number is not -128 then display alert "DEVONthink Pro" message error_message as warning
	end try
end tell
```
“DEVONthink medium quality” refers to a preset I’ve created in PDF Shrink, one of whose settings is to overwrite the original file.
I didn’t mention this syntax anywhere (nor do I think that it is a common way to specify the language for code highlighting, if that’s what you had in mind), and I don’t know what you mean by “doesn’t work”, since you did use this syntax and the code appears as code.
I’m sorry if that came across as snarky. That was not my intention. I meant merely to note jokingly that we don’t get the correct highlighting for AppleScript.
The option to specify a language for code blocks appears in Fletcher Penney’s MultiMarkdown (see the “Raw source” section) and probably isn’t standard Markdown (and yes I know there are many variants too). In any case I can’t get it to work in Fletcher’s own MultiMarkdown Composer.
In fact I should have kept quiet altogether. I see that PDF Squeezer costs a lot less than PDF Shrink, and that Blanc has already addressed the scripting issue much better than I could.
I’m not sure that Discourse uses MultiMarkdown. I was thinking of the usual HTML highlighters like Prism. Those use `language-xxx` or simply `xxx` after the start of the code fence.
In any case, I’ve never noticed any special or useful difference in the highlighting here, regardless of the language I specified.
Edit: Found this on the Discourse site:
It seems that they want a space between the code block fence and the language name. Also, the available languages are determined by the site admin and probably must be written in all lower case.