When I started using DevonThink Pro, my main frustration was its OCR engine. It is decent at creating the OCR layer in PDFs, but it either drastically reduces the quality of the file contents or drastically increases the file size.
Luckily, Nitro PDF Pro (included with Setapp) does not have this issue, so I set up a Hazel automation on a dedicated folder in iCloud (called “DevonThink OCR inbox”) where I save PDFs and images that need to be OCRed. The PDF automation in Hazel opens the file in Nitro, runs the OCR, saves the file, and moves it to the DT inbox. I don’t recall exactly where I found the AppleScript used in the Hazel automation, but it was probably on the MPU forum.
The embedded script is the following:
tell application "Nitro PDF Pro"
    open theFile as alias
    -- does the document need to be OCR'd?
    get the needs ocr of document 1
    if result is true then
        tell document 1
            ocr
            repeat while performing ocr
                delay 1
            end repeat
            delay 1
            close with saving
        end tell
        --In PDFpen, when no documents are open, window 1 is "Preferences"
        --If other documents are open, do not close the App.
        if name of window 1 is "Preferences" then
            tell application "Nitro PDF Pro"
                quit
            end tell
        end if
    else
        -- Scan Doc was previously OCR'd or is already a text type PDF.
        tell document 1
            close without saving
        end tell
        --In PDFpen, when no documents are open, window 1 is "Preferences"
        --If other documents are open, do not close the App.
        if name of window 1 is "Preferences" then
            tell application "Nitro PDF Pro"
                quit
            end tell
        end if
    end if
end tell
This workflow works great whether I scan documents with my phone or download PDFs needing OCR on my Mac. However, I already have loads of PDFs inside DT that still need OCR.
So I spent some time getting it to work in DT as a Smart Rule, even though it only needed some minor tweaks.
The updated AppleScript used for the Smart Rule is this:
on performSmartRule(theRecords)
    log "Starting script"
    tell application id "DNtp"
        repeat with theRecord in theRecords
            set thePath to path of theRecord
            tell application "Nitro PDF Pro"
                open (POSIX file thePath) as alias
                -- does the document need to be OCR'd?
                get the needs ocr of document 1
                if result is true then
                    tell document 1
                        ocr
                        repeat while performing ocr
                            delay 1
                        end repeat
                        delay 1
                        close with saving
                    end tell
                    --In PDFpen, when no documents are open, window 1 is "Preferences"
                    --If other documents are open, do not close the App.
                    if name of window 1 is "Preferences" then
                        tell application "Nitro PDF Pro"
                            quit
                        end tell
                    end if
                else
                    -- Scan Doc was previously OCR'd or is already a text type PDF.
                    tell document 1
                        close without saving
                    end tell
                    --In PDFpen, when no documents are open, window 1 is "Preferences"
                    --If other documents are open, do not close the App.
                    if name of window 1 is "Preferences" then
                        tell application "Nitro PDF Pro"
                            quit
                        end tell
                    end if
                end if
            end tell
        end repeat
    end tell
end performSmartRule
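One thing I noticed afterwards is that the "quit if only Preferences is open" check is repeated in both branches. A possible tidy-up, which I have not tested, is to move the Nitro part into its own handler and run the check once after the document is closed. The handler name ocrWithNitro is made up for this sketch; the Nitro commands (needs ocr, ocr, performing ocr) are just the ones already used in the scripts above, and it assumes DT is happy with an extra handler in the Smart Rule script.

```applescript
-- Untested sketch: same behavior as the Smart Rule above, with the
-- Nitro PDF Pro steps factored into a hypothetical helper handler.
on ocrWithNitro(thePath)
    tell application "Nitro PDF Pro"
        open (POSIX file thePath) as alias
        -- does the document need to be OCR'd?
        if (needs ocr of document 1) then
            tell document 1
                ocr
                -- poll once a second until the OCR run has finished
                repeat while performing ocr
                    delay 1
                end repeat
                delay 1
                close with saving
            end tell
        else
            -- already OCR'd or a text PDF: leave the file untouched
            tell document 1
                close without saving
            end tell
        end if
        -- When no documents are open, window 1 is "Preferences";
        -- only quit in that case so other open documents are left alone.
        if name of window 1 is "Preferences" then
            quit
        end if
    end tell
end ocrWithNitro

on performSmartRule(theRecords)
    tell application id "DNtp"
        repeat with theRecord in theRecords
            -- "my" is needed to call a script handler from inside a tell block
            my ocrWithNitro(path of theRecord)
        end repeat
    end tell
end performSmartRule
```

For the Hazel version, the same handler could presumably be called with the POSIX path of theFile, but I have only sketched the Smart Rule side here.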
Figured I’d share it here in case others find it helpful.
Note: I’m an AppleScript novice, so if there’s anything that could be improved upon, please let me know!