Thank you for your valuable input, Jim!
As mentioned, this script was posted on the MPU forum years ago, before PDFpen was renamed Nitro. I just didn’t think to update the comments when I changed the app name in the script.
Great feedback!
I’ve updated the script used for the Smart Rule (with a bit of help from ChatGPT):
on performSmartRule(theRecords)
    log "Starting script"
    tell application id "DNtp"
        repeat with theRecord in theRecords
            set thePath to path of theRecord
            my processFileWithOCR(thePath)
        end repeat
    end tell
end performSmartRule

on processFileWithOCR(thePath)
    tell application "Nitro PDF Pro"
        -- Open the file
        open (POSIX file thePath) as alias
        -- Perform OCR
        tell document 1
            ocr
            repeat while performing ocr
                delay 1
            end repeat
            delay 1
            close with saving
        end tell
        -- Quit the application if no other documents are open
        if name of window 1 is "Preferences" then
            quit
        end if
    end tell
end processFileWithOCR
Yeah, for sure. I know the limitation in DEVONthink is caused by the OCR engine (which has been discussed elsewhere on the forum), and short of changing the OCR engine there’s nothing the DT developers can do to fix it. For my purposes it’s important that the OCR process is lossless, i.e. that it does not resample the actual images in the PDF, which is why I’m relying on Nitro instead.
Is there any benefit to doing it this way instead of using character count? Any edge cases I’m not considering?