I am applying OCR to all my PDF documents, one-by-one. Tonight, after a reboot, DT3 has decided that it can’t load the PDF files for OCR, giving me the error “Unable to load PDF document.” Preview, Finereader OCR Pro, and PDFPen Pro have no problem loading, modifying, and saving the document. How do I convince DT3 to start working again?
Your post is a combination of singular and plural: is this affecting one or several documents? Are they (is it) imported or indexed? Especially if the latter, does DT have Full Disk Access? (I ask because for me Monterey is badly buggy, showing lags, spontaneous changes in the Findet settings, etc.; who knows what else changes without user intervention?)
Thank you for your reply,
I have multiple documents. I OCR each one separately, thusly:
open document
adjust title and meta-data to suit
perform OCR to get a searchable PDF
back to step one.
This is done manually.
DT3 has full disk access and the above procedure was working fine until tonight. DT3 OCR would occasionally fail to save the document, but a repeat of the OCR step would successfully convert and save it.
I’ve experienced the lags also, particularly in Spotlight. I have no idea what Findnet is.
When you say you open the document, what exactly do you do?
How do you perform OCR (i.e. what do you select, and from where)?
(I do understand that the process worked until the previous day; obviously that makes it unlikely that you are doing anything “wrong”; I’m just trying to understand the steps to the point I can perform them on my test database to see what happens).
The papers are indexed, and the masters are stored within my Documents folders on my hard drive. (They are backed up via TM and are also in iCloud.)
I have a smart rule set up that scans that folder, “Technical”, and looks for PDF/PS documents with a word count of zero. I select a paper that matches the rule and edit its title and meta-data before performing the OCR.
I originally right-clicked and chose OCR->to searchable PDF. I have since set up a keyboard shortcut for that via System Preferences. The problem occurred both ways.
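For anyone trying to reproduce this, the smart rule's criteria can be sketched in a few lines of Python. This is only an illustration of the matching logic as described above. The `Record` class and its fields are hypothetical stand-ins, not DEVONthink's scripting API.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical stand-in for an indexed item; not DEVONthink's API.
    name: str
    kind: str        # e.g. "PDF", "PS", "RTF"
    word_count: int  # 0 is taken to mean "no text layer yet"


def needs_ocr(record: Record) -> bool:
    """Mirror the smart rule: PDF/PS documents whose word count is zero."""
    return record.kind in {"PDF", "PS"} and record.word_count == 0


# Made-up sample items, for illustration only.
records = [
    Record("scan-001.pdf", "PDF", 0),
    Record("paper.pdf", "PDF", 5120),
    Record("figure.ps", "PS", 0),
    Record("notes.rtf", "RTF", 0),
]

pending = [r.name for r in records if needs_ocr(r)]
print(pending)
```

Only the two PDF/PS items with a zero word count end up in `pending`; the RTF note is skipped even though its word count is zero.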
Are the documents actually still on your local drive, or have they been offloaded to iCloud by macOS (i.e. do you have storage optimisation on)?
Does OCR work on imported documents?
Try giving the OCR helper app Full Disk Access (the app is in the library, application support, DEVONthink, ABBYY folder - I’m not at my desktop, that path is from memory).
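One rough way to check the first question from Terminal is a small Python sketch like the one below. It rests on an assumption worth verifying on your own system: that an offloaded file such as "paper.pdf" is replaced locally by a hidden placeholder named ".paper.pdf.icloud". The function name is made up for this example.

```python
from pathlib import Path


def find_icloud_placeholders(folder):
    """Return the original names of files that look offloaded to iCloud.

    Assumption: an offloaded "paper.pdf" is replaced locally by a hidden
    placeholder file named ".paper.pdf.icloud".
    """
    suffix = ".icloud"
    offloaded = []
    # Search the folder and all of its subfolders for placeholder files.
    for placeholder in Path(folder).rglob(".*" + suffix):
        # Strip the leading dot and the trailing ".icloud".
        offloaded.append(placeholder.name[1:-len(suffix)])
    return sorted(offloaded)
```

Calling `find_icloud_placeholders(Path.home() / "Documents" / "Technical")` (the folder location is a guess based on the description above) returns the names of any files that exist only as placeholders. An empty list suggests the masters are still on the local drive.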
The batch problem was initiated in an earlier thread. DT support identified it as an ABBYY internal problem and sent it off to them. Then I started doing one-at-a-time, which usually works. This thread was about a glitch in the second process.
A batch work-around for those who have PDFPen Pro:
– using indexed files and a DT3 smart rule to find PDF documents without OCR.
– use DT3 to group, clean up, and add meta-data
– Drag & Drop a stack of files onto the PDFPen Pro OCR Files window. (File->OCR Files). Perform OCR.
– DT3 sees the changes and updates the smart rule.