When I try to create a searchable PDF (the source is a scanned PDF), I always receive this message:
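13.10.22, 21:50:00: The OCR process failed. (Original: Der OCR-Vorgang ist fehlgeschlagen.)
Creation of the PDF file failed. (Original: Erzeugung der PDF-Datei fehlgeschlagen.)
13.10.22, 21:50:00: Creation of the PDF file failed. (Original: Erzeugung der PDF-Datei fehlgeschlagen.)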
The problem exists on different machines: an old installation that had been working for months without changes, as well as a new machine with a fresh installation.
Databases can be read and also written to. ABBYY FineReader is installed on all machines.
Are there any hints for solving this issue?
I would like to add: when I start OCR, it runs through the document; I can see this in the Activity window. But when it starts saving the PDF, it hangs.
Does this happen with all PDFs (even single page ones) or only with a particular one? In the latter case, I guess you should open a support ticket.
Also, it would be helpful to describe in which way you “try to create a searchable PDF”. Smart rule, context menu, any other steps?
It happens with single-page as well as multi-page PDF files.
The problem first arose when using smart rules.
But when I investigated, the same thing occurred when I manually requested that a specific file be converted into a “searchable PDF”.
Next, I tested whether this correlates with network resources (I use a NAS drive as the source for the INBOX).
When I OCR to a searchable PDF and the file is on the NAS drive, the OCR runs through the pages. When saving the file, the process hangs, and the file-saving activity is shown until I cancel the process manually.
When I OCR the same document from within the database (manually copied from the NAS to the INBOX of the database), the OCR runs through and the process does not hang. But no searchable PDF is generated either.
This occurs in the same manner in different databases.
How is the drive connected/mounted?
With SMB. But see my recent post: it also occurs with local files.
One first finding in testing:
When I first convert the PDF from the scanner into a paginated PDF and then, in a second step, OCR the document, it works.
The scanner was not changed or updated.
What kind of Mac (Apple or Intel chip) and scanner do you use?
One Apple MacBook Pro M1 and one MacBook Air M2.
The scanner is a Brother ADS-2800W.
It was working fine for more than a year … but some update seems to have broken the process.
When I convert the PDF from the scanner into a paginated PDF as a separate additional step, the OCR works.
After conversion to a paginated PDF, the difference is a small icon that looks like a square smiley. What does this attribute mean?
A screenshot of the icon would be useful, thanks!
Cf “iconography” in the manual.
A squared smiley is a miniature Finder icon.
This property icon indicates an indexed file.
Are you using the Brother scanner software to generate the PDF? If you are, try using either the scanner software in DEVONthink or Apple's Image Capture. Does the PDF scanned with either of these now OCR OK?
Hello folks, I’m new here. I have exactly the same problem. My Brother ADS-2800W scans via SMB to a Synology DiskStation, the files are synced with Synology Drive to a local folder on my MacBook Pro M1, and DEVONthink monitors this folder. Files have not been OCRed for the past few days. I think the problem came up when my DT reloaded the latest ABBYY FineReader plugin.
The log shows “Creation of PDF file failed” (translated from German, so the message may differ in English).
Does it work using the Image Capture application or scanning via DEVONthink?
Welcome @clearsky
This is an issue with the output from the Brother scanner not conforming to the date expected by the OCR engine.
Try opening and re-saving the scanned file in Preview and try the OCR again.
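For anyone who wants to check a scan before and after re-saving it, here is a minimal Python sketch. It assumes the "date" mentioned above refers to the /CreationDate and /ModDate entries in the PDF's document information dictionary; that is a guess, not something confirmed in this thread. The function names are made up for illustration, and the byte search only finds dates stored as plain literal strings in an unencrypted file, so an empty result does not prove the dates are fine.

```python
import re

# PDF date strings have the form D:YYYYMMDDHHmmSSOHH'mm'.
# Everything after the year is optional; O is +, - or Z.
PDF_DATE = re.compile(
    r"^D:(?P<year>\d{4})"
    r"(?:(?P<month>\d{2})"
    r"(?:(?P<day>\d{2})"
    r"(?:(?P<hour>\d{2})"
    r"(?:(?P<minute>\d{2})"
    r"(?:(?P<second>\d{2}))?)?)?)?)?"
    r"(?:Z(?:00'00'?)?|[+-]\d{2}(?:'\d{2}'?)?)?$"
)

# Allowed numeric ranges for the optional fields.
LIMITS = {
    "month": (1, 12),
    "day": (1, 31),
    "hour": (0, 23),
    "minute": (0, 59),
    "second": (0, 59),
}


def is_valid_pdf_date(value: str) -> bool:
    """Return True if value is a syntactically valid PDF date string."""
    match = PDF_DATE.match(value)
    if match is None:
        return False
    for field, (low, high) in LIMITS.items():
        text = match.group(field)
        if text is not None and not low <= int(text) <= high:
            return False
    return True


def find_date_entries(pdf_bytes: bytes) -> dict:
    """Collect /CreationDate and /ModDate literal strings from raw PDF bytes."""
    entries = {}
    pattern = rb"/(CreationDate|ModDate)\s*\(([^)]*)\)"
    for key, raw in re.findall(pattern, pdf_bytes):
        entries[key.decode("ascii")] = raw.decode("latin-1")
    return entries
```

To use it, read the scan with `pathlib.Path("scan.pdf").read_bytes()` (the file name is a placeholder), pass the bytes to `find_date_entries`, and run `is_valid_pdf_date` on each value. If the original scan reports an invalid date and the copy re-saved from Preview does not, that would support the explanation above.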
The Brother scans to a folder on a Synology NAS, which is synced to my Mac. The import to DEVONthink runs via AppleScript and folder monitoring. It worked fine for over 2 years.
According to the log, the OCR works fine. The problem comes up when DT tries to save the file.