clang
September 16, 2022, 10:03pm
3
For anyone coming across this thread while searching for a solution (as I did), I have written a script that helps work around this quirk of the OCR component:
If you use DEVONthink’s OCR feature on a PDF that has an existing Table of Contents, you have probably noticed that the TOC is missing from the processed file. This appears to be a bug in the underlying ABBYY FineReader OCR engine that DEVONthink uses:
https://discourse.devontechnologies.com/t/question-about-dt-pro-ocr/68027/2
I’ve been working around this by using other PDF software to perform OCR on any PDFs that had a TOC I wanted to preserve. But I just haven’t gotten the same quality OCR re…