PDF Expert 3 OCR performance compared to ABBYY/DNtp

I posted this over at Mac Power Users; however, I thought it might be of interest to DEVONthink users too:

Just upgraded to PDF Expert 3. I’ve been using it on iOS and macOS for some time. The addition of OCR is welcome. However, while it works well on clear scanned documents, its performance on lower-quality scans is poor compared with ABBYY FineReader.

Take, for instance, this text. It’s from an academic journal, printed in the 1970s and digitised in the 2000s:

FineReader:

Designed for sixth-formers and first-year students, this is Volume I of 'The Making of the Modem World* edited by the Professor of French History at University College, London, The three volumes - on the voyages of discovery, the colonial period and “the end of Europe” - will be of introductory interest to historians, geographers, anthropologists and internationalists especially since their American text-books tend to be littered with allusions to non-European systems.

PDF Expert:

De8i8ned for 8ixth-former8 aiid fir8t-year 8tudents. th16 L8 Volume I of •The Makins of the Modern World. edited by the Profe68or of French Hi8tory at Unlver8lty College, IA)ndon. The

three volume8 on the voya8e8 of di8coveryg the colonial period and l•the end of Europe will be of introductory intere6t to hlatorian8 • seo8rapher8, anthropolo818ts and internationali8t8 e8peclally 8lnce their American text-book8 tend to be littered with allu8ion8 to non-buropean 8y8tell!.
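For anyone who wants to put a rough number on the gap between two OCR outputs like these, here is a minimal sketch using Python's standard `difflib`. It assumes the FineReader text can serve as the reference, which is only approximately true since it contains a few errors of its own (e.g. "Modem" for "Modern"). The function name `similarity` and the shortened sample strings are illustrative, not part of any tool mentioned here.

```python
from difflib import SequenceMatcher


def similarity(reference: str, candidate: str) -> float:
    """Return a similarity ratio between 0 and 1 for two OCR outputs."""
    # Collapse runs of whitespace so that line breaks introduced by
    # copy-and-paste are not counted as recognition errors.
    ref = " ".join(reference.split())
    cand = " ".join(candidate.split())
    # autojunk=False turns off difflib's heuristic that ignores very
    # frequent characters in longer strings, which would skew the score.
    return SequenceMatcher(None, ref, cand, autojunk=False).ratio()


# First sentence fragment of each sample above (shortened for illustration).
finereader = "Designed for sixth-formers and first-year students, this is Volume I"
pdf_expert = "De8i8ned for 8ixth-former8 aiid fir8t-year 8tudents. th16 L8 Volume I"

print(f"similarity: {similarity(finereader, pdf_expert):.2f}")
```

A ratio of 1.0 means the two strings are identical after whitespace normalisation; lower values mean more characters differ. This is a crude measure rather than a proper character error rate, but it is enough to compare engines on the same page.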

I also tried the command line application ocrmypdf (which uses the Tesseract OCR engine) and the result was basically gibberish.

I believe that PDF Expert is using the underlying Apple OCR engine that was introduced to the OS recently. Obviously that wasn’t trained or designed for documents such as the above (and probably won’t be any time soon).

In sum: ABBYY FineReader is still unrivalled for document OCR, which is a shame as version 13.0 of their Mac app (which supports Apple Silicon) killed a huge amount of functionality, including all automation capability. PDF Expert is nice but still a long way from being an ABBYY/Adobe killer. It seems likely that more apps will integrate Apple’s OCR capabilities; however, it seems to be a limited tool at this point.

(N.B. I got identical results with the FineReader app and the ABBYY engine as run within DEVONthink. Identical except that, as has been discussed on this forum previously, the ABBYY engine in DNtp increases the file size significantly. I could potentially give up the FineReader app given that I have DNtp except that I sometimes need advanced options, such as splitting facing pages.)

Thanks for sharing this test and your results. ABBYY is still making a top-notch OCR engine.

It improves a little if you select the “Precise” option (“Preciso” in my Spanish version) under Advanced in Options in PDF Expert. At least it is fast, but it is nothing compared with ABBYY.