Complex PDF with multi-column text and images: DevonThink OCR outperforms the other OCR apps I tried, including the best-rated ones

The basic problem is that if the OCR of a searchable text is not adequate, it’s almost impossible to annotate it.
I had a complex, non-searchable PDF with text in columns and images, which I could not convert to a decent-looking readable PDF. I tried the top-rated ABBYY FineReader PDF app and many other apps, including OWL OCR, pdf expert, pdf pen, fineprint and others. In every case, the OCR failed to discern the individual columns of text on all pages, although the quality of the results varied from app to app.
With DevonThink’s OCR, the conversion was perfect including the columnar structure of the text.

Isn’t DT using Abbyy’s engine internally?

I was wondering what type of OCR DevonThink is using.
Concerning Abbyy, I tried the following:

  • OCR with Fujiscan’s ABBYY app → the app refuses to OCR because it only functions when a hardcopy is scanned.
  • I downloaded ABBYY FineReader PDF.app from the App Store, one of the best-rated apps, with an expensive subscription → the OCR of my text was suboptimal in terms of columns
  • Before trying DevonThink, I even tried to print the text → scan with Fuji Scanner → convert to readable with the scanner’s ABBYY

Only DevonThink gave good results in the end.

Question: I thought that DevonThink had an “import as searchable PDF” menu item, but I can’t find it. Would you know about it? Thank you.

AFAIK there’s no such menu. But there’s a smart rule workflow which has been discussed here often.

It does use ABBYY’s engine, see the About dialog.

Maybe you’re looking for File > Import > Images (with OCR)…?

Yes, thank you @cgrunenberg