OCR fails from scan import in Catalina

As I mentioned, the PDF is indeed listed as “PDF+Text”.

The weird bit is that I can find text on the page, but just no highlighting when found, and more importantly, no selecting!

This is quite important for me as I regularly scan in lots of printed documents for OCRd PDFs.

Ahh… Sorry, I missed that.

I’m curious: If you run OCR on the file again, does it behave the same or differently?

Aha! Well, I got the warning dialog, “Are you sure you want to convert this searchable PDF again …”, but I clicked the “Convert” button anyway, and on the newly created file, the text is selectable.

Hooray. But boo also, as I don’t really want to do it twice on every scan. But thanks for the suggestion as a workaround. It seems to work.

Trying that “re-converting” workaround again using 300dpi (I usually use 150), I thought that the app had hanged (hung?) on the process. It took a long time - 10 minutes or more - and the DTOCRHelper process swallowed 15GB of RAM. (I only tried 300dpi thinking it might help. It didn’t.)

This should be resolved in the next maintenance release.

Great. I’ll look forward to that fix. Thanks.

No problem.

I’ve encountered the same issue with a ScanSnap ix500, macOS 10.15.6, and DEVONthink 3.5.1. Until the maintenance release is available, I’ve been scanning to disk with ScanSnap Manager doing the OCR, then importing into DT.

Yes, I’ve had to rely on other software as well, as the “converting twice” method can be hit and miss. Hardly “no problem” really. I’ll be glad when I can scan documents straight into DT3 again, using DT3.

The problem is still not fixed in the latest 3.5.2 update!

The ABBYY download happened, but scanned documents, OCRd, produce no selectable text!

Furthermore, as has happened before, if I opt for the scan to go to a “new binder”, the scan process completely forgets about that as soon as I click scan.

What kind of scanner are you using?

Epson Perfection 4990.

FineReader has no problems OCR-ing it, but with DT3 I always have to scan twice!

(That is, not “scan twice”: after the unsuccessful OCR, I run OCR > to searchable PDF on the resulting file.)

Hmm… I’m doing a scan with an HP OfficeJet 9010 in DEVONthink’s Import sidebar, with OCR enabled.

The text is fully selectable in the finished file.

Outside of manually reinstalling the OCR components, @aedwards would have to comment on this further.

Well, it isn’t for me. Yes, Import sidebar, OCR enabled - of course. Considering that FineReader successfully OCRs documents, there’s no reason to think that it’s anything other than DT3 at fault (especially since we saw this identically behaving bug in the previous release version).

Looking around similar threads I notice that re-installing the ABBYY DTOCRHelper application seems to help. I find I have an older version 1.1.2 (as opposed to the version 1.1.13 installed today with the DT3 3.5.2 update). Should I try replacing the newer version with the old? It seems a bit of a kludge.

As I said previously,

Outside of manually reinstalling the OCR components, @aedwards would have to comment on this further.

Uh … OK …


Hahaha! I don’t understand your message. :)

I’m waiting with trepidation for the pronouncement from @aedwards

If it’s working with the version you have currently installed, you can proceed until he has a chance to chime in.