Hi there
Apologies ahead of time if I sound a bit grumpy, but I’ve spent a good few hours today trying to get the following to work. (It’s something I’ve been trying to accomplish, periodically, for the past several months.)
[Devonthink Pro Office 1.5.3 on a MacBook Pro Intel Core Duo running Leopard 10.5.4]
I’ve also checked the forums to see if this has been addressed, but my searches only seem to throw up irrelevant articles on OCR, in between errors on the forum page reading “Sorry but you cannot use search at this time. Please try again in a few minutes.”
OK.
Here’s what I’m trying to do: I’m writing a novel (actually, I’m just about to complete it) set in an historical period. There is an absolutely key text that I’ve scanned using my Epson Perfection V500 scanner. Basically, I want to have this book in my DevonThink database as a searchable PDF.
Try as I might, I just can’t do it.
It’s difficult to be precise in describing the usage scenario that leads to the problems. Basically, they occur when I try to OCR either (i) the multi-page PDF (which is very large at 500 MB, but conforms to the OCR recommendation of 600 dpi, colour, etc.) or (ii) individual pages scanned as JPEGs and TIFFs.
The errors I get are many and various. When trying to OCR the multi-page PDF, I get a dialogue reading “Opening [file]” and nothing happens, even after several hours. It’s important to be able to have an OCR’d multi-page PDF because the alternative - individual pages - means that I can’t move forwards and backwards through the book from the point where DT finds a match.
As for the errors I get when OCRing individual pages (my plan B), the OCR engine either stops, maxing out the CPU, or gives me an error like “OCR couldn’t do the OCRing - do you want to skip or continue?”, or, rarely, it works fine. Occasionally, DT itself crashes.
It’s the inconsistency of the errors that is most frustrating. The OCR engine seems to be behaving almost randomly. I’ve varied everything I can think of: DPI, file format, the OCR settings in DT itself. Nothing seems to provide a stable scenario in which I can get this bloody book OCR’d.
Any ideas? I’m pretty sure it’s not my Mac, which is perfectly stable in all other respects. The ABBYY FineReader OCR software I got with the scanner (which runs in Rosetta) works perfectly and about 4x as fast as the OCR engine in DT. The problem with FR is that it’s a lite version that requires several clicks per image and won’t do multi-page files.
Any advice appreciated. This is driving me up the wall.
Best
Ian
Novelist in the UK