Making PDFs searchable after they are scanned

Hi

ScanSnap does excellent text recognition; however, it takes too long if you have a lot of documents. Is there a way to scan files to plain PDF first and convert them to PDF+Text (searchable) afterwards?

I managed to get a list of all non-searchable PDFs in DEVONthink, but when I try Data->Convert I can’t find a PDF+Text or Make Searchable option.

What am I doing wrong?

Which edition of DEVONthink do you use? Conversion to searchable PDF is a feature of DEVONthink Pro Office only.

That bugs me even more: I’ve got Pro Office.

The forum’s not going to be able to help diagnose a problem with specific documents … you’re better off sending a trouble report to support-at-devontechnologies.com or here and including samples of the PDFs that DEVONthink is not recognizing as convertible. That way the tech staff can see the actual issue and deal with it.

The question for now is: is it in Pro Office? Should it be in Data->Convert?

In DEVONthink Pro Office (only), when a PDF that can be OCR’d is selected, Data > Convert and the contextual menu should show:

Having recently upgraded to DTP Office, partly for this very ability to convert PDFs to searchable PDFs, I am very interested in this. Four questions:

  1. Do I have to go through all of my PDFs and convert them one by one?
  2. If so, can I do this from the ‘All PDF Documents’ Group?
  3. How do I know which of the resulting PDFs is the one that has been converted to searchable?
  4. Can I then delete the original, non-searchable PDF, and if so, will deleting it from the ‘All PDF Documents’ Group also delete it from its own Group?

Thanks for your patients & help.
Andrew

“…patients” ??? Sorry meant patience!!! :open_mouth:

If a PDF has been OCR’d previously, it will show up as “PDF+Text” and no further action is needed. (In fact, converting/OCRing an already-OCR’d PDF makes things worse, not better.)

Yes, or you can select a group of PDFs and have them all converted in a batch (one by one, not simultaneously). I keep a Smart Group defined as follows in my databases so I know at a glance what’s not been converted:

In the Kind column of document displays it says “PDF+Text”. So does Tools > Show Info.

Sure. You can delete the original automatically if you go to DEVONthink > Preferences > OCR and select “Original Document: Move to Trash”.

See DEVONthink Help for further instructions on the preference settings…

Thanks Korm as usual. :smiley: