PDF files that are not read by DT

ronaldkb · April 20, 2009, 11:28pm

I have a good number of old PDF files which date back to 2002. Although I was able to import them into DT, they are not “read” by the software: DT is not able to search inside these files. What is the best way to address this issue? How can I make these files legible to DT?

A related question: I now have some 500 or so articles and files in my DT. I don’t know exactly which ones are “read” by DT and which are not. How can I isolate the files that are not properly read by DT?

Appreciatively.

Bill_DeVille · April 20, 2009, 11:55pm

Image-only PDFs do not contain a searchable text layer. Such PDFs (if they were scanned at sufficient resolution - 300 dpi or better is recommended) can be converted to searchable PDFs by OCR (optical character recognition).

DEVONthink Pro Office includes an OCR capability and can convert image-only PDFs to searchable PDFs.

DEVONthink displays the Kind of image-only PDFs as ‘PDF’ and of searchable PDFs as ‘PDF+Text’.

You can add the Kind column to a view window (or remove it). For example, if you view the All PDFs smart group, choose View > Columns and check the option for Kind. Now, click on the Kind header to sort all your PDFs by Kind.

Those that are shown as PDF are image-only and are candidates for conversion to searchable PDF (Kind = PDF+Text).
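
For anyone who wants a rough check outside DEVONthink, the same PDF versus PDF+Text split can be approximated with a small script. This is only a sketch and not how DEVONthink determines Kind: it assumes that a PDF with a text layer references at least one font, so a file with no `/Font` entry in its raw bytes is probably image-only. The function names are made up for illustration, and the heuristic can misjudge newer PDFs that store their dictionaries in compressed object streams.

```python
from __future__ import annotations

from pathlib import Path


def looks_image_only(data: bytes) -> bool:
    """Heuristic (assumption): no /Font entry anywhere suggests no text layer."""
    return b"/Font" not in data


def split_by_text_layer(folder: str) -> tuple[list[str], list[str]]:
    """Return (probably image-only, probably searchable) PDF file names."""
    image_only: list[str] = []
    searchable: list[str] = []
    # Only files ending in lowercase ".pdf" directly inside the folder are checked.
    for path in sorted(Path(folder).glob("*.pdf")):
        target = image_only if looks_image_only(path.read_bytes()) else searchable
        target.append(path.name)
    return image_only, searchable
```

Calling `split_by_text_layer` on a folder of exported PDFs gives two lists of file names. The first list is only a set of likely OCR candidates, so it is worth spot-checking a few files by trying to select text in them.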

ronaldkb · April 21, 2009, 8:58am

Thank you so much. Very helpful indeed.