I have a handful of PDFs that trigger a “No Text” warning in the Activity Log when I import them. These PDFs have been OCRed, and I can select and copy all of the text correctly.
These PDFs are imported, but their file type is “PDF”, not “PDF+Text”. The content is not indexed, so I cannot find those PDFs via search.
As far as I can tell, none of these PDFs are protected. They are scans, but I didn’t create them.
Any ideas on why this happens and how I can prevent it?
Thanks a lot!
Although PDF stands for Portable Document Format, given all the applications across computer platforms that can create PDFs in various ways, it’s not surprising to come across different “flavors” of PDFs that behave in surprising ways.
Sometimes a strange PDF can be transformed into a more ordinary one by making a small edit (which can perhaps be removed afterwards with Undo) in, e.g., Preview, then saving the PDF.
Try opening the PDF in Preview (Shift-Command-O), making an edit, then saving it. If DEVONthink now shows the Kind as PDF+Text, it should be able to recognize the file as a searchable PDF and index it.
If that doesn’t work, and if the resolution of the PDF image is high enough for accurate character recognition (and you are using DEVONthink Pro Office), select it and choose Data > Convert > to searchable PDF.
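Before and after re-saving or converting, it can help to check whether the file itself still appears to carry a text layer. The following is only a rough sketch, not how DEVONthink decides between PDF and PDF+Text: it assumes that an OCRed PDF embeds or references at least one font for its (usually invisible) text, and it simply looks for the name /Font in the raw bytes of the file. The function name has_font_hint and the script name check_pdf.py are made up for illustration. PDFs that keep their dictionaries inside compressed object streams can be reported as having no font even though they do, so a negative result is not conclusive.

```python
import sys
from pathlib import Path


def has_font_hint(data: bytes) -> bool:
    """Return True if the raw PDF bytes mention a font-related name.

    Crude heuristic (an assumption, not DEVONthink's logic): OCR text
    layers need a font, so /Font, /FontDescriptor, etc. usually appear
    somewhere in an uncompressed part of the file.
    """
    return b"/Font" in data


if __name__ == "__main__":
    # Usage: python3 check_pdf.py scan.pdf [more.pdf ...]
    for name in sys.argv[1:]:
        data = Path(name).read_bytes()
        verdict = "font found" if has_font_hint(data) else "no font found"
        print(f"{name}: {verdict}")
```

Running it on the same document before and after the Preview save would at least show whether saving changes the file's contents in this respect, or whether only the way the file is classified changes.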
I have tried both your suggestions. When I convert the PDF to a searchable PDF, it is recognized as PDF+Text until I change something and save (in either Preview or DEVONthink); then it’s back to PDF (without text).
There seems to be something funny happening…
Could you send an example PDF as an attachment in a ticket to Support, with a link to this forum thread included as explanation?