I just purchase DevonThink Pro Office, and imported a number of PDF documents, which are not text searchable. When I ask DevonThink to convert them to a searchable PDF (which seems to work fine), it looks like DeonThink is saving the original and creating a NEW document with the searchable version, i.e. I now have 2 of every document when I look at the database.
- Is this normal?
- How to I delete the dups, i.e. just the docs with no searchable text?
Check ‘Original Document: Move to Trash’ in the OCR prefs.
Thanks Greg - that did the trick for the bunch that I haven’t converted yet. Is there an easy way to identify the dups I’ve already created?
If you make a smart group that looks like this
You’ll find all non-OCRd PDFs. Since that might find documents that you don’t want to delete, then don’t use the “Word Count is” predicate and the smart group will find all PDFs. Sort the smart group results by name and the PDF and PDF+Text pairs will line up next to one another.