There are posts addressing this feature and an older template from DTPO 2 that allowed for searchable PDF criteria, but that template appears to have gone the way of version 2. In DT3, I have tried to create a rule with Kind and both PDF/PS and Text, but that filters out all PDF+Text. Is there a way to assign ‘PDF+Text’ in order to use that as a rule to assign a tag ‘ocrDone’ and move the file to a smart group whose kind is PDF+Text?
What will work as of DT 3.5.1?
Thanks for the answer on identifying a searchable pdf. I stumbled across that feature after your reply in Toolbar -> Actions -> New from Template -> Smart Groups -> PDFs (not searchable).
Part 2: Is the Kind property assignable or constant? It appears that the property only changes as a result of OCR. Is there another way to modify it?
No, the Kind is not user-definable.
I’ve just had the same problem. I came up with the same idea about >0 words myself, but unfortunately it doesn’t work if you have lots of scans that are basically plain images in PDF format.
The reason I am looking for a “PDF+Text” kind search is that I want a smart rule that runs OCR automatically. At the moment I can’t filter on it, so those PDFs are re-OCR’d each time.
So you have images that do not contain any text but are in PDF format, right? In that case: Why even bother with PDF? It will most certainly take up more space than any image format (like jpeg, png or tiff), and since you don’t have text…
Why bother… Because those are already there…
So anyway, I’ve created a workaround: I give those a “Don’t OCR” tag and have the rule take that into account.
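For anyone following along, the logic of that workaround can be sketched roughly like this. It is only a Python model of the rule's conditions, not DEVONthink's scripting API: the `Record` class, its fields, and the `needs_ocr` function are made up for illustration, and the tag is spelled with a plain apostrophe here.

```python
from dataclasses import dataclass, field

SKIP_TAG = "Don't OCR"  # tag from the workaround; exact spelling is an assumption


@dataclass
class Record:
    """Hypothetical stand-in for a database item, not a real DEVONthink type."""
    name: str
    is_pdf: bool
    word_count: int
    tags: set = field(default_factory=set)


def needs_ocr(rec: Record) -> bool:
    """True for PDFs with no text layer that are not tagged to be skipped."""
    return rec.is_pdf and rec.word_count == 0 and SKIP_TAG not in rec.tags


records = [
    Record("invoice.pdf", is_pdf=True, word_count=0),             # unprocessed scan
    Record("contract.pdf", is_pdf=True, word_count=850),          # already searchable
    Record("photo.pdf", is_pdf=True, word_count=0, tags={SKIP_TAG}),  # image-only, skipped
    Record("notes.txt", is_pdf=False, word_count=120),            # not a PDF
]

to_ocr = [r.name for r in records if needs_ocr(r)]
print(to_ocr)
```

The word count test alone would keep picking up `photo.pdf` on every run, since an image-only PDF still has zero words after OCR; the tag check is what breaks that loop.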