Smart Group where indexed Kind is not PDF+Text

Is there a way to set up a Smart Group for PDF files (indexed) where the Kind is not PDF+Text?

Basically, I want to locate files that I can then optionally convert to searchable PDFs.

You can set up a smart group to search for Kind is PDF/PS, then sort the results by Kind. The list will be grouped into PDF and PDF+Text.

Another approach: anything with no words has not been OCR'd.

#Greg_Jones: Thanks. I’ve done sorting by Kind previously to find these no-text PDFs. I was looking for something a little more elegant.

#korm: Great tip that works well. I added Size > 150 KB to get rid of tiny nuisance graphics, but a way to eliminate Kind “eps” would be even better.

Thank you both.

Greg’s suggestion is actually more elegant and effective than adding a Size > 150 KB criterion, which has nothing to do with separating image-only PDFs from searchable ones. I often get scanner output files that are smaller than 150 KB and still require OCR.

If you meant Word Count rather than Size, a threshold of 0 works well. No searchable PDFs will be displayed in Greg’s smart group.

There’s no Kind = eps definition in DEVONthink. If you have graphic images stored as PDF, they will be picked up by either Greg’s or korm’s method. Worst case, if you select all image-only PDFs for OCR, some computer resources will be wasted and, depending on resolution, the graphic images may be somewhat degraded. If you have used a naming convention or tag to distinguish graphic images, they can be excluded from the smart group with an appropriate filter criterion.
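
To make the difference between the criteria concrete, here is a rough sketch of the filter logic in Python. It is only an illustration and not DEVONthink's API or scripting interface: the `Record` class, its field names, the sample data, and the `graphic` tag are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    # Hypothetical stand-in for a database item; NOT DEVONthink's API.
    name: str
    kind: str            # e.g. "PDF" or "PDF+Text"
    word_count: int
    size_kb: float
    tags: set = field(default_factory=set)


def needs_ocr(rec, exclude_tag="graphic"):
    """True for PDFs with zero words that are not tagged as graphics."""
    is_pdf = rec.kind.startswith("PDF")
    return is_pdf and rec.word_count == 0 and exclude_tag not in rec.tags


# Made-up sample items, chosen to show where the two criteria disagree.
records = [
    Record("small-scan.pdf", "PDF", 0, 90.0),
    Record("report.pdf", "PDF+Text", 5200, 800.0),
    Record("logo.pdf", "PDF", 0, 40.0, {"graphic"}),
    Record("notes.txt", "Plain Text", 300, 4.0),
]

# Word Count criterion plus a tag exclusion.
ocr_candidates = [r.name for r in records if needs_ocr(r)]

# Size criterion alone, for comparison.
by_size = [r.name for r in records
           if r.kind.startswith("PDF") and r.size_kb > 150]

print(ocr_candidates)
print(by_size)
```

With this sample data, the word-count test picks up the small scan and skips the tagged graphic, while the size test misses the small scan and lets the already searchable report through.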