Smart Group (name contains?)

I want to create a Smart Group for documents that need OCR. I can get the PDFs with no text easily, but I’ve been keeping the originals with (non-OCR) appended to the title. However, the Smart Group filter for name only has “is” as an option. I hope that “contains” is an option in the near future. This would allow me to exclude the non-OCR versions.

All fields accept search terms, including operators, proximity, phrases, and parentheses. In this case, just enter “non OCR” (with quotes).

Note that you could instead have sorted by Kind, or used a smart group for Kind = PDF.

For image-only PDFs, Kind = PDF.

For searchable PDFs, Kind = PDF+Text.

However, in the Smart Group editor, I can’t set Name does not contain “non OCR”, because “matches” is the only option offered for Name.

I am using a Smart Group. What I want to do is refine that Smart Group to exclude files based upon something contained in the name.

Take Alexander’s Gordian Knot approach. Base the smart group on “non OCR” in the Name. That excludes your searchable PDFs.

But you wouldn’t have needed to change the Names in the first place if you had sorted by Kind or created a smart group based on Kind. Note that, having changed the Names, you will next need to identify those PDFs that are now searchable and remove your “non OCR” tag from them.

The problem is that I keep both versions of the PDF in the library. When I OCR the file, I keep the original and change the name. So, the OCR version isn’t in my Smart Group, but the non-OCR version is. However, I’ve already run it through OCR, so I need a way to easily identify it on sight, hence the name change.

My Smart Group is Kind = PDF and Words = 0. Now I want to eliminate the ones where I’ve added “non-OCR” to the title, so I need a “does not contain” option for Name.

I probably wouldn’t have started with using a smart group to distinguish between the image-only and searchable PDFs. I sometimes keep both versions, but I’ve included a Kind column in the view window, so I can always tell which is which, even though they have the same name.

But smart groups can be very useful. I usually start with the powerful query operators and syntax in DEVONthink’s Search window.

Let’s start with a query term that will find every item in a particular database. Most of the Names contain alpha characters. But I’ve also got some ScanSnap scans that I haven’t yet renamed, and they all start with “2009-”.

Here’s a query that will find every item, group or document, in that particular database: [a-z]* OR 2009 when used in a Name search.

But I’ve named some of those files by adding “copy”, and suppose I want to filter out all files that have “copy” in their Name. So I’ll add to the query a way to specify that the results should exclude Names that contain “copy” (just as you wish to exclude those that contain “non OCR”).

The query becomes ([a-z]* OR 2009) NOT copy
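As a rough illustration of the boolean logic in that query (not DEVONthink’s actual search engine), here is a small Python sketch. It assumes names are split into lowercase alphanumeric tokens, that [a-z]* means “some word starting with a letter”, and that 2009 and copy match whole tokens. The sample names and the helper functions are hypothetical.

```python
import re

def words(name):
    # Assumed tokenization: lowercase runs of letters and digits.
    return re.findall(r"[a-z0-9]+", name.lower())

def matches_query(name):
    # Models ([a-z]* OR 2009) NOT copy under the assumptions stated above.
    tokens = words(name)
    starts_with_letter = any(t[0].isalpha() for t in tokens)  # stands in for [a-z]*
    has_2009 = "2009" in tokens                               # stands in for 2009
    has_copy = "copy" in tokens                               # stands in for copy
    return (starts_with_letter or has_2009) and not has_copy

# Hypothetical item names, for illustration only.
names = [
    "Invoice March",
    "2009-03-14 0001",
    "Invoice March copy",
    "2009-03-15 0002 copy",
]
kept = [n for n in names if matches_query(n)]
print(kept)
```

The parentheses matter: the NOT applies to the whole OR group, so a name containing “copy” is dropped whether it matched through the letter wildcard or through 2009.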

OK. I’ve got a search formulation that excludes some names. I’ll create a smart group by running that query as a search on my specified database.

To turn that into a smart group, I’ll click on the “+” symbol to the right of the Loupe symbol and save the new smart group to my database.

But I still want to do some more filtering in that smart group. I’ll Control-click on it and choose Edit.

Next, I’ll change conditions from Any to All. Then I’ll add a Kind specification that the Kind must be PDF or PS.

If I wish to further filter so that the smart group contains only PDFs with zero Word count, here’s what the Edit screenshot will look like:

The smart group correctly displays the only three PDFs in that database that meet the specifications.
[Attachment: Smart group filtering by Name-Kind-Words.jpg]
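To see why switching the conditions from Any to All matters, the three rows of that smart group can be modelled as three booleans per item. This is only a sketch under assumptions: the Item class and sample data are hypothetical, the Name row is simplified to a substring test for the excluded term, and Kind is compared as an exact string, so “PDF+Text” is treated as distinct from “PDF”.

```python
from dataclasses import dataclass

@dataclass
class Item:
    name: str
    kind: str
    word_count: int

def conditions(item, excluded="copy"):
    # One boolean per smart-group row (a simplified model, not DEVONthink's API).
    return [
        excluded not in item.name.lower(),  # Name row: exclude names containing the term
        item.kind in ("PDF", "PS"),         # Kind row: PDF or PS
        item.word_count == 0,               # Words row: zero word count
    ]

# Hypothetical database contents.
items = [
    Item("2009-03-14 0001", "PDF", 0),
    Item("2009-03-14 0001 copy", "PDF", 0),
    Item("Invoice March", "PDF+Text", 412),
    Item("Meeting notes", "RTF", 230),
]

match_all = [i.name for i in items if all(conditions(i))]  # "All" setting
match_any = [i.name for i in items if any(conditions(i))]  # "Any" setting
print(match_all)
print(match_any)
```

With Any, every sample item passes at least one row, so nothing is filtered out. With All, only the image-only PDF without “copy” in its name remains. Passing excluded="non ocr" would model the case from the original question in the same way.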

Thanks for that, Bill. I learned something very useful, and I managed to get it to work after some trial and error. I did have one problem: the + is greyed out by default when the window is brought up, and it isn’t clear how to make it active. Maybe add a tooltip saying to change the databases list?

Now that I know how “matches” works, I find it very useful. It just isn’t obvious when one is used to groups like those in iTunes.

Is this search behaviour possible for Comment, Tags, URLs, etc.?

Or is it possible to implement more search options, for example:
comment contains/does not contain.
tags contains/does not contain.

Sure (except tags which is not yet available).