Struggling with search by "Name"

Hi,

I’ve read a few posts discussing the subtleties of whether the extension is part of the name or not, but I’ve not been able to figure out what is causing so much variability in my database. This came to a head when I tried to search for files whose names end with a particular string. The database indexes a few external folders.

To isolate a particular population, I’ve focused on my OmniGraffle documents. Searching for filename:>graffle returns all 12 documents. Searching for name:>graffle returns only 5. The missing 7 can be found with name:>xxx, where xxx is the last part of the name before the extension.

I’ve not been able to reproduce the extension being included in the name with a test database. I’ve tried indexing an existing folder by dragging it into DT, using the “Index Files and Folders…” menu item, or creating the document within DT using a template. In all cases the extension wasn’t included in the name.

How can I explore this further? Is there a way to see the name of a document’s record and notice something different between the two kinds of documents? What might cause the extension to be included in the name?

I suggest that you post some screenshots of the results in DT and of a corresponding search in Finder. Also, what are your DT and Finder preferences re “Show file extension”?

Name and Filename are separate attributes and not necessarily the same. I recommend you use Name.
Also, if you’re looking for .graffle files, use Extension==.

Let’s forget about the “.graffle” example and the use of Filename in a search. I was only using those to isolate a certain population of files and to use that population to demonstrate the level of variability in the “name” property.

I’m interested in searching for entries whose names end in something. Unfortunately it often doesn’t work since some entries have names that end with their extension and some entries don’t.

I have 128 markdown documents in a database. I want to find all documents ending in “xyz” (before the extension). The only way I can think to do it is to find “name:>xyz.md OR name:>xyz”. That seems like it would get tiresome and that I’d sometimes forget to do that and miss things. I’d like to get the data cleaned up.
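
The doubled query described above could be generated rather than typed by hand. This is a minimal sketch in JavaScript, assuming the suffix contains no spaces or quotes. `endsWithQuery` is a hypothetical helper, not part of DEVONthink; only the `name:>` prefix and the `OR` operator come from the searches quoted in this thread.

```javascript
// Hypothetical helper: build a DEVONthink search string that matches names
// ending in `suffix`, whether or not the extension ended up in the name.
function endsWithQuery(suffix, extensions) {
  // Bare suffix first, then one variant per extension (for example "xyz.md").
  const variants = [suffix, ...extensions.map(ext => `${suffix}.${ext}`)];
  return variants.map(v => `name:>${v}`).join(" OR ");
}

console.log(endsWithQuery("xyz", ["md"]));
// prints: name:>xyz OR name:>xyz.md
```

The resulting string would be pasted into the search field. It only papers over the inconsistency; it does not clean up the stored names.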

Is there any way to inspect the entry in DT to determine what it thinks its name is? Is the “name” a property of the document that was indexed or is it a property of the DT database item which references it?

I have DT set to show extensions. Altering that setting does not change the results I get when searching.

Finder responds to a search like “Name end with xyz” by including the extension in that. So Finder’s search by “Name” seems the same as DT’s search by “Filename”.

I’ve taken two documents “GtCalc.graffle” and “ConvMem.graffle”. The first one’s name ends in “Calc” and the second ends in “graffle”. I copied those files on disk and imported them into a new database. I also indexed them in place using a couple of different methods. In the new database, both files’ names do NOT end in “graffle”; one ends with “Mem” and one “Calc”.

So, it seems likely the “name” concept we’re discussing is purely DT metadata rather than something that relates to the file itself. I see that I can get at the name using AppleScript (“name of content record”). I had not used AppleScript on the database at all until just now, when I was trying to find out how to access the name property.

I have a feeling there is no known answer to what caused a whole bunch of files to have names ending with the extension. I’m also guessing that the only way to fix the database is with AppleScript. I see that there is a property “name without extension” that I can assign to name. Is this all a correct appraisal of the situation?
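
One way to see what DT thinks a record's name is would be to print the name, the name without extension, and the filename side by side for the selected records. The sketch below uses JavaScript for Automation (JXA) and is read-only. It rests on several assumptions: that the application can be addressed as "DEVONthink" (DEVONthink 3 users would need "DEVONthink 3"), and that JXA exposes the dictionary terms "selected records", "name without extension" and "filename" in camelCase. `describeRecord` is a hypothetical helper.

```javascript
// Hypothetical helper: summarise one record from its three name-related
// properties. The extension is taken to be part of the name when the name
// differs from the name without extension.
function describeRecord(rec) {
  const leaked = rec.name !== rec.nameWithoutExtension;
  return `${rec.filename}: name="${rec.name}"` +
    (leaked ? " (extension is part of the name)" : " (no extension in the name)");
}

// This part only runs under JXA (osascript -l JavaScript or Script Editor),
// where the global Application function exists.
if (typeof Application !== "undefined") {
  // Assumption: the app name; adjust it for your DEVONthink version.
  const app = Application("DEVONthink");
  // Assumption: camelCase forms of the AppleScript dictionary terms.
  for (const r of app.selectedRecords()) {
    console.log(describeRecord({
      name: r.name(),
      nameWithoutExtension: r.nameWithoutExtension(),
      filename: r.filename(),
    }));
  }
}
```

Selecting the two example documents and running this should show at a glance which of them carries the extension in its name.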

I did a bit of experimentation and noticed a way that causes the extension to be part of the name. If I move a file into DT via the global inbox and then relocate it from there (using the DT interface), the file’s extension will be part of the name. If I drag the file directly into a database, then that doesn’t happen. I haven’t done a thorough investigation. This seems like a bug.

That explains the variability I’m seeing, since it’s quite variable whether or not I use the global inbox when introducing new content to my database.

In fact, moving any item from one database to a different one will introduce the extension into the name. It’s pretty clear why the results I see in my databases are all over the show with regard to whether the extension is in the name or not.

Seems like a significant bug unless no one is really using the name property.

I think many people use it (probably most?), but I don’t know how many search for “name ends with”. I never really do that.

How old are the databases where you see this inconsistency?

I don’t know the explanation for the variability you experience, but as you note the scripting dictionary includes the record property name without extension. (It is read-only, however. Unless that changed in DT4. I’m not sure what you mean with “assigning” it.)

As a workaround to a normal search, you can use name without extension in a script like this:

-- Search for "name (without extension) ends with",
-- using `whose` to filter database contents.
-- (Note that you can't use search operators)

tell application id "DNtp"
	set theDatabase to current database
	set theQuery to text returned of (display dialog ¬
		"Name (without extension) ends with:" default answer ¬
		"" buttons {"Cancel", "Continue"} ¬
		default button "Continue" with title ¬
		"Search in \"" & name of theDatabase & "\"")
	
	set theResults to (contents of theDatabase whose name without extension ends with theQuery)
	
	set totalCount to (count of (contents of theDatabase)) as string
	set hitCount to (count of (a reference to theResults)) as string
	log message hitCount & " hits for 'Ends with: \"" & theQuery & "\"' in " & name of theDatabase info "Out of " & totalCount & " files"
	
	set search results of viewer window 1 to theResults
	
end tell

It has not changed; the property is still read-only. And IMO it would also not make sense to change that to r/w.

At least according to the documentation, searching for name should not consider the extension:

Name: The name of an item. For documents, this is distinct from the filename and does not include the file extension.

I would be assigning the “name” property to the value of “name without extension”. That would be the fix I would use if I were to write a script to fix things. I’m not going to bother since I tend to move files between databases; I’ll be reintroducing the inconsistency often. At this point, it’s safer to just understand the issue and work around it.

It’s important to note that this inconsistency in the name affects all searches, not just “ends with”. Searches with wildcards, “is”, “contains”, etc. are all impacted. Of course a name criterion could be part of a larger search involving other criteria. It’s a significant bug if you don’t know it exists, since your searches by name are not entirely trustworthy.

The filename property is not as powerful as name since it doesn’t support a “matches” search. That’s something I just discovered as I was avoiding the use of name this morning and wasn’t finding something.

You could create two brand new databases and replicate the problem trivially (I just did). I created an empty file on disk named “blah.txt”. I dragged it into a new database named “a”. It was found with “name:blah” but not with “name:blah.txt”. Then, in DT, I dragged the entry from database “a” into a new database named “b”. After that it was found with “name:blah.txt”, but not with “name:blah”.

This is the first report I have ever seen on this issue.
Development will have to assess it.

This seems to be caused by the option to show filename extensions, we’ll check this.

And the next release is going to fix it.

Thanks for that. I appreciate such a quick resolution.

I’ve done very little work in AppleScript. Thanks to the phenomenal DT manual (I can never say this enough) and the great documentation in the DEVONthink dictionary, it seems to be an easy ramp-up. I wrote this script to apply a fix to selected documents.

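-- Strip the file extension from the names of the selected records
tell application id "DNtp"
	repeat with rec in selected records
		set nm to name of rec
		set nmNoEx to name without extension of rec
		-- Show the before and after names in the script editor's log
		log nm & " " & nmNoEx
		set name of rec to nmNoEx
	end repeat
end tell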

Does that seem OK? I know nothing about AppleScript best practices (especially in regard to handling errors) and I don’t know the object model fully. The script seems to work fine when tested on small collections of selected records.

Much appreciated ❤️ ☺️

I didn’t try to reproduce anything with new databases, but I realized that I also have records that include the file extension in their name.
(A rough estimate is 10-20%. I have “Show filename extensions” enabled in Settings > Appearance and have kept it like that for a long time.)

Since I never noticed, I guess this hasn’t been much of an issue for me. While it does in theory affect all searches, I imagine it mostly becomes a problem for name ends with queries? Or maybe we use search in different ways.

Setting a record’s name to its name without extension using AppleScript does indeed seem to fix it. Much preferable to my “alternative search” script above.

Your example looks fine to me! It’s kinda crazy how scriptable DEVONthink is, so I encourage you to explore.

If you’re new to AppleScript, one of the things you’ll want to learn about is the whose clause – an extremely efficient way to get application objects matching a certain condition. (Or multiple, you can make pretty complex queries if you want to.)
The official AppleScript Language Guide defines it as the “filter reference form” of constructing an “object specifier”.

Using whose, we can quickly find all records affected by this issue:

tell application id "DNtp"
	set badNames to (contents of every database whose name ≠ name without extension)
end tell

And fix them like this:

tell application id "DNtp"
    repeat with theDatabase in databases
        set badNames to (contents of theDatabase whose name ≠ name without extension)
        repeat with theRecord in badNames
            set theRecord's name to (theRecord's name without extension)
        end repeat
    end repeat
end tell

Now, adding some log messages:

-- Fix record names that include their file extension in all open databases

tell application id "DNtp"
    repeat with theDatabase in databases
        set badNames to (contents of theDatabase whose name ≠ name without extension)
        set badCount to (count of badNames)
        if badCount = 0 then
            log message ¬
                "0 record names that include their file extension." info (name of theDatabase)
        else
            repeat with theRecord in badNames
                set theRecord's name to (theRecord's name without extension)
            end repeat
            log message "Fixed " & badCount & ¬
                " record names that included their file extension." info (name of theDatabase)
        end if
    end repeat
end tell

PS: You can also use JavaScript to work with DEVONthink, by way of Apple’s JavaScript for Automation (JXA). It interfaces with the same underlying scripting architecture, though the implementation is not 1:1 with AppleScript… and it has some bugs. The two have different strengths and weaknesses. You might prefer JS if you already know it or have other programming experience.

I took a little while to get back to this thread because I tried to write the above in JXA and couldn’t make it work. Must be a bug. The JXA implementation of whose doesn’t seem to work for comparing two properties of an object.

Thanks for the example scripts.

I’ve tons of experience with JavaScript, but knowing JXA is buggy or incomplete dissuades me a bit from going that direction. If I have something to automate I’d probably just use AppleScript.

It’s a shame. As a 35-year professional software developer, I find normal software language styles much easier than AppleScript’s use of natural language.

For the usual tasks, JXA works well enough. The deficits are mostly apparent in the ObjC bindings. And @troejgaard found this issue with whose, which can be easily replaced by filter.
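
To illustrate the `filter` replacement mentioned above, here is a rough JXA sketch that reports how many records in each open database have a name that differs from their name without extension. It is read-only and does not rename anything. The assumptions are that the application can be addressed as "DEVONthink" (DEVONthink 3 users would need "DEVONthink 3"), and that JXA exposes the dictionary terms `contents` and `name without extension` as `contents()` and `nameWithoutExtension()`. `recordsWithLeakedExtension` is a hypothetical helper.

```javascript
// Hypothetical helper: the JavaScript stand-in for the AppleScript clause
// "whose name ≠ name without extension". It expects objects whose
// properties are read by calling them, as JXA object specifiers are.
function recordsWithLeakedExtension(records) {
  return records.filter(r => r.name() !== r.nameWithoutExtension());
}

// This part only runs under JXA (osascript -l JavaScript or Script Editor),
// where the global Application function exists.
if (typeof Application !== "undefined") {
  // Assumption: the app name; adjust it for your DEVONthink version.
  const app = Application("DEVONthink");
  for (const db of app.databases()) {
    // contents() fetches every record of the database as an array.
    const all = db.contents();
    const bad = recordsWithLeakedExtension(all);
    console.log(`${db.name()}: ${bad.length} of ${all.length} names include the extension`);
  }
}
```

Because the filtering happens in JavaScript, every record costs two property reads, so this is likely to be slower than an AppleScript `whose` clause on large databases.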