Hi all, I have indexed hundreds of notes in text files. I have usually used the question mark (?) in those as shorthand when I wasn’t sure about something. Now I want to find all notes that contain a ? in order to revisit and rethink them. But it seems I can’t search for ? because ? is a wildcard. Even “?” doesn’t work… Are there any workarounds?
Only alphanumeric characters are indexed, therefore the only workaround would be to use a script but this might be really slow:
tell application id "DNtp"
    set theResults to contents in current database whose plain text contains "?"
    set search results of main window 1 to theResults
end tell
The same problem occurs with RTF files created in DT4: no ? is found.
Anyway thanks for the script. Works all right.
The same is true for any file.
I don’t know if “problem” is the right word… It’s a consequence of balancing multiple competing factors. If DEVONthink indexed every possible character, I assume the standard search wouldn’t work as well as it does—and it wouldn’t have the same amazing speed.
Why not use Terminal?
- Open your DEVONthink database in Finder (right-click database → “Show Package Contents”)
- Navigate to the Files.noindex folder within the .dtBase2 package
- Copy the directory path
- Search with grep, ending the path with md for Markdown or txt for text files:
grep -r -l "\?" /your-path/database.dtBase2/Files.noindex/md
The -l flag shows only filenames, not the content.
This finds all files containing question marks.
Edit: 923 files, 258 hits, 1 second
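A side note on the pattern: how `\?` is interpreted depends on the grep implementation (GNU grep, for example, reads `\?` in a basic regular expression as the "optional" operator rather than a literal question mark). A minimal sketch that avoids the question entirely uses `-F` (fixed strings), so `?` is always matched literally. The demo folder and note files below are made up for illustration and only stand in for the real Files.noindex/md folder:

```shell
#!/bin/sh
# Throwaway demo folder standing in for Files.noindex/md (hypothetical data).
demo_dir=$(mktemp -d)
printf 'Is this right?\n' > "$demo_dir/note1.md"
printf 'No doubts here.\n' > "$demo_dir/note2.md"
printf 'Check source? Ask again?\n' > "$demo_dir/note3.md"

# -r: recurse, -l: list filenames only,
# -F: treat the pattern as a fixed string, so "?" is matched literally.
hits=$(grep -r -l -F "?" "$demo_dir" | sort)

# Count all files and the files that matched.
total_files=$(find "$demo_dir" -type f | wc -l | tr -d ' ')
hit_count=$(printf '%s\n' "$hits" | grep -c .)

echo "$total_files files, $hit_count hits"
printf '%s\n' "$hits"

# Remove the demo folder again.
rm -rf "$demo_dir"
```

Against a real database, the same flags would apply to the path from the post above, for example `grep -r -l -F "?" /your-path/database.dtBase2/Files.noindex/md`.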
Cool, this is new to me. I tested it just out of curiosity, but forgot to check the size of my current database. Hello beachball… (My current machine is not all that powerful.) I waited 5 minutes and decided to force quit.
Instead of using contents in current database, this approach let me limit the scope to the current group:
-- Search characters not included in search index
-- Scope limited to current group
tell application id "DNtp"
    set theGroup to current group
    set theDocs to search "kind:document" in theGroup
    set theResults to {}
    repeat with theContent in theDocs
        if plain text of theContent contains "?" then
            set thisID to id of theContent
            set thisRecord to get record with id thisID
            copy thisRecord to the end of theResults
        end if
    end repeat
    set search results of think window 1 to theResults
end tell
(Since I’m still using DT3, I used think window instead of main window.)
I tried with a group of 1500 plain text files. It took just under 30 seconds to run. (363 hits)
My first thought was also to just use another search tool.
Not everyone is comfortable using the terminal though. A normal Spotlight search in a Finder window also works (and is pretty fast). Since these are indexed files, it would be easy to open the parent folder in Finder and search there.
But if you want to manipulate the files in DEVONthink—replicate them, set a label etc.— it’s good to know a way to display the list of results there, even if it’s slower.
I like your AppleScript solution and have already saved it for future use. While I can quickly draft a terminal command, having the solution work directly within DEVONthink is more elegant.
Another option is EasyFind.
@papierlos Here’s a more elaborate version, if you see yourself doing this regularly:
-- Search characters not included in search index
-- Scope limited to current group
tell application id "DNtp"
    -- Scope & Query
    set theGroup to current group
    set theQuery to text returned of (display dialog "Search string:" default answer "" buttons {"Cancel", "Continue"} default button "Continue" with title "Scope: \"" & name of theGroup & "\"")
    set theDocs to search "kind:document" in theGroup
    -- Info strings for log/messages
    set scopeCount to (count of theDocs) as string
    set scopePath to (name of database of theGroup & location of theGroup & name of theGroup)
    -- For large scopes, confirm before starting search
    if (count of theDocs) > 1500 then
        display alert scopeCount & " files to search. This might take a while..." buttons {"Cancel", "Continue"} cancel button "Cancel"
    end if
    -- Search
    set theResults to {}
    repeat with theContent in theDocs
        if plain text of theContent contains theQuery then
            set thisID to id of theContent
            set thisRecord to get record with id thisID
            copy thisRecord to the end of theResults
        end if
    end repeat
    -- Log info & Display results
    set hitCount to (count of theResults) as string
    log message hitCount & " hits for \"" & theQuery & "\" in " & name of theGroup info scopeCount & " files — " & scopePath
    set search results of think window 1 to theResults
end tell
@cgrunenberg EasyFind would be my next suggestion
@troejgaard Thank you, I saved the script as “search the wild”
Note we do not advocate messing about in the internals of a DEVONthink database like this.
You can select a database in the Finder and press Option-Command-C to copy the POSIX path. Then use this form in Terminal…
grep -RilE 'test*' "$(pbpaste)"/Files.noindex/md *.md
Out of curiosity, what does DT consider to be an alphanumeric character? Is it the same as what Unicode considers to be an alphanumeric character?
Note we do not advocate messing …
Noted!
You can shorten the command to:
grep -RilE 'test.*' "$(pbpaste)"/Files.noindex/md
No need to add *.md as the folder only contains Markdown files.
Include the dot (.) in the regex if you want to match “test” followed by anything. ‘test*’ would match things like tes, testt, testtt, etc. (multiple ts after tes).
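To make the difference between the two patterns visible, here is a small sketch using `grep -o`, which prints only the matched part of each line. The sample lines are made up for illustration:

```shell
#!/bin/sh
# Hypothetical sample lines, made up for illustration.
sample=$(printf '%s\n' 'tes' 'testtt' 'testing 123' 'toast')

# 'test*' means "tes" followed by zero or more "t".
# -o prints only the part of each line that matched.
star_matches=$(printf '%s\n' "$sample" | grep -oE 'test*')

# 'test.*' means "test" followed by any characters up to the end of the line.
dot_star_matches=$(printf '%s\n' "$sample" | grep -oE 'test.*')

printf 'Pattern with star only matched:\n%s\n' "$star_matches"
printf 'Pattern with dot and star matched:\n%s\n' "$dot_star_matches"
```

The first pattern also matches the bare "tes" line, while the second requires the full "test". When only filenames are listed with -l, 'test.*' selects the same files as plain 'test', since the trailing .* can match nothing.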
Yes, it’s the same.
You could try HoudahSpot. Search in name (or text content) for ?; you don’t need quotation marks. Limit your search to a folder or a drive. I tried it with about 30,000 files and found 839 with a ? in the title.
Don