Advanced search in individual documents

Hi.

I’ve just purchased DEVONthink Pro. I’ve successfully searched databases using boolean connectors, which works well. However, the same advanced searching does not appear available when searching individual documents in the inspector pane. Is this right? The individual files seem to be automatically searched with a simpler version of my search: e.g., I will search ‘strong NEAR/10 coffee’, finding any document with strong within 10 words of coffee, but it leads to the individual files being searched with ‘strong coffee’, finding some results where coffee might be more than 10 words away from strong.

Any tips appreciated!

Thanks.

That’s correct but will be supported soon.

Has this capacity for proximity search operators been incorporated in the latest update? If not, when will it be?

Like @TSJ, I often reach an impasse when using wildcard symbols and proximity searching.

For example, when I search across my databases for the phrase ‘by NEAR/2 I mean’, DEVONthink Pro generates a list of files in the search results, BUT when I search within the individual files themselves, I run into problems (and it doesn’t seem to be a problem with the integrity of the files’ OCR).

When a proximity search does manage to work within an individual file, the word order of the search is not respected. For example:

I tried enclosing the search phrase in quotations to force the sequence order. But doing so results in zero search results. For example:

Am I missing something or doing something in error?

Also, is there anything I should keep in mind in general about using double or single quotation marks in my searches? And, with respect to proximity search operators, are there distinctions I should consider between, say, NEAR/5 vs. N5 or WITHIN/2 vs. W2?

Thanks for any tips!

Actually, the NEAR operator doesn’t define an order, but BEFORE and AFTER do.

Quoting defines the order of the words and disables operators (e.g. useful for searching for words identical to operators, like “AND”). See the chapter “Search operators” in the appendix of the help.
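
To make the ordered/unordered distinction concrete, here is a rough Python sketch. It is only an illustration and not DEVONthink's actual implementation: the function names `near` and `before`, the simple tokenizer, and the way distance is counted (difference in word positions) are all assumptions made for this example.

```python
import re

def tokenize(text):
    # Lowercased word tokens; punctuation is ignored (an assumption of this sketch).
    return re.findall(r"\w+", text.lower())

def positions(tokens, word):
    # All indices at which `word` occurs in the token list.
    return [i for i, t in enumerate(tokens) if t == word.lower()]

def near(text, a, b, n):
    # Unordered proximity: a and b at most n word positions apart, in either order.
    toks = tokenize(text)
    return any(0 < abs(i - j) <= n
               for i in positions(toks, a)
               for j in positions(toks, b))

def before(text, a, b, n):
    # Ordered proximity: a must occur first, at most n word positions ahead of b.
    toks = tokenize(text)
    return any(0 < j - i <= n
               for i in positions(toks, a)
               for j in positions(toks, b))
```

In this model, `near("I mean this by that", "by", "mean", 2)` is true even though "mean" comes first, while `before(...)` with the same arguments is false because the order is wrong.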

Thanks for this. I tried BEFORE and AFTER as proximity operators in my search. But the search results are still too rangy. Is it because “by BEFORE/2 i mean” contains very small and therefore negligible words (“by” and “i”, e.g.)?

An ideal search result would come up with instances like “By congressional precedence I mean” or “by figural realism I mean.”

What’s a workaround?

So would that not be achieved by searching for by BEFORE/3 “I mean”? I’m not sure whether to use /2 or /3. I could justify either logic and don’t know offhand which one DT subscribes to.

Answering myself: yes, using "I mean" works, and it’s /3 if there are 2 words between the initial search term and the secondary combined search term.
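
For anyone who wants to double-check a hit outside DEVONthink, here is a small Python sketch of the counting logic described above. It is my own model, not DEVONthink's code: the function name `before_phrase` is hypothetical, matching is case-insensitive, punctuation is ignored, and the distance is counted from the first term to the first word of the quoted phrase (so two words in between means a distance of 3).

```python
import re

def tokenize(text):
    # Lowercased word tokens; punctuation is ignored (an assumption of this sketch).
    return re.findall(r"\w+", text.lower())

def before_phrase(text, word, phrase, n):
    # Return every span where `word` is followed, within n word positions,
    # by the exact word sequence `phrase`.
    toks = tokenize(text)
    target = tokenize(phrase)
    hits = []
    for i, tok in enumerate(toks):
        if tok != word.lower():
            continue
        # Candidate start positions of the phrase: 1..n words after `word`.
        for j in range(i + 1, i + n + 1):
            if toks[j:j + len(target)] == target:
                hits.append(" ".join(toks[i:j + len(target)]))
    return hits

sample = "By figural realism I mean the following"
print(before_phrase(sample, "by", "I mean", 3))
print(before_phrase(sample, "by", "I mean", 2))
```

With the sample sentence, the /3 version returns one span and the /2 version returns none, which agrees with the behaviour reported above. Running something like this over the extracted text of a document is one way to tell whether a global search hit is genuine.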

Continuing my post, though: the results are not all as I would expect. Searching word1 NEAR/5 "word2 word3" in my database shows me 4 items. For one, word1 and word2 word3 are both appropriately highlighted (yellow and green respectively); for another, only word1 is highlighted (yellow); and two documents contain no occurrences within 5 words of one another at all.

Thank you, @Blanc. This helps clarify the matter.

However, I’m still running into problems. @cgrunenberg: how would you explain the inconsistency between database-level search results using proximity operators and the same search performed in individual files? See, for example, this screenshot:

Am really quite stumped. And would appreciate your thoughts.

Which is wrong? The global search (so finds documents in which no such proximity exists) or the document search (so fails to show such proximity although it exists)? In my case it’s the former.

Well, both (?). The global search generates a list of results that satisfies the rules of the proximity search operators. But that search turns out to be false when I perform the same search in the individual files, where it does not find internal instances of the rules of the proximity search operators in the document itself.

So I suppose the question then is: Why would the global search turn up a list of files that themselves do not contain instances of the things I searched for?

So the document/inspector search is factually correct when it shows “no occurrences”? So the error is in the global search?

Yes; @cgrunenberg in this case I could actually share a document which is found in the global search but does not actually contain the proximity in question if that helps.

An example document plus the full query would be useful, thanks.

The two of us really need to get back in sync 🙈🤪

full query: emergence BEFORE/4 "in the making"

sample document: Dropbox - Gary Tomlinson - A Million Years of Music- The Emergence of Human Modernity (Zone) (2015).pdf - Simplify your life

I have submitted two documents to you by PM.

I’m not finding any matches for emergence BEFORE/4 "in the making" but I have one in-document for emergence BEFORE/4 "in the middle"

Note the document isn’t found in the toolbar search but a match is found in the in-document search…

Strange. OK, now that works.

Would you be so kind as to try this file?: Dropbox - The Speculative Turn- Continental Materialism and Realism.pdf - Simplify your life

It’s an instance where the global search query emergence BEFORE/4 "in the middle" includes this book among the global search results, but when that same query is performed in the document itself, no instances of emergence BEFORE/4 "in the middle" are to be found.

I don’t find that document with your search.
In fact, there appears to be only one occurrence of in the middle in that file, and it is nowhere near emergence.

I’m thinking your index may be incorrect and a database rebuild is in order.
@cgrunenberg ?

I’m bored, so I tried that; the documents which I sent to Criss are still found in global search, although there is no occurrence of the search term in the text, and the inspector duly finds none. So I think there is more to this than an index error.

Is this only occurring with searching PDFs in your testing?