NEAR search not doing what was expected....

Hello all,

Is this possibly a simple OCR issue - i.e. the overlaid text is not coming to the party (worryingly :cry: ), or am I missing something obvious?

Below are 3 screengrabs - not something I was initially searching for, but went this way to test if I wasn’t doing something wrong.

Far NEAR Evolving - 001_Screen Shot 2013-07-21 at 8.42.41 AM.png

“Fuzzy” is selected - but deselecting it makes no difference. “Content” is being searched, in my “UK” data group.

Only two results generated:

First one - correct one:

Far NEAR Evolving - 002_Screen Shot 2013-07-21 at 8.41.55 AM.png

Second one - obviously(?) not correct:

Any thoughts?

It appears that the score for the second document (“UK_Wedderburn…”, the one with the “false” result) is higher than the score for the first (“US_Gould…”). Those scores would be an additional error, it seems.

I located a copy of the “UK_Wedderburn…” document here. With these settings, I am unable to reproduce the false result reported in the OP:

My copy had 14 instances of “far” and none of “evolving”. If you grab that other copy I mentioned, put it alongside the two you already have, and continue to get the false results, then perhaps sending both documents to Support would give staff an opportunity to test the situation.

Korm - thank you for going to all that trouble! :blush:

I’ve just run the search again.
Wedderburn is still there, only highlighting the “far” - I hadn’t even noticed its “score” was higher, which is obviously problematic.

I’m going to re-download a fresh copy of Wedderburn, and run it again…

[EDIT]

As an aside - “far” appears on pages 4, 5, 9, 12, 13 & 20. “Evolving” appears on pages 13 & 18.

The “far” picked up in the screengrab above was on page 4.

Nope. Done it again.

Granted - it was precisely the same article - but obtained through “official” channels, off the LexisNexis database. It’s a Word doc converted to PDF, and is fully searchable in Preview.

Is this a possible bug? Should I send it through to someone at DTP?

With all the above being said - could this mean I would have problems doing similar NEAR searches with other terms? Think I’ll pull a Windows trick, and reboot - maybe that might make a difference… :confused:

I got the same results as korm, using the document that he linked to.

A document should only appear in the results list if there is a match for “Far” NEAR/20 “evolving”; however, all instances of ‘far’ and ‘evolving’ will be highlighted in the document.
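
To make that distinction concrete, here is a minimal Python sketch of the two separate steps: deciding whether a document matches, and deciding what gets highlighted. This is only an illustration of the behaviour described in this thread, not DEVONthink's actual implementation; the function names, the simple word tokenizer, and the assumption that NEAR/20 means "20 words or fewer apart" are all mine.

```python
import re

def tokenize(text):
    """Split text into lowercase words (a simplification of real indexing)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def positions(words, term):
    """Word indices at which `term` occurs."""
    term = term.lower()
    return [i for i, w in enumerate(words) if w == term]

def near_match(text, term1, term2, distance=20):
    """True if any occurrence of term1 lies within `distance` words of term2.

    This decides whether the document appears in the results list.
    """
    words = tokenize(text)
    p1, p2 = positions(words, term1), positions(words, term2)
    return any(abs(i - j) <= distance for i in p1 for j in p2)

def highlighted(text, term1, term2):
    """Every occurrence of either term, regardless of proximity.

    This mirrors the highlighting: all hits, not just the close pair.
    """
    words = tokenize(text)
    return sorted(positions(words, term1) + positions(words, term2))

# Hypothetical document: one close pair in the middle, stray hits at both ends.
doc = "far " + "filler " * 50 + "far from evolving " + "filler " * 50 + "evolving"

print(near_match(doc, "Far", "evolving", 20))   # the document is a valid result
print(highlighted(doc, "Far", "evolving"))      # but four words get highlighted
```

In this made-up document only the middle “far”/“evolving” pair satisfies the NEAR condition, yet the stray “far” at the start and the stray “evolving” at the end are highlighted too, which is what made the Wedderburn result look false.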

:blush:

A penny has just dropped.

I assumed that the search function would ONLY highlight the actual instance(s) where “Far” appeared within 20 words of “Evolving”…

It never occurred to me that it would also highlight all the appearances of words being searched for. I will do a few more tests, and see if my understanding as above holds true!

Thanks for the help - played around with a few searches, and now it makes sense.

I consider myself someone who knows their way around a computer - but goodness me, DTP has had me feeling idiotic on more than one occasion over the past few months! :wink:

I will need to tweak my searches to minimise the likelihood of needing to skip through dozens of individual search-hits in the identified documents, before getting to the Term1 NEAR Term2 jackpot. But that’s a small price to pay.
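
For anyone who wants to jump straight to the close pairs, one workaround outside DTP is to run something like the following over a document's plain text (for example, text copied out of the PDF). It is a rough sketch under my own assumptions: `near_pairs` is a made-up helper, words are split with a simple regex, and "near" is taken to mean 20 words or fewer apart.

```python
import re

def near_pairs(text, term1, term2, distance=20, context=5):
    """List (index1, index2, snippet) for each term1/term2 pair within `distance` words."""
    words = re.findall(r"\w+", text)
    lower = [w.lower() for w in words]
    t1, t2 = term1.lower(), term2.lower()
    hits = []
    for i, w in enumerate(lower):
        if w != t1:
            continue
        # Only look at the window of words around this occurrence of term1.
        lo = max(0, i - distance)
        hi = min(len(lower), i + distance + 1)
        for j in range(lo, hi):
            if lower[j] == t2:
                start = max(0, min(i, j) - context)
                end = min(len(words), max(i, j) + context + 1)
                hits.append((i, j, " ".join(words[start:end])))
    return hits

# Hypothetical text: one close pair, then a stray "far" much later.
sample = "So far the law has been evolving slowly. " + "pad " * 40 + "far"
for i, j, snippet in near_pairs(sample, "far", "evolving"):
    print(i, j, snippet)
```

Only the pair near the start is reported; the stray “far” at the end is skipped because no “evolving” falls inside its 20-word window.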

Thanks for the updates @Greg & @Cassady. Makes me wonder what the proper syntax for term1 NEAR term2 would be in order to get the desired outcome.

I’m not sure I’ve ever been able to get NEAR working for me without pulling down all instances of both TERMs.

In fact, when I try to search these forums for instances of NEAR in the “topic titles only”, nothing comes up. Not even this thread.

In browsing around, I came across this thread from last month that might be of interest.

NEAR/BEFORE/AFTER searches: Highlighting their occurrences?

Thanks, chatoyer. There’s more about search term highlighting in:

Re: Strange Behaviour with SEARCH dialog

Thanks for all the links!

Whilst it would obviously be super useful if the Search function could skip highlighting the individual terms and present only the Term1 NEAR/20 Term2 hit, it is what it is. I cannot begin to fathom some of the answers by the likes of Bill, explaining why it does that - I’ll happily leave that to the people who actually know something about programming. I’m not one of those people! :laughing:

Armed with the above, I will be able to narrow down future searches, to get to what I need.
I hope this might save someone else some time in the future, too!