PDF-generating software puts a hyphen at the end of a line when a word has to be broken in two. The search algorithm in DT seems to treat the two halves of such a word as separate words.
Here, I have attached a sample page.
sampe_page.pdf (28.8 KB)
Download the file, import it into DEVONthink and search for the word “possessives”.
- How many occurrences does DT recognize?
- It gives me none.
But if I search for the same word in Adobe Acrobat, it gives me 1 occurrence, because Acrobat is intelligent enough to remove those end-of-line hyphens and line breaks (it recognizes the word as one word that was broken only for layout reasons).
I think DT needs to do the same. The results of co-occurrence and many other features that depend on the indexed words would then be different (more accurate).
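As a rough illustration of the kind of normalization I mean, here is a minimal Python sketch. It is only my assumption of how such a step could look, not how DT or Acrobat actually work; the function name `dehyphenate` and the sample string are made up for the example.

```python
import re

# Hypothetical pattern: a word character, a hyphen, optional spaces,
# a line break, optional indentation, then another word character.
_LINE_END_HYPHEN = re.compile(r"(\w)-[ \t]*\r?\n[ \t]*(\w)")


def dehyphenate(text: str) -> str:
    """Join words that were split by a hyphen at the end of a line."""
    return _LINE_END_HYPHEN.sub(r"\1\2", text)


# Made-up sample text, imitating what a PDF text layer might contain.
sample = "the use of posses-\nsives in English"

print("possessives" in sample)               # False
print("possessives" in dehyphenate(sample))  # True
```

One caveat: this naive version would also join real compounds that happen to break at their hyphen (for example "well-" / "known" would become "wellknown"), so a real implementation would probably need a dictionary check or would index both variants.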