Duplicates in DT3

josieduffy · May 31, 2019, 5:38pm

DT3 is still missing a lot of duplicates for me. I have it set to the less strict setting, but a lot still go undetected. Any suggestions? Sometimes the two docs have different tags… could that be it?

cgrunenberg · June 3, 2019, 11:33am

No, the tags don’t matter; only the text content or thumbnail does. What kind of files do you expect to be duplicates? Is their word count greater than 0 (see e.g. the navigation bar or the Word Count column of the List view)?
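
To make the point about tags concrete, here is a rough Python sketch of content-based comparison. It is not DEVONthink's actual algorithm, which this thread does not describe; `Record`, `word_count`, `content_key`, and `looks_like_duplicate` are hypothetical names made up for illustration. The assumption is simply that two items match when their normalized text matches, that tags play no part, and that items with a word count of 0 have nothing to compare.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Record:
    # Hypothetical stand-in for a database item; not a DEVONthink type.
    name: str
    text: str
    tags: set = field(default_factory=set)


def word_count(record):
    # Count whitespace-separated words in the text layer.
    return len(record.text.split())


def content_key(record):
    # Normalize case and whitespace, then hash the text only.
    # Tags are deliberately left out of the key.
    normalized = " ".join(record.text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def looks_like_duplicate(a, b):
    # Assumption: an item without text (word count 0) cannot be matched by text.
    if word_count(a) == 0 or word_count(b) == 0:
        return False
    return content_key(a) == content_key(b)


original = Record("invoice.pdf", "Invoice 42 total due", {"finance"})
retagged = Record("invoice copy.pdf", "Invoice 42  total due", {"inbox"})
re_ocred = Record("invoice ocr.pdf", "Invoice 42 total", {"finance"})

print(looks_like_duplicate(original, retagged))  # same text, different tags
print(looks_like_duplicate(original, re_ocred))  # same tags, different text
```

In this sketch the first pair matches despite the different tags, while the second pair does not, because the text (and therefore the word count) differs even though the tags agree.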

nm1 · June 17, 2019, 5:55pm

I’m seeing a strange situation in DT3b3 where the word count reported for a duplicate (PDF) differs from the word count of the original instance. Presumably because of the word-count mismatch, the duplicate-identification engine in DT3 doesn’t recognize these PDFs as duplicates.

This is just one example, but this situation seems to be cropping up for me with a lot of unidentified dupe documents in DT3.

Any suggestions or explanation for the word-count mismatch?

BLUEFROG · June 18, 2019, 1:22am

On what basis are you expecting these to be actual duplicates? Did you manually duplicate them or are they the product of OCR or …?