Duplicates: suspected bug and a suggestion for clarifying the help

Preamble

By "annotate" I mean yellow highlights made with the native DT tools.

I often convert items, for example to be able to annotate an email (convert to PDF) or to annotate an RTF item in DTTG (RTF → PDF).

I did some testing to better understand non-strict duplicates:

  • set my preferences to UNcheck "Stricter recognition of duplicates" (below)
  • created a duplicates smart group (below)
  • created an RTF item called test 4.rtf and converted it to Markdown, HTML, PDF (paginated), and PDF (one page)
  • added annotations (yellow highlights) to both types of PDF

What I found

  • the RTF, HTML, and PDF (one page) items are all listed in the smart group of non-strict duplicates, which is normal behaviour according to the documentation, but the PDF (paginated) is not
  • the PDF (one page) remains a duplicate even after it has been annotated

Conclusions concerning non-strict duplicates

  • Bug? The PDF (paginated), which is the format I always use, does not appear as a non-strict duplicate.
  • I suggest adding to the documentation the fact that a PDF with native annotations remains a non-strict duplicate (which I think is a very good thing).

Thank you.

Quote from the documentation on duplicates:
For example if you convert a rich text document to a plain text one, the contents will be the same in both and they will be shown as duplicates. You can force DEVONthink to consider the file type and size when detecting duplicates by enabling Stricter recognition of duplicates in Preferences > General.
DEVONtechnologies | How to Find and Remove Duplicates.

The layout of PDF documents and e.g. line & word breaks might affect the indexed text and therefore the recognition of duplicates.
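
To illustrate the idea only: the sketch below is a toy word-overlap comparison in Python, not DEVONthink's actual algorithm, which is not described in this thread. The function names, the sample strings, and the assumption that a paginated layout yields words split at line ends are all hypothetical. It shows how the same sentence can look like a duplicate when the extracted words match and like a different document when line breaks split the words.

```python
import re
from collections import Counter


def words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    """Share of word occurrences two texts have in common (0.0 to 1.0)."""
    wa, wb = words(a), words(b)
    shared = sum((wa & wb).values())  # words present in both texts
    total = sum((wa | wb).values())   # words present in either text
    return shared / total if total else 1.0


# Hypothetical extracted text for three conversions of the same sentence.
rtf_text = "Duplicate recognition compares the indexed words of documents."
one_page_pdf = "Duplicate recognition compares the indexed\nwords of documents."
# Assumption: the paginated layout hyphenates words at line ends, so the
# extracted text contains word fragments instead of whole words.
paginated_pdf = (
    "Dupli-\ncate recogni-\ntion compares the in-\n"
    "dexed words of docu-\nments."
)

print(similarity(rtf_text, one_page_pdf))
print(similarity(rtf_text, paginated_pdf))
```

In this toy comparison a plain line break between whole words changes nothing, while words split across lines lower the overlap sharply, which is consistent with the paginated PDF dropping out of the duplicates group.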


Yes, that makes sense. Thank you.