Duplicates: suspected bug and a suggestion for clarifying the help

rufus123 · February 24, 2022, 7:13am

preamble

By annotate I mean yellow highlights made with native DT tools.

I often convert items, for example to be able to annotate an email (convert to PDF) or to annotate an RTF item in DTTG (RTF → PDF).

I did some testing to better understand non-strict duplicates:

unchecked Stricter recognition of duplicates in my preferences (below)
created a duplicates smart group (below)
created an RTF item called test 4.rtf and converted it to md, HTML, PDF (paginated), and PDF (one page)
added some annotations (yellow highlights) to both types of PDF

What I found

The RTF, HTML, and PDF (one page) items are all listed in the smart group of non-strict duplicates, which is normal behaviour according to the documentation, but the PDF (paginated) is not.
The PDF (one page) remains a duplicate even after annotating.

Conclusion concerning non-strict duplicates

Bug? The PDF (paginated), which is the format I always use, does not appear as a non-strict duplicate.
I suggest you add to the documentation the fact that a PDF with native annotations remains a non-strict duplicate (which I think is a very good thing).

thank you

Quote from the documentation on duplicates:
For example if you convert a rich text document to a plain text one, the contents will be the same in both and they will be shown as duplicates. You can force DEVONthink to consider the file type and size when detecting duplicates by enabling Stricter recognition of duplicates in Preferences > General.
DEVONtechnologies | How to Find and Remove Duplicates.
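
To make the difference described in the quote concrete, here is a minimal Python sketch of a toy model. It is not DEVONthink's actual algorithm, which neither the quote nor this thread describes. The `Item` class, `content_key`, `strict_key`, and `are_duplicates` are hypothetical names, and the file sizes are made up for illustration. The only idea taken from the documentation is that the non-strict mode compares contents while the stricter mode also considers file type and size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for a database item (not a DEVONthink type)."""
    name: str
    kind: str   # file type, e.g. "rtf" or "txt"
    size: int   # file size in bytes (values below are invented)
    text: str   # plain text extracted from the item


def content_key(item):
    # Assumption: non-strict comparison looks only at the words in the text.
    return tuple(item.text.lower().split())


def strict_key(item):
    # Assumption: stricter comparison also requires the same type and size.
    return (content_key(item), item.kind, item.size)


def are_duplicates(a, b, strict=False):
    key = strict_key if strict else content_key
    return key(a) == key(b)


rtf = Item("test 4.rtf", "rtf", 2048, "Hello duplicate world")
txt = Item("test 4.txt", "txt", 21, "Hello duplicate world")
```

In this toy model `are_duplicates(rtf, txt)` is true because the words match, while `are_duplicates(rtf, txt, strict=True)` is false because the type and size differ.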

cgrunenberg · February 24, 2022, 7:35am

The layout of PDF documents and e.g. line & word breaks might affect the indexed text and therefore the recognition of duplicates.
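
As an illustration of that point, here is a small Python sketch under stated assumptions. The `indexed_words` function is a hypothetical toy indexer, not DEVONthink's, and the two sample strings are invented. It only shows how a word hyphenated across a line break in a paginated layout can end up as different tokens than the same word in a single-page layout.

```python
import re


def indexed_words(extracted_text):
    # Toy indexer (assumption): lowercase alphanumeric tokens, de-duplicated.
    # It does not rejoin words that were hyphenated at a line break.
    return sorted(set(re.findall(r"[a-z0-9]+", extracted_text.lower())))


# Same sentence as it might be extracted from two different PDF layouts.
one_page = "Stricter recognition of duplicates considers the file type"
paginated = "Stricter recog-\nnition of dupli-\ncates considers the file type"

same_index = indexed_words(one_page) == indexed_words(paginated)
```

Here `same_index` is false: the paginated text yields fragments such as "recog" and "nition" instead of "recognition", so a content-based comparison would no longer see the two items as identical.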

rufus123 · February 24, 2022, 7:38am

yes, it makes sense. thank you.
