Would it be possible to add duplicate detection based on contents, rather than type+size?
It would be nice if this included metadata as well. For example, if I have two files and one has a URL and the other doesn’t, they wouldn’t be considered duplicates.
Duplicate detection doesn’t consider size and type unless you’ve enabled stricter duplicate detection in Preferences > General.
Metadata isn’t part of the content of a file so it wouldn’t factor into a contents-based duplicate detection. Development would have to assess extensions to the detection mechanism.
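As a rough illustration of what a purely contents-based check could look like, here is a minimal Python sketch that groups files by a SHA-256 digest of their bytes. This is not how the application detects duplicates; the function names `content_digest` and `find_byte_duplicates` are made up for this example. Because only the file's bytes are hashed, metadata stored outside the file (such as a URL) would not affect the result.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def content_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's bytes, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_byte_duplicates(paths):
    """Group paths whose file contents are byte-for-byte identical.

    Hypothetical helper: external metadata (URL, tags, comments) is never
    read, so it cannot influence the grouping.
    """
    groups = defaultdict(list)
    for p in paths:
        groups[content_digest(p)].append(Path(p))
    # Only groups with more than one member are duplicates.
    return [g for g in groups.values() if len(g) > 1]
```

A check like this is stricter than a text-based comparison: two PDFs with the same visible text but different internal encoding would not match.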
I have a small number of files (PDFs) that are completely different inside, yet they are marked as duplicates when “strict” checking is disabled. However, I’m interested in this “soft” way of checking for duplicates because it is able to find very similar files that really are duplicates I want to get rid of.
My solution is to have a tag called .false_match and then modify the duplicate smart group to ignore duplicates with that tag.
(BTW, I start the tag name with a period to indicate that it is a “system” tag and not a normal one.)
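The logic of that smart group can be sketched in Python as follows. This is only a model of the filter, not the application's smart group syntax or API; the `Record` class and its `is_duplicate` and `tags` fields are hypothetical stand-ins for whatever the database exposes.

```python
from dataclasses import dataclass, field

# Tag used to mark items that were wrongly flagged as duplicates.
FALSE_MATCH_TAG = ".false_match"


@dataclass
class Record:
    """Hypothetical stand-in for a database item."""
    name: str
    is_duplicate: bool
    tags: set = field(default_factory=set)


def duplicates_to_review(records):
    """Keep items flagged as duplicates unless tagged as a false match."""
    return [
        r for r in records
        if r.is_duplicate and FALSE_MATCH_TAG not in r.tags
    ]
```

Tagging a wrongly matched PDF with .false_match then removes it from the list without turning on strict checking for everything else.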
The rendered content is not different in this case. Both documents contain a single word: bookends. If the source of the document weren’t showing, it would certainly appear that the files have the same content.
Development would have to assess modifying this behavior.
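To make the distinction concrete, here is a small Python sketch comparing two sources that differ as bytes but produce the same visible text. It assumes, purely for illustration, that the documents are HTML; the thread doesn't say what format the files actually are, and the helper names are invented for this example.

```python
import hashlib
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, ignoring the tags around them."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_text(html_source):
    """Return the text content with whitespace normalized.

    Simplification: script/style contents are not filtered out.
    """
    parser = _TextExtractor()
    parser.feed(html_source)
    parser.close()
    return " ".join("".join(parser.parts).split())


def same_bytes(a, b):
    """True only if the two sources are byte-for-byte identical."""
    digest_a = hashlib.sha256(a.encode("utf-8")).digest()
    digest_b = hashlib.sha256(b.encode("utf-8")).digest()
    return digest_a == digest_b


def same_rendered_text(a, b):
    """True if the two sources show the same text once markup is removed."""
    return visible_text(a) == visible_text(b)
```

With `"<p>bookends</p>"` and `"<div><b>bookends</b></div>"`, `same_bytes` is false while `same_rendered_text` is true, which is the kind of match described above.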