OK. Looking for a way to handle the 5k+ duplicates across my 20 databases, I searched the forum and found a 2019 thread where none other than Jim “The Bluefrog” commented that DT now (then) does it natively via Convert > Duplicates to Replicants.
Heck, I hadn’t noticed that until now, and it is a real LIFE SAVER.
The only thing I had to pay attention to is when the duplicates have different names, but this is easy to handle by sorting by size and keeping an eye on the location, kind, and dates. It happened infrequently anyway.
This is why I LOVE DEVONthink. It has been what, almost 18 years, I think (?), and every week I discover something new and exciting, and above all USEFUL.
Of course, I should, as Jim says, always be reading the documentation, Help, news, hints, etc. I will, I promise.
And you can enable Preferences > General > General > Stricter recognition of duplicates. This will take file size and file type into account when detecting duplicates.
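To picture what the stricter setting changes, here is a rough Python sketch. It is not DEVONthink's actual detection code, which this thread does not describe. It assumes a lenient match keys on some content fingerprint alone, while a strict match also requires the same file size and file type. `Record`, `fingerprint`, and `group_duplicates` are made-up names for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    name: str
    fingerprint: str  # hypothetical stand-in for whatever content comparison is used
    size: int         # file size in bytes
    kind: str         # file type, e.g. "pdf" or "rtf"


def group_duplicates(records, strict=False):
    """Group record names that count as duplicates of each other.

    Lenient mode keys on the content fingerprint only; strict mode also
    requires the same size and file type (an assumption for this sketch).
    """
    groups = defaultdict(list)
    for rec in records:
        key = (rec.fingerprint, rec.size, rec.kind) if strict else (rec.fingerprint,)
        groups[key].append(rec.name)
    # Only groups with more than one member are duplicates.
    return [names for names in groups.values() if len(names) > 1]


records = [
    Record("report.pdf", "abc", 1200, "pdf"),
    Record("report copy.pdf", "abc", 1200, "pdf"),
    Record("report.rtf", "abc", 900, "rtf"),
]

lenient = group_duplicates(records)
strict = group_duplicates(records, strict=True)
```

With these sample records, the lenient grouping puts all three files together, while the strict grouping keeps only the two PDFs, since the RTF differs in size and type.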
Done that! - I should probably try this on my test database, but now that you’re here...
How are labels, tags, etc. treated when duplicates are converted into replicants? Or you can just tell me “read the documentation” and I’ll go there.
It’s not appearing to behave as expected.
When selecting the duplicates…
The Data > Convert > Duplicates to Replicants is removing the duplicates, leaving only one file.
On the resulting file, tags are preserved and applied…
Labels are lost.
Custom metadata is lost.
@cgrunenberg would have to comment on this further.
When selecting a file with duplicates, but only selecting that file, the replicant is created.
And as it is with many things in DEVONthink, context matters.
The attributes of the selected file will be preserved. This is logical as a replicant is an instance of a singular file.
Makes sense, Jim. The other approach would be to do what some dedupe apps do, which is to display the duplicates of each record side by side and ask whether the user wants to select one as master or take the union of properties (sometimes not possible, so again asking the user to choose). But that’s of course tedious, and you are 100% right: context matters if you don’t want to lose data.
I’ve only just started using custom metadata, so for most of my items right now it is tags and labels, but what I’ve been doing is copy-pasting the labels onto the labelless siblings.
Another thing I was very pleased to see is that I can add custom metadata columns to the list view - awesome!!
They’re good inquiries for sure.
I’m not sure of the behavior when multiple duplicates of the same file are selected, but when handling one duplicate file it works as it should from what I see.
Indeed - custom metadata are available as List view columns
I very rarely have 3 or more dupes - what I am doing is sorting and re-sorting the dupes list by name, size, and location, and I very occasionally spot 3 dupes. I am also copy-pasting the name that I find best onto the siblings, knowing that DT will then just have a single name.
The attributes are not merged (especially as that’s not even possible in case of e.g. different labels).
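The point about labels can be illustrated with a small Python sketch of the "union of properties" idea raised earlier in the thread. This is only an illustration, not how DEVONthink works (as stated, it does not merge attributes). It assumes tags behave like a set and that an item carries at most one label; `Item` and `merge_attributes` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Item(NamedTuple):
    name: str
    tags: frozenset
    label: Optional[str]  # assumed: at most one label per item


def merge_attributes(items):
    """Return (merged_tags, label, conflicting_labels) for a group of duplicates.

    Tags can always be merged by set union. A single label only exists
    when all labelled items agree; otherwise the caller has to choose.
    """
    merged_tags = frozenset().union(*(item.tags for item in items))
    labels = {item.label for item in items if item.label is not None}
    if len(labels) > 1:
        # Different labels cannot be combined into one value.
        return merged_tags, None, sorted(labels)
    label = next(iter(labels), None)
    return merged_tags, label, []


a = Item("invoice.pdf", frozenset({"tax", "2023"}), "Important")
b = Item("invoice copy.pdf", frozenset({"tax"}), None)
c = Item("invoice scan.pdf", frozenset({"scan"}), "To Do")

no_conflict = merge_attributes([a, b])
conflict = merge_attributes([a, c])
```

Merging `a` and `b` is unambiguous because only one of them has a label. Merging `a` and `c` yields the combined tags but no label, plus the list of conflicting labels, which is exactly the case where a tool would have to ask the user to pick.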