How to clean up duplicates

Andreas76 · March 27, 2020, 11:27am

During this corona period I wanted to clean up my duplicate documents.

Some of the duplicates recognized by DT are already correctly filed in one folder and exist a second time in another. Furthermore, some are correctly named but in the wrong folder, and some are not yet in the right one.

So I don’t want to move all duplicates to the trash; I want to view the duplicates and decide whether to keep one or the other.
It is absolutely not practical to open every single duplicate via the inspector bar and navigate back and forth.
It would be nice to have a view where the duplicates are listed with their paths directly underneath each other, but I can’t find a meaningful sort criterion.

What is the best way to proceed?


cgrunenberg · March 27, 2020, 11:52am

See e.g. Duplicates need to be moved to trash. You could also use Data > Convert > Duplicates to Replicants.

jmriggio · March 27, 2020, 3:54pm

That option seems to be unavailable. What conditions will enable it? What exactly is a replicant? I really don’t need or want the duplicates; they are unnecessary and are the result of merging different folders from different computers into one DT3 database.

BLUEFROG · March 27, 2020, 4:25pm

You need to select duplicates. This is often easily done in the Duplicates smart group in a database.

What exactly is a replicant?

From Help > Documentation > Appendix > Glossary…

And Getting Started > DEVONthink Simplified…

Andreas76 · March 28, 2020, 12:23am

Sorry if this is a stupid question, but I don’t get it.
I already have a smart group that shows the duplicates.

I have about 2,000 duplicates (because of past migration and consolidation problems), and what I want to achieve is a view like this:

| duplicate1 | name of first file | location of first file
| duplicate1 | different name of second file | location of second file

| duplicate2 | name of first file | location of first file
| duplicate2 | different name of second file | location of second file

…

This way I could easily go through the list and choose, say, the second file to delete in the first example and the first file in the second example. I don’t want to automate this so that the first file is always deleted.
Because the duplicates have different file names, I can’t get a view like this that shows the corresponding files together.

BLUEFROG · March 28, 2020, 3:29pm

If you’re handling these manually, you can see the location of duplicates in the Info inspector’s Instances dropdown.

Andreas76 · March 28, 2020, 8:31pm

Yes, but as I said, that is not practical with 2,000 documents. Clicking on every document, clicking on the list box, and navigating back and forth is confusing and time-consuming.

So there is only the automatic deletion, with no way to influence what gets deleted?
I am not familiar with scripting, but might a workaround be a script that loops through every duplicate and fills a custom field with a sequential number? I could then sort a view on this column to keep the documents together in the right order.

padillac · March 30, 2020, 7:10pm

Consider sorting by size. It’s not perfect (some distinct files have the same exact sizes), but the overwhelming majority of my duplicates list (over 4k) have the duplicated files right next to each other.
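
For anyone who wants to handle the same-size caveat outside DT, here is a minimal sketch in Python. It assumes the files are reachable as ordinary files on disk, which may not match how a DT database stores them, and the function name `group_by_size_then_hash` is made up for this illustration. It groups by size first, then uses a SHA-256 digest to separate distinct files that happen to have the same size. Byte-identical content is this sketch's definition of a duplicate; DT's own duplicate detection may use different criteria.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_size_then_hash(paths):
    """Return lists of paths whose contents are byte-for-byte identical."""
    # First pass: bucket by file size, which is cheap to read.
    by_size = defaultdict(list)
    for p in map(Path, paths):
        by_size[p.stat().st_size].append(p)

    groups = []
    for size in sorted(by_size):
        candidates = by_size[size]
        if len(candidates) < 2:
            continue  # a unique size cannot have a duplicate
        # Second pass: same size is only a hint, so confirm with a digest.
        by_digest = defaultdict(list)
        for p in candidates:
            digest = hashlib.sha256(p.read_bytes()).hexdigest()
            by_digest[digest].append(p)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return groups
```

Each returned list is one set of identical files, so two different documents that merely share a size never end up in the same group.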

Andreas76 · April 1, 2020, 6:49am

Thanks @padillac, what a simple but great idea. In Finder I would have done it that way by instinct. I don’t know why I wasn’t thinking of this within DT. Maybe I expected some AI stuff to give me an advanced view. Indeed, it works quite well so far.
But just out of interest, to make it more precise and standardized in future, I would like to know whether a script could achieve what I was thinking of. A yes or no would be enough … I can tinker with it myself for fun when I have time.
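
The numbering idea itself is easy to sketch. The following Python example is not a DT script and does not use DT's scripting interface. It only illustrates the logic on plain data, assuming each record can be exported with a name, a location, and some content fingerprint. The keys `name`, `location`, and `content_hash` and the function `number_duplicate_groups` are hypothetical names chosen for this illustration.

```python
from collections import defaultdict


def number_duplicate_groups(records):
    """Return duplicated records, each tagged with a sequential 'group' number.

    `records` is a list of dicts with the hypothetical keys
    'name', 'location', and 'content_hash'.
    """
    # Collect records that share the same content fingerprint.
    by_hash = defaultdict(list)
    for rec in records:
        by_hash[rec["content_hash"]].append(rec)

    numbered = []
    group_no = 0
    for members in by_hash.values():
        if len(members) < 2:
            continue  # not a duplicate, so it gets no number
        group_no += 1
        # Keep the members of one group next to each other in the output.
        for rec in sorted(members, key=lambda r: (r["location"], r["name"])):
            numbered.append({**rec, "group": group_no})
    return numbered


if __name__ == "__main__":
    sample = [
        {"name": "Invoice 2019", "location": "/Finance", "content_hash": "aa"},
        {"name": "Letter", "location": "/Inbox", "content_hash": "bb"},
        {"name": "scan_0012", "location": "/Inbox", "content_hash": "aa"},
        {"name": "Letter to bank", "location": "/Bank", "content_hash": "bb"},
        {"name": "Unique note", "location": "/Notes", "content_hash": "cc"},
    ]
    for rec in number_duplicate_groups(sample):
        print(f'| duplicate{rec["group"]} | {rec["name"]} | {rec["location"]}')
```

Sorting on the resulting group number puts each set of duplicates on consecutive rows, whatever their file names are. In DT the number would have to be written into a custom field by a script; that part is left open here.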