Deleting Duplicate Files

Every new user of DEVONthink has to begin somewhere. So. I’ve mistakenly included a duplicate of each new file I’ve entered. I’d like to clean up my database. Would someone please tell me, baby step by baby step, how to get rid of all these duplicates? Thanks for your patience.

Just create a smart group with the condition Instance is Duplicate. Afterwards select all results of the smart group and choose Scripts > Data > Move Duplicates to Trash.

Thank you for your prompt reply.
One of the challenges in using DEVONthink is terminology with which I’m unfamiliar.
Am I correct that your word “condition” is synonymous with “operators and wildcards” when I create a smart group?
As you directed, I entered “Instance is Duplicate” in that space.
I expected to see hundreds of files.
Instead I found one.
As an experiment, I “selected” (highlighted?) that one file in the smart group and went to Scripts > Data > Move Duplicates to Trash.
And nothing happened.
What am I missing?
Again, thank you for your help.

No. See screenshot:

Criss, You show an example screen shot for a smart search for duplicate files. I’m trying to do the same thing, but “Instance” is not in the dropdown box you used. I’m using DT3b2. Has it been removed? Is there an alternative?
–Thanks

The former “Record” and “Instance” conditions have been merged, see “Item” condition.

The process does not look complicated:
Search the “Email” group > Any of the following are true:
Kind is Any Document
Item is Duplicated > OK

Select all the items that are shown as duplicates (I am assuming that the Move Duplicates to Trash script will leave one of the items behind).

Scripts > Data > Move Duplicates to Trash

Unfortunately it does not move items to Trash. It does not do anything.

Script:
-- Move duplicates from selection to trash
-- Created by Christian Grunenberg on Fri Feb 20 2009.
-- Copyright © 2009-2014. All rights reserved.

tell application id "DNtp"
    try
        set this_selection to the selection
        if this_selection is {} then error "Please select some contents."
        repeat with this_record in this_selection
            if number of duplicates of this_record > 0 then
                set this_database to database of this_record
                move record this_record to trash group of this_database
            end if
        end repeat
    on error error_message number error_number
        if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
    end try
end tell

Could you please post screenshots from before and after executing the script? Thanks.

Hello. Thank you.

  1. Identify duplicates
  2. Select all duplicates
  3. Move duplicates to Trash
  4. After executing script. Same as starting point.

Anything logged to Window > Log? Is the database read-only?

Log: Nothing gets logged
Database Read Only: I don’t know where it would be changed to Read Only. Not shown in Database Properties.

Question: Can the Script Editor app be the cause of the script not activating? I was doing some tutorials and could have done something to it.

This shouldn’t be an issue if you didn’t modify this script.

The script is as I posted it. Which was your original script.

The conditions of the smart group are wrong. Most, maybe even all of the returned emails are not duplicates. This should be changed to “All”:

[Screenshot, 2020-11-30 15:28:14]
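
To illustrate why “Any” returns too much, here is a minimal Python sketch. It is only a model of the matching logic, not DEVONthink’s actual code; the item names, the `is_document` and `is_duplicate` flags, and the `smart_group` helper are all hypothetical.

```python
# Hypothetical model of smart group matching; not DEVONthink's implementation.
items = [
    {"name": "invoice.eml", "is_document": True, "is_duplicate": True},
    {"name": "invoice copy.eml", "is_document": True, "is_duplicate": True},
    {"name": "newsletter.eml", "is_document": True, "is_duplicate": False},
]

conditions = [
    lambda item: item["is_document"],   # stands in for "Kind is Any Document"
    lambda item: item["is_duplicate"],  # stands in for "Item is Duplicated"
]


def smart_group(items, conditions, mode):
    """Return the names of items matching Any or All of the conditions."""
    combine = any if mode == "Any" else all
    return [item["name"] for item in items if combine(c(item) for c in conditions)]


any_matches = smart_group(items, conditions, "Any")
all_matches = smart_group(items, conditions, "All")
print(any_matches)  # every document matches, duplicate or not
print(all_matches)  # only the documents that are also duplicates
```

With “Any”, the unique newsletter is returned as well, because it satisfies the first condition alone. The script then skips it, since it has no duplicates.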

Is there a crossed-pencil icon to the right of the database’s name in the Navigate sidebar?

I found the problem: some messages were identical in title and size, but when I compared them, their time stamps differed by a few days, hours, or even seconds, so they were not true duplicates.

So I consider this closed. Thank you.
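
A small Python sketch of why such messages would not count as duplicates, assuming duplicate detection compares content rather than title and size (an assumption; the thread does not say how DEVONthink compares items). The sample messages and the `fingerprint` helper are made up for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Content fingerprint; SHA-256 is an illustrative choice, not DEVONthink's."""
    return hashlib.sha256(data).hexdigest()


# Two made-up messages that differ only in the seconds of the Date header.
msg_a = (b"Subject: Invoice\n"
         b"Date: Mon, 30 Nov 2020 15:28:14 +0100\n\n"
         b"Please find the invoice attached.\n")
msg_b = (b"Subject: Invoice\n"
         b"Date: Mon, 30 Nov 2020 15:28:19 +0100\n\n"
         b"Please find the invoice attached.\n")

same_size = len(msg_a) == len(msg_b)
same_content = fingerprint(msg_a) == fingerprint(msg_b)
print(same_size, same_content)  # prints "True False"
```

Same subject and same size, yet the content differs, so a content-based comparison treats them as two distinct messages.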

One last question related to searching for duplicates: when searching with this query (with All),
does the search select just the “extra copies” and leave at least one version somewhere in the database?
I am afraid to do a blanket search and delete, fearing that both copies of a message will be deleted. It’s paranoia gained from experience.

No, the search is returning all instances of the duplicates. Select all the duplicates you want to deal with, then run the script. It will preserve the last imported file.

Sorry, but I am a slow guy. If I run a duplicate search and it returns three copies of a document, do I select two of them and then run the script, so that it gets rid of those two and leaves the one?

No worries!
No, you select all three duplicates and run the script. The last imported one will be preserved.

If you want to preserve a specific one, just select the other two and delete them.
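
For anyone nervous about selecting everything, here is a hedged Python simulation of the script’s loop. It assumes that an item in the Trash no longer counts as a duplicate, which fits the behaviour described above; the `Record` class, the hash strings, and the file names are hypothetical, and the model knows nothing about import order.

```python
from dataclasses import dataclass


@dataclass
class Record:
    name: str
    content_hash: str  # stand-in for whatever identifies identical content
    trashed: bool = False


def number_of_duplicates(record, database):
    """Count other non-trashed records with the same content (assumed rule)."""
    return sum(
        1
        for other in database
        if other is not record
        and not other.trashed
        and other.content_hash == record.content_hash
    )


def move_duplicates_to_trash(selection, database):
    """Mirror the script: trash a record only while it still has a duplicate."""
    for record in selection:
        if number_of_duplicates(record, database) > 0:
            record.trashed = True


database = [
    Record("report.pdf", "abc"),
    Record("report-1.pdf", "abc"),
    Record("report-2.pdf", "abc"),
    Record("notes.txt", "xyz"),
]
move_duplicates_to_trash(selection=list(database), database=database)
survivors = [r.name for r in database if not r.trashed]
print(survivors)  # prints "['report-2.pdf', 'notes.txt']"
```

In this model, once two of the three copies are in the Trash, the remaining copy has no duplicates left and is skipped, so one copy always survives.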

Thanks, your answer is exactly what I needed so I don’t have to worry when I find hundreds (if not thousands) of duplicates. Q: Am I throwing out the baby with the bathwater? A: Nope. Just the water. Again, much appreciated.
