Duplicates appearing?

I am indexing some folders and I have more and more duplicates appearing. There is only one file in the Mac folder, but it shows up as a duplicate in the DT database. Both records link to the same file in Finder.
I can delete those duplicates, but sometimes duplicates are created for a good reason (an issue during sync…).
How can I prevent duplicates from being created?
Is a “duplicate” file within DT exactly the same as the original? I have no idea which one is the original and which one is the duplicate. If they are identical, I can delete one without any doubt.
My database holds 8390 files. I had absolutely no duplicates 3 months ago and now have 486.
Any help would be greatly appreciated.
Thanks

What’s the path of these duplicates and which version of DEVONthink do you use?

I think most of the files are in my Documents folder (iCloud).
DEVONthink 3.8.6

As the indexed files are in iCloud, do you have the iCloud setting “Optimise Mac Storage” on or off?

Yes, it is on.

For what it’s worth, I’ve been seeing the same issue though all of my files are on local storage.

DEVONthink 3.8.6, syncing via Bonjour.

It’s weird, I can’t see any pattern or trigger. It’s across all of my databases—which are all indexed.

Huh, five more since yesterday.

On local storage where?

The system drive. And I don’t have any relevant smart rules running.

I’d thought it might have to do with sync (to another Mac, iPad and iPhone), but then I’d expect to see a correlation with the created or modified dates. Strange.

Where on the system drive?

~/Documents/DEVONfiles/databaseName/indexedFolders

In DEVONthink I created empty databases and then added the indexed folders.

Bad idea. That way, iCloud’s “intelligent” algorithm will do all kinds of things to your data, and DT will be confused because files suddenly disappear or reappear.

Don’t index data until you’re sure that you fully understand how it works and what the consequences are!
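To see whether iCloud has actually evicted files from an indexed folder, a rough check is to look for its placeholder files. This is only a sketch: it assumes that evicted files show up as hidden placeholders named like `.name.icloud`, which is how macOS versions of this period appear to behave, and the function name is made up for illustration.

```python
from pathlib import Path


def find_icloud_placeholders(root):
    """Return hidden '.name.icloud' files found anywhere under root.

    Assumption: an evicted iCloud file is replaced on disk by a hidden
    placeholder whose name starts with '.' and ends with '.icloud'.
    """
    return sorted(p for p in Path(root).rglob(".*.icloud") if p.is_file())


# Hypothetical usage, pointing at an indexed folder:
# for placeholder in find_icloud_placeholders(Path.home() / "Documents"):
#     print(placeholder)
```

If this lists anything inside a folder that DT indexes, those files are not fully present on disk at the moment.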

Are you also using macOS’ disk management with iCloud?

No, Desktop and Documents are excluded from iCloud Drive.

  • Are these files you’re actively editing?
  • What file types?

I have been pondering a post on Duplicates but have refrained so far until I have a pattern figured out better. But maybe it is related so I will raise the question.

I have about 5 databases which I sync among about 5 different devices (4 computers plus iPhone). It has worked well for years.

My main database got too large with cases I need to store but do not actively reference daily. So I created a database which I call “Archived Completed Cases” and when I am done with a case I use the Move feature in the menu to move the entire group to the Archived Completed Cases database. That Archived Completed Cases database is only on my main computer - not on any other devices.

I have noticed that after I move a case to the Archived Completed Cases database, the group is often re-created in my regular working database over the next few days. But it only has recent items, not the entire contents of the original group.

I have double-checked and have never found that I lost any documents. The incomplete “ghost” group presumably appears as a result of a sync from one of the other computers, and it has the same UUID as the original group. So I wind up with the same group UUID in two different databases, one complete and the other with duplicates of the most recent items.

OK, I switched off “Optimise Mac Storage”.
Now, how do I “clean” my DT database?

Since all your records are indexed, you could try to rebuild the index. That has been discussed here and it is most probably explained in the handbook, too.

You can’t “clean” a database. The term “clean” refers to deleting the sync files for a database at a remote sync location. For more information, see the TROUBLESHOOTING chapter in the outstanding DEVONthink Manual (page 183 in the Version 3.8.6 PDF) or in Help.

As for dealing with duplicates, there are lots of posts here. The first one I found was a DEVONtechnologies blog post, “Dealing with Duplicates in DEVONthink”.
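Separately from what DT reports, it may help to confirm on the Finder side that the indexed folder really holds only one copy of each file. A minimal sketch, assuming plain Python 3 and nothing DEVONthink-specific: it groups the files under a folder by the SHA-256 hash of their contents. The function name is hypothetical, and it reads each file fully into memory, so it is meant for ordinary documents rather than huge files.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_identical_files(root):
    """Return lists of files under root whose contents are byte-identical."""
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # Hash the file contents; the name and dates are ignored.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    # Keep only hashes shared by more than one file.
    return [paths for paths in by_hash.values() if len(paths) > 1]


# Hypothetical usage, pointing at an indexed folder:
# for group in find_identical_files(Path.home() / "Documents"):
#     print([str(p) for p in group])
```

If this finds nothing while DT still shows duplicates, the extra records exist only inside the database and not on disk.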

Is anything in the Trash of the working database?

Interesting question

Yes - I do not empty the trash routinely and often have older versions of files in it.

I gather you are suggesting the Trash may be syncing and going into a newly re-created ghost group rather than syncing to a Trash account?