Indexing is causing duplicate entries in DT

I have 30,000 indexed items in a database and I’m not seeing any unexpected duplicates.

I don’t have many indexed items, but I’m not seeing the phenomenon either… DT 3.8.2 on macOS 12.2.1.

Why are you indexing?
I’m guessing you’re indexing in a cloud-synced location.

Thanks to everyone who responded - I’m definitely going to chalk it up to an issue on my local computer.

In response to Jim - yes - I’m indexing a cloud-synced location. I haven’t had an issue in the past, and I’m guessing it’s a combination of an update from OneDrive and DT that probably presented a weird corner case. And yes, I know you’re probably going to remind me that DT does not technically support cloud-synced locations. :wink:

And Microsoft recently changed the default behaviour of their OneDrive product to store your files exclusively on their servers and delete them from your local device “to save you disk space”. Perhaps look into that as part of the root cause?

You’re mixing two things here - sync and indexing. Indexing a cloud location is supported only if that location has a local mirror which contains the files in question. Until now, that was the behaviour shown by OneDrive. As I noted here, that behaviour is changing; Microsoft have more details on those changes here.

Whilst other users have reported no problem with the change, because the original local OneDrive folder was replaced by a symlink to the new folder, indexing will still require that you make the local files permanent (see the Microsoft support document I referenced above on how to do that). Three things to do, then. First, before proceeding, please make sure you have a backup of the files in your OneDrive folder: although to my knowledge there have been no reports here of data loss with regard to these changes, I still assume that data loss is possible in some cases. Second, set the files to be permanently available locally. Third, possibly change the pointer to the location of the files in DT from the previous location to the new location (again, the support document will provide more details). I would guess that the latter two changes might solve your problem.

The issue could have arisen - I guess - when you synced DT across two Macs which were using different OneDrive versions (and thus different file paths). Is that possible? Obviously then the solution would be to make sure all devices are using the same setup and software versions.

Obviously then the solution would be to make sure all devices are using the same setup and software versions.

Including the most current version of DEVONthink: 3.8.2. :slight_smile:

@Blanc Thank you for this - I wasn’t even aware Microsoft was making “drastic” changes to the OneDrive application.

To answer some of the other questions: my folders are always pinned, and I don’t use DT across multiple Macs.

and I don’t use DT across multiple Macs.

But you’re using OneDrive across multiple computers, correct?

Is there any rhyme or reason to the files showing duplication, e.g., a certain file type, files in a certain Finder folder, etc.?

Not sure if I’m answering your question correctly, but there is no duplication in my Finder folder(s) or on OneDrive (just being explicit). The duplication occurred when I triggered an update to the index, and it only exists in DT.

Given the responses here and the changes in OneDrive, I now firmly believe it’s related to the changes in OneDrive. Why it didn’t occur previously, or whether it was really triggered by the new update from DT, I have no clue. Creating a new DB and repeating the indexing process is working just fine, and I haven’t seen any issues since the “do-over”.

And yes, I’m using OneDrive on two computers - an Apple MBP and a Microsoft Surface device. I’ve been using this combination for a while now, so nothing has changed here.

Ok … I found the issue … it’s a OneDrive issue. I still have a few larger DBs (with index items) that I haven’t “fixed” as yet and was able to see why the duplication is occurring.

The “duplicate” entries are coming from ~/Library/CloudStorage/OneDrive … vs the original entries coming from ~/OneDrive …
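If anyone wants to check their own setup for the same thing, here is a minimal Python sketch of the idea. It is not something DEVONthink or OneDrive provides, and the function name `group_by_real_path` is made up for illustration. It assumes the situation described earlier in the thread, where the old folder is a symlink to the new one, and groups paths that resolve to the same file once symlinks are followed.

```python
import os
from collections import defaultdict


def group_by_real_path(paths):
    """Group paths that resolve to the same file once symlinks are followed.

    Hypothetical helper: the name and behaviour are illustrative only.
    """
    groups = defaultdict(list)
    for p in paths:
        # expanduser handles a leading "~"; realpath follows symlinks,
        # e.g. a legacy folder that now points at a new location.
        real = os.path.realpath(os.path.expanduser(p))
        groups[real].append(p)
    # Keep only real paths that are reached by more than one input path.
    return {real: ps for real, ps in groups.items() if len(ps) > 1}
```

Feed it the paths of the suspected duplicate entries: any group it returns is a single file on disk that is being reached by two different routes.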

Thanks for the investigation and follow-up. It’s very appreciated. :slight_smile:

Duplicate items in the database having the same path? Or duplicated files in the filesystem? In addition, where are the indexed items located?

@wally said a few posts up that he found the cause: Microsoft OneDrive.

Criss, the OP detailed their discoveries and solution a couple of posts further up.

I mentioned unexplained recent duplicates in this post (Deleting Indexed items - #7 by mikebore)

Since then I have searched for files with “copy” in the title and found several others in places I haven’t been to recently.
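For anyone who wants to run the same kind of search outside DT, this is a rough Python sketch. The function name `find_copy_named_files` and the default marker "copy" are my own assumptions, not a Dropbox or DEVONthink feature. It walks a folder and lists files whose names contain the marker, ignoring case, so it will also catch unrelated names such as "photocopy".

```python
import os


def find_copy_named_files(root, marker="copy"):
    """Return sorted paths under root whose file name contains marker.

    Hypothetical helper: matching is a plain case-insensitive substring
    test, so expect some false positives.
    """
    needle = marker.lower()
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if needle in name.lower():
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Point it at the top of the indexed folder and review the list by hand before deleting anything.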

Where are your indexed items located?

Dropbox

Before I started with DT, I saved a separate copy of all my documents. I have just looked through old folders in that copy from years ago, and they do not have the duplicates I now have.

Like an earlier poster, I only started noticing these unexplained duplicates when they appeared for files I modify weekly, so something has changed recently.

I guess you are likely to say it is a Dropbox problem, and you may be correct. Whether it is Dropbox or DEVONthink, I am thinking of going back to only using DT for imported files, and not just because of this issue. I am reaching an age where I value simplicity more than extra capability, and using DT on four sync’d devices with indexed files is not as simple as when everything was imported.

Is the option to duplicate items in case of conflicts enabled on one or more devices? Did you recently clean a sync store?