Indexing is causing duplicate entries in DT

Has anyone noticed duplicate entries in DT for indexed files? Only started happening with the latest version.

I just tried with a different database and it’s reproducible so now I have to temporarily stop using the indexing feature while I figure this out. :cry:

I do not have a large number of indexed files (merely about 150) but can certainly confirm it’s not happening for me: there are no duplicates of my indexed files (using DEVONthink 3.8.2 and macOS 12.2.1).

Does DEVONthink show the items as duplicated? Were they indexed before (in an earlier version of DEVONthink) without duplication?

Clearly, any further information you can provide will help in tracing the problem; I imagine that if this were a common problem there would already be a number of threads here about it.

Stephen

I have about 1,000 files in each database (at least the ones I experimented on). There are no duplicates in my source directory structure. I ended up creating a fresh database and indexing again. I will monitor to see if the problem occurs with the new databases.

I have 30,000 indexed items in a database and I’m not seeing any unexpected duplicates.

I don’t have many; but not seeing the phenomenon… 3.8.2 on 12.2.1

Why are you indexing?
I’m guessing you’re indexing in a cloud-synced location.

Thanks to everyone who responded - I’m definitely going to chalk it up to an issue on my local computer.

In response to Jim - yes - I’m indexing a cloud-synced location. I haven’t had an issue in the past, and I’m guessing it’s a combination of an update from OneDrive and DT that probably presented a weird corner case. And yes, I know you’re probably going to remind me that DT does not technically support cloud-synced locations. :wink:

And Microsoft recently changed the default behaviour of their OneDrive product to store your files exclusively on their servers and delete them from your local device “to save you disk space”. Perhaps look into that as part of the root cause?

You’re mixing two things here - sync and indexing. Indexing a cloud location is supported only if that location has a local mirror which contains the files in question. Until now, that was the behaviour shown by OneDrive. As I noted here that behaviour is changing; Microsoft have more details on those changes here.

Whilst other users have reported no problem with the change, because the original local OneDrive folder is replaced by a symlink to the new folder, indexing will still require that you make the local files permanent (see the Microsoft support document I referenced above on how to do that). Three things to do, then:

1. Before proceeding, please make sure you have a backup of the files in your OneDrive folder. Although to my knowledge there have been no reports here of data loss with regard to these changes, I still assume, based on my limited knowledge, that data loss is possible in some cases.
2. Set the files to be permanently available locally.
3. Possibly change the pointer to the location of the files in DT from the previous location to the new location (again, the support document will provide more details).

I would guess that the last two changes might solve your problem.
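To see whether the old folder really has been replaced by a symlink, a small check like the following may help. It is only a sketch: the function name `describe_folder` is made up for illustration, and `~/OneDrive` is an assumed location that may differ on your Mac.

```python
import os


def describe_folder(path):
    """Return (is_symlink, resolved_path) for a folder path.

    Hypothetical helper for illustration; not part of DEVONthink or OneDrive.
    """
    expanded = os.path.expanduser(path)
    # islink() reports whether the path itself is a symbolic link;
    # realpath() follows any links and returns the final location.
    return os.path.islink(expanded), os.path.realpath(expanded)


if __name__ == "__main__":
    # "~/OneDrive" is an assumption; substitute your actual folder.
    is_link, target = describe_folder("~/OneDrive")
    print("symlink:", is_link)
    print("resolves to:", target)
```

If the first value is `True`, the second shows where the files actually live, which is the location an index would ideally point at.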

The issue could have arisen - I guess - when you synced DT across two Macs which were using different OneDrive versions (and thus different file paths). Is that possible? Obviously then the solution would be to make sure all devices are using the same setup and software versions.

“Obviously then the solution would be to make sure all devices are using the same setup and software versions.”

Including the most current version of DEVONthink: 3.8.2. :slight_smile:

@Blanc Thank you for this - I wasn’t even aware Microsoft was making “drastic” changes to the OneDrive application.

To answer some of the other questions: my folders are always pinned, and I don’t use DT across multiple Macs.

“and I don’t use DT across multiple Macs.”

But you’re using OneDrive across multiple computers, correct?

Is there any rhyme or reason to the files showing duplication, e.g., a certain file type, files in a certain Finder folder, etc. ?

Not sure if I’m answering your question correctly but there is no duplication in my Finder folder(s) or on OneDrive (just being explicit). The duplication occurred when I triggered an update to the index and only exists in DT.

Given the responses here and the changes in OneDrive, I now firmly believe it’s related to the changes in OneDrive. Why it didn’t occur previously, or whether it was really triggered by the new update from DT, I have no clue. Creating a new DB and repeating the indexing process is working just fine, and I haven’t seen any issues since the “do-over”.

And yes, I’m using OneDrive on two computers - an Apple MBP and a Microsoft Surface device. I’ve been using this combination for a while now so nothing new was changed here.

Ok … I found the issue … it’s a OneDrive issue. I still have a few larger DBs (with indexed items) that I haven’t “fixed” yet, and I was able to see why the duplication is occurring.

The “duplicate” entries are coming from ~/Library/CloudStorage/OneDrive … vs the original entries coming from ~/OneDrive …
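For anyone wanting to confirm the same thing on their own machine, here is a rough sketch that groups a list of file paths by the real file they resolve to. The function name `find_aliased_paths` is hypothetical, and how you obtain the list of indexed paths from DT is left open; the sketch only assumes you have them as plain strings.

```python
import os
from collections import defaultdict


def find_aliased_paths(paths):
    """Group paths that resolve to the same real file.

    Hypothetical helper for illustration. Returns a dict mapping each
    resolved path to the list of input paths that point at it, keeping
    only the entries reached by more than one input path.
    """
    groups = defaultdict(list)
    for path in paths:
        # expanduser() handles "~"; realpath() follows symlinks.
        real = os.path.realpath(os.path.expanduser(path))
        groups[real].append(path)
    return {real: aliases for real, aliases in groups.items() if len(aliases) > 1}
```

If one path goes through a symlinked folder and another goes directly to the new location, both end up under the same key, which would match the kind of "duplicate" described above.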

Thanks for the investigation and follow-up. It’s very appreciated. :slight_smile:

Duplicate items in the database having the same path? Or duplicated files in the filesystem? In addition, where are the indexed items located?

@wally said a few posts up that he found the cause: Microsoft OneDrive.

Criss, the OP detailed their discoveries and solution a couple of posts further up.