Files in database replaced by files from another - original files lost

Long-term user of DEVONthink with many databases.
Opened a database that I update infrequently (DATA1) and searched for some key documents. Titles came up as expected; however, the file contents behind those titles belong to a completely independent DT3 database (call it DATA2). Further searches found significant corruption, with files from DATA2 having replaced DATA1 files (although each record still carries its DATA1 name).
Did reindex, verify & repair, rebuild, etc.
Checked inside file contents and found (incorrect) files - original files do not seem to be anywhere in DATA1 contents.
DATA1 contains my original files BUT even my oldest backups are unlikely to have a sound database.
I suspect the culprit is the recent “improvement” to DT3 that improves overall database verification.
Suggestions please as these are important documents.

*** UPDATE ***
Thank God for backups - I have found an old backup of the database and accessed the lost files. Can now progressively restore DB.
Notwithstanding, I would love to know how files from a completely unrelated database moved and replaced my original files. I only ever transfer a file to another database on a case-by-case basis, never hundreds of files at once. Guess I have a long task of checking every file.

I suspect it’s going to be helpful to know at least:

  • What version of macOS you are using
  • What exact version of DT3 you are using
  • Whether you are synchronising databases between devices and, if so, what method you use
  • Whether the relevant files were indexed or imported in the relevant database(s)

Stephen

Thanks Stephen.

MacOS 11.6.2 (Big Sur)
DT3 Version 3.8
No synchronisation between devices, only to iCloud (CloudKit)
Files were imported over time (8 years) from various locations but indexed in DT3
My initial analysis of one section of the database indicates around 300 inconsistencies - which I am presently fixing by using Finder (Add to DEVONthink) then deleting the rubbish files.

Does File > Verify & Repair Database report any issues? So far it’s unclear what exactly is wrong and what might have caused it.

Verify and Repair Database does not report any issues.
Working through the corruptions, I have found the files come from multiple DT3 databases (they are completely unrelated and some have not been opened for many years). Basically these files have simply replaced good ones.

As DEVONthink doesn’t mix databases or replace files on its own and as there’s no sync there aren’t many possible reasons left. Maybe scripts or smart rules? But as the files/folders are indexed, of course any user or app could change, duplicate or delete indexed items and it’s now impossible to figure out when/why this happened. In which folder are these indexed items located? Is it a cloud folder?

No, the databases are held in a local folder on my Mac. iCloud (CloudKit) is used for sync.
My present view is that the recent DT3 upgrade that is meant to improve database stability and accuracy is the contributor. As I have about 12 DT3 databases I progressively opened them post the upgrade and DT3 did its thing - multiple DBs were open at the same time during this process. Prior to the upgrade I did a verify and repair on all databases in prep for the upgrade. My backups (Carbon Copy Cloner) show the DBs had the right records in mid Oct 21 - I am stripping files from this backup to restore the DB. Once this is up to scratch I will do some checks on other DBs to see if they are impacted. I can find no pattern in the replacements - mainly PDFs and EML files - but my DBs mainly consist of these items.

Which version did you use before 3.8?

I believe it was 3.7 but I do not keep track of program versions. Basically, I update once a version has been in the wild for some time, there are no reports of major issues or micro updates, and I believe I reasonably understand the changes.

I’m sorry, I’m not sure I understand all of what you are saying:

What does that mean? Were the files imported to DT (i.e. using the move to database command) from indexed locations? Or more succinctly: were the files which were impacted indexed or imported? This is an essential distinction.

Inside which file? Checked how?

In what sense is the database divided into sections?

Are you saying the files have not been opened for years, or the databases have not been opened for years?

Does this mean that only one single device (your Mac) syncs to iCloud, and no other devices sync to that cloud store?

Hi Blanc.
Terminology is the issue - the affected files were added to DEVONthink then tagged. By adding I mean in Finder I use the Add to DEVONthink script, in Mail I use Message | Add to DEVONthink. They are not an external source indexed in DT3 nor are they imported from another DT3 DB.
Checking files by visual inspection of contents, e.g. if the file name indicated it was a Financial Statement for a period then I would expect to see that, BUT in a lot of cases I simply got PDFs of other documents not even related to the database (but certainly in another DT3 database).
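For anyone facing the same check across hundreds of files, the visual inspection could in principle be narrowed down with a script. What follows is only a minimal sketch, not anything DEVONthink provides: it hashes every file under one folder tree and reports those whose bytes are identical to a file under other folder trees. The function names are made up for illustration, and it assumes you point it at the document store inside each database package (or at copies from a backup, which is safer). It only reads files and changes nothing. Note that a document you deliberately keep in two databases would also be reported.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_by_hash(root):
    """Map content hash -> sorted list of file paths found under root."""
    by_hash = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            # Skip anything that is not a regular file, and skip empty
            # files, which would all share one hash and add noise.
            if not os.path.isfile(path) or os.path.getsize(path) == 0:
                continue
            by_hash[sha256_of(path)].append(path)
    return {digest: sorted(paths) for digest, paths in by_hash.items()}


def find_foreign_content(suspect_root, other_roots):
    """List files under suspect_root whose bytes also exist under other_roots.

    Returns a sorted list of (suspect_path, [matching_paths]) tuples.
    """
    suspect = index_by_hash(suspect_root)
    matches = []
    for other_root in other_roots:
        other = index_by_hash(other_root)
        # Hashes present in both trees mean identical file contents.
        for digest in suspect.keys() & other.keys():
            for suspect_path in suspect[digest]:
                matches.append((suspect_path, other[digest]))
    return sorted(matches)
```

To use it, call find_foreign_content with the folder for the suspect database first and a list of folders for the other databases second, then review each reported pair by hand before deleting or replacing anything.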
The database is set up in Group folders. For example the database is for a Trustee (or many Trusts). A group folder is set up for each Trust.
The corrupted database is only opened every three to six months to add new documents. Occasionally, I need to refer to older documents in that db. The other databases (which hold the documents appearing in the corrupted db) may be opened at least once per week, while others may not have been opened for two years. Files in those databases may be referred to OR the database may have new material added.
Syncing - your understanding is correct.
Hope this clarifies the position. My apologies if it was not clear in the first post.

Thanks for helping me understand; to me that sounds as if the structure of the database (by which I mean the internal workings of the database) has become corrupted (rather than the files themselves). As I understand it, the database consists of the records and a table with pointers to those records and additional metadata. I wasn’t sure from your initial description, but it sounds to me as if that “table” is corrupted. DEVONthink actually makes internal backups of that data. I’m not sure how useful that will be, now that you have started making changes to the database. Those internal backups are specifically for use together with customer support. I would suggest you open a support ticket and work through this with DT support directly.

In that case File > Verify & Repair Database… should actually report any issues (as long as it wasn’t already repaired or rebuilt).

Yeah. I’m working from a position of too little knowledge here, but the alternative explanation - namely that the files themselves have become corrupted (which presumably would require the pointers to the files in whatever the modern-day equivalent of the file allocation table is to be corrupted) - seems even less likely. But again, I admit to only rudimentary knowledge of the workings, which may well lead me to the wrong conclusions. My next step would be to see whether those internal backups do actually solve anything (trial & error) - but that’s not something I’m going to recommend somebody else do. That’s why I thought direct support via a ticket might be in order in this case; I hope that’s ok?

As always, I’d really appreciate feedback as to what the issue actually turns out to be.

I find myself wondering what role synchronisation might have played in this. Any synchronisation must (I would have thought) introduce some risk of corruption, even if it is usually reliable. Rather than being an extra layer of security, I would have thought that “unnecessary” synchronisation would add to the risks to the data. Just a thought.

I admit to having thought exactly the same thing. The bizarre thing is that the problem seems to cross databases though; that seems completely atypical to me - the databases are completely separate from one another as far as I know. There are a few things which cross the borders of databases (for example, the index, presumably) which to me could explain corruption when searching, but not when actually viewing the records. To my knowledge none of these “cross-database” utilities is involved in sync. Again, all this based on inadequate direct knowledge of the workings of DT; I’m obviously happy for the team to provide alternate explanations based on knowledge rather than guesswork.

Screenshots showing the issue plus a detailed description of what’s actually wrong would definitely be useful.

Regrettably the material is highly sensitive so screenshots would not be possible.
I propose to continue working through restoration from backup material - which is now secured in several locations.
Thanks to all for comments, suggestions, etc.

It’s alright, I’m highly sensitive too :smiley: If you need any more assistance, come back here :slight_smile:

UPDATE
After spending a good part of yesterday finding “corrupted” files (mainly PDFs) in the DT3 database (referenced earlier in topic as DATA1), adding the correct files to DATA1, tagging them, and then deleting each “corrupted” file, I felt I had made progress.
This morning I thought I would run File > Verify & Repair Database. Three orphaned files were discovered, with no other errors. On inspecting the orphans I found that they were the files that started this whole saga. Further, the files had been replaced with the corrupted files. A further check found that others I had meticulously deleted (after adding back the good files) were also corrupted.
I now am going down several tracks, being:

  1. Create a brand new empty DT3 database (call it DATA1NEW) and use the Move command to take good files across. Files corrupted in DATA1 will not be transferred. The backup (good) files will be added to DATA1NEW manually through the Finder script Add to DEVONthink 3. This should mean the new database has absolutely no instance of a corrupted file.
  2. Checked my backup regimes. Carbon Copy Cloner - only a one-way backup to backup media - considered not to be contributory. Sync.com to cloud drive - only one-way - considered not to be contributory. DT3 iCloud (CloudKit) sync - low-possibility contributor as there is no other device for the iCloud sync to sync with - thus it is effectively a one-way sync. Synology Drive - two-way sync - possible contributor to issue.
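As a rough aid for the restoration described above, a known-good backup copy of a folder could be compared against the live copy to list which files now differ. This is only a sketch under stated assumptions: the function names are hypothetical, and it assumes files keep the same relative path in the backup and in the live folder, which nothing in this thread confirms for a DT3 database package. It reads both trees and modifies neither. A file you legitimately edited since the backup will also show up as changed.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(backup_root, live_root):
    """Compare every file in the backup tree with its live counterpart.

    Returns (changed, missing): relative paths whose contents differ,
    and relative paths that exist in the backup but not in the live tree.
    """
    backup_root = Path(backup_root)
    live_root = Path(live_root)
    changed, missing = [], []
    for backup_file in sorted(backup_root.rglob("*")):
        if not backup_file.is_file():
            continue
        relative = backup_file.relative_to(backup_root)
        live_file = live_root / relative
        if not live_file.is_file():
            missing.append(relative.as_posix())
        elif file_digest(live_file) != file_digest(backup_file):
            changed.append(relative.as_posix())
    return changed, missing
```

Calling compare_trees with the backup folder first and the live folder second returns two lists of relative paths, which could serve as a worklist of files to restore from the backup.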

Further action taken. DT3 databases taken out of the Synology Drive sync process. DT3 databases isolated from any two-way sync process (except iCloud (CloudKit), which is only working with one device). Continuing restoration of DATA1NEW.

My present analysis - looking in the wider context - is that Synology Drive (two-way sync) is not handling files in the DT3 database store correctly (the store can be found in Finder by looking at Package Contents | Files.noindex). How and why may be to do with settings, file date stamps, etc. I will see if this brings stability to the whole affair.

This post is made only to assist others and is in no way a reflection of any concern I have with DT3, which is a solid product in my view and that of many other users.
