DTTG 3 is corrupting files

Thank goodness I won’t. I wouldn’t have a clue what to do with them, I’m just some random guy :wink: But Eric will have them, and that’s a good thing :+1:

My bad Blanc, newbie here, not sure who is DT and who is a random guy like me :wink:

That DEVONtechnologies guy would be me, the random but insanely helpful guy is @Blanc :slightly_smiling_face: Thank you for sending the logs.

The accounts of the DT staff carry a small green nautilus-shell badge on their avatars, attached to the edge of the actual icon.

What I don’t understand so far: I have 457 corrupted files in two databases. One of them is the global Inbox, which is created by DEVONthink. The creation dates of the corrupted files go back to the end of 2018.
That was when I switched from a mainly Mac-based to an iOS-only workflow. I created PDFs with a ScanSnap (and its OCR engine). These scans, and PDFs I received digitally (such as email attachments), were moved into DEVONthink via the iOS Files app.
So my feeling is that the corrupted files come from this iOS-based workflow.

I still have DTTG 2 on my devices at the moment, alongside DTTG 3.
In my case the corrupted files are spread across all DTTG 3 instances and are also on my Mac in DTPO 3, I think because of the sync.


In the DTTG 2 instances I can access all of these identified files! (I think sync was disabled there by the upgrade process to DTTG 3.)

So maybe there is a way for you DT guys to build a recovery routine?

Edit: I also just remembered that in two or three cases this iOS workflow via the Files app didn’t hand the PDF over to DTTG 2 correctly: the files weren’t accessible in DTTG 2. So I repeated the import via the Files app, and after the second try they were accessible in DTTG 2.

@MauriceB’s post further up suggests things are not quite so simple, or are multifaceted.

That is an important piece of information, especially considering that, if I understand correctly, the legacy data is still there even if DTTG2 has been deleted; DTTG3 deletes it sometime down the road. (Where on Earth did I read that? But I did…! Here. @eboehnisch, I’m sure you will have thought of this long before I did, but is it possible to preserve that legacy data for the time being? I’m kind of assuming it might be possible to retrieve “lost” data from that source, even if DTTG2 has been uninstalled. Would DTTG2 “see” the data again if it were reinstalled?)

Same here: the only file affected on my side was one added on the Mac, synced to DTTG, and then opened via the share sheet in GoodReader on the i*OS device. I added a highlight there, which got transferred back to DTTG. Sometime after that, the file size went downhill. Of course, having a backup helped.

Maybe you’re right, @Blanc, and I can’t be completely sure, because in my case it affects a lot of files over quite a long period. But I think the important thing, as @chrillek also said before, is that it affects files which pass through the iOS share sheet. In my case my iOS workflow always went through this:

As for storage usage on the iPhone, I also noticed that in my case the current usage is 27.25 GB, whereas before installing DTTG 3 and upgrading it was about 14.5 GB.

Would you have had any reason to use the iOS share sheet on these files? That is, sharing from or to DTTG?

I can concur that at least one of my 0-byte files still contains its data under DTTG2 (which I didn’t remove either), and that file hasn’t been moved from its location.

I must also say I was a bit surprised that syncing took quite long following the DTTG3 upgrade. It looked like DTTG3 started migrating all kinds of files, but there didn’t seem to be any need to do so.

As far as I understand, the problem seems to be a metadata issue, since DTTG 3 does not actually download whole files during migration. It would therefore be reasonable to assume that the corrupted files might still be accessible from the sync store but need to be re-read completely by v3. For files already in the ghost group, such a strategy could be implemented as a voluntary fix in one of the next updates, because doing more harm to them is unlikely.

@eboehnisch Eric, I installed 3.0.2 this morning and I have two PDF documents in the Ghost smart group; the file info for each shows a size of 0 bytes. However, the documents appear to be fine: one is 18 pages and the other is 24 pages, and all the content is normal. I don’t know if this info is helpful at all, but I wanted to pass it on just in case.

I can add to that the following:

I’ve fairly accurately pinpointed the time when the 0-byte files appeared, based on an inspection of Time Machine backups of my global inbox.

One of my 0-byte files was a noticeable 50MB zip file that was (and should be) present in the inbox. It’s also the file I mentioned above that is still available in DTTG2.

Feb 10
release DTTG3 (is this correct?)

Feb 10
updated two of my iOS devices with DTTG3

Feb 11-12
trouble using family sharing

Feb 13
bought second DTTG3 license
synced that iOS device
sync issues with some error about data loss
removed DTTG3 from that device

I’ve now copied the Inbox packages from Time Machine dating from Feb 10 at 1AM (I was asleep), Feb 11 and Feb 15. What I found was the following:

  • All inboxes have an identical size in Finder (about 400MB)
  • The Feb 10 Inbox holds the 50MB zip according to Finder
  • The Feb 11 and 15 Inbox show the zip as 0-bytes, but the total Inbox package size has not changed

Update: the package size is clearly wrong. Finder correctly calculates the expected 50MB difference in the Files.noindex folder when I compare the packages that were stored in Time Machine before and after February 10th. So:
- the packages all show about 400MB
- the Files.noindex folder has decreased as expected by 50MB in the package of February 11th

Update 2: according to Finder, in the package of February 11th (one day after the upgrade) three zero-byte files were modified on February 10th around 13:55. A file I happened to annotate by accident was changed on February 10th at 13:54. The metadata file was changed on February 10th at 14:01.
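
For anyone who wants to repeat this comparison without clicking through Finder, here is a minimal sketch in Python. It assumes you have copied two snapshots of the package out of Time Machine into ordinary folders; the function names and paths are made up for illustration and nothing here is a DEVONthink tool. It sums the real file sizes (instead of trusting the package size Finder reports) and lists files that had content in the older copy but are 0 bytes in the newer one.

```python
import os
from pathlib import Path


def file_sizes(root):
    """Map each regular file's path (relative to root) to its size in bytes."""
    root = Path(root)
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # Skip symlinks so every file is counted once, at its real location.
            if path.is_file() and not path.is_symlink():
                sizes[path.relative_to(root).as_posix()] = path.stat().st_size
    return sizes


def total_size(root):
    """Sum of all file sizes below root, computed file by file."""
    return sum(file_sizes(root).values())


def emptied_files(before_root, after_root):
    """Return (relative path, old size) for files that had content in the
    older snapshot but are 0 bytes in the newer one."""
    before = file_sizes(before_root)
    after = file_sizes(after_root)
    return sorted(
        (rel, size)
        for rel, size in before.items()
        if size > 0 and after.get(rel) == 0
    )
```

Hypothetical usage: `emptied_files("/tmp/Inbox-Feb10", "/tmp/Inbox-Feb11")` would return the relative paths and former sizes of files that were emptied between the two copies, and `total_size(...)` on each folder gives the difference Finder failed to show at package level.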

Thank you for letting us know. That’s just wrong metadata. Version 3.0.3 will repair that on the fly.

Any chance that you could send us zipped copies of all these inboxes? This might help with troubleshooting the issue.

During the migration of the old data store no data is downloaded at all. What happens is that V3 makes a copy of the folder that contains the database and all accompanying files. With APFS that just takes a fraction of a second and consumes no additional space. It then uses the copy, migrates the database schema, and that’s it. Individual files are not touched, at least not intentionally. All error messages like “Couldn’t move … into the database package” and similar are not part of the migration from V2 to V3 but happen later, as part of a sync.

That’s why our current theory is that the problem is actually caused by problems in the sync store itself, which is then downloaded to DTTG3.

If any of you has a sync store on WebDAV or Dropbox that presumably holds the problem, could you share that sync store (.dtSyncStore folder) with us so that we can have a look?

That sounds correct as DEVONthink To Go 3 is technically copying the version 2 data store. But as it’s an APFS copy (Apple’s new file system) it doesn’t physically use all that space.

I would, but I can’t find it. I’m using WebDAV on a Synology with DSM 6.x. In my DEVONthink folder, I have
.DS_Store DT.dtCloud @eaDir #recycle
and in DT.dtCloud I see

09b8bafd2d664164cac0aeab533876724187c0f65e56255174ce5f25903b375c
4071363aacbeb8ea1ad26584a4ec9463c3fa46d3bbc31acf0ac0955569ba9883
7cba81b0c71fc3f8b7135c7d6a933cd51baecb26d8c3cf66c23197c51db572e5
7ce2ed434f942e4d75d0a25aa5a3cb663163042e75cf948dc3ada0ec4f1c7a29
9e99eeb4607550a68d0c330d7ec6685542067d0f4f91d78e06004ff6a193f669
c37f1dfa93bf26dd2c873af5546e6f163a3f01d02e7fdbec7738e5cfcd9eb98d
e2cd4d73d5b9ace5b7a4d75333e5d3a3e054201cff3f539ca3bc8491874b3784
inbox

All these entries are directories. I did a find . -name .dtSyncStore, which didn’t turn up anything.
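
One caveat on that search: find . -name .dtSyncStore only matches entries whose name is exactly .dtSyncStore. If the store is a folder whose name merely ends in that extension (an assumption; this thread does not confirm how the store is named on a WebDAV server, or whether the DT.dtCloud folder plays that role there), a suffix match is needed. A minimal Python sketch with a made-up helper name:

```python
import os
from pathlib import Path


def find_by_suffix(root, suffix=".dtSyncStore"):
    """Return relative paths of all files and folders below root whose
    name ends with the given suffix."""
    root = Path(root)
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if name.endswith(suffix):
                full = Path(dirpath) / name
                matches.append(full.relative_to(root).as_posix())
    return sorted(matches)
```

Calling it with `suffix=".dtCloud"` on the folder listed above would report the DT.dtCloud directory; the shell equivalent of the suffix match is find . -name '*.dtSyncStore'.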

Thanks for the 3.0.2 update this morning and the clear instructions. I was able to sort out (replace or delete) the ghost files before my coffee was done, turn automatic synchronization back on, and resume marveling at the powerful database in my pocket.

Thanks for all your efforts on this issue.

While I did have 13 ghost files (love that term, btw), most of them were some variation of “untitled.md” or equivalent — so my suspicion is that they just happen to have been flagged by your ghost criteria, and aren’t the result of anything to do with upgrading from DEVONthink To Go 2.x > 3.0.

Onward and upward!