Eml files missing / Database repeatedly corrupted

Hi all,

I have one (out of 9 in total) database that repeatedly misses files and appears to be damaged. Maybe it is the process or the file size (>20GB)?

Every year I archive my emails (approx. 20,000/yr) in that database by simply dropping my .eml files into DEVONthink. This worked fine until around 2018, when I realized that lots of files were missing. The database appeared to be damaged (I verified it and checked its integrity). I exported the files (also from backups) and created a new database.

Now I realize that the issue has continued: many files from 2018 to 2021 are missing again, and the new database is corrupted. So I’m back to square one. Backups are not an issue, but making a new database and retrieving old backups sucks.

  1. This seems to apply only to .eml files.
  2. Other databases do not have that problem, but none of them store .eml files. Nor are they as big as this one (this one is >20GB, others are 6GB max).
  3. The database has been copied over the years to new computers, of course, but other databases have been copied, too.

Currently on a MacBook M1 Pro with macOS Monterey 12.2.1, DEVONthink 3.8 Personal. Any ideas where I could look?

Thanks, Hans

Are or were the databases located in a cloud folder like iCloud Drive or Dropbox?

No, never ever. :wink:

Additional Question:

After dragging/exporting the folders from that database onto the desktop (assuming that existing files would be copied, and non-existing ones ignored), only about 50% made it to the new destination. The same goes for file size: the DEVONthink database is 21GB, the resulting folder 11GB. Where is the missing 10GB?

Or does that mean the files are available, but not referenced/found?

Thanks, Hans

I guess I can answer my own question: I copied the files from one year directly from the Finder location inside the DEVONthink database onto the desktop, and it turned out to be about 50% of the files that DEVONthink reports. That means they are gone. Also, no orphans.

You are aware, though, that a record will not necessarily be in the same Finder location as another record just because it is in the same group within DEVONthink?

Also: are you sure all the files you expected to be present were actually all ever present? DEVONthink will not import a second email with the same ID as an already present email.

Perhaps that 50% are orphans (real files inside the database that are not recorded), but verifying the database should report them.

I’ve archived ~300,000 mails dating back to 1999 in three separate databases (one per decennium) and have not experienced any problems like this. I’ve mainly used mbox to import.

That’s not how it works. A DEVONthink database also takes space for the metadata (do you use a Spotlight index?). As e-mail is mostly text the databases are often 2x the size of the imported mail, because they’re text + metadata (which is almost the same as the original text).
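To see how much of the 21GB is actual mail and how much is everything else, a rough per-extension breakdown of the database package can help. This is only a minimal Python sketch: the function name `size_by_extension` is made up for illustration, and it simply walks whatever folder you point it at without assuming anything about DEVONthink's internal layout. Run it on a copy or backup of the package, not on the live database.

```python
import os
import sys
from collections import defaultdict


def size_by_extension(root):
    """Return {extension: (file_count, total_bytes)} for all files under root."""
    totals = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file; skip it
            # Lower-case the extension so ".EML" and ".eml" are counted together.
            ext = os.path.splitext(name)[1].lower() or "(none)"
            totals[ext][0] += 1
            totals[ext][1] += size
    return {ext: (count, size) for ext, (count, size) in totals.items()}


if __name__ == "__main__" and len(sys.argv) == 2:
    # The path is supplied by you, e.g. a copy of the database package.
    report = size_by_extension(sys.argv[1])
    # Largest consumers first.
    for ext, (count, size) in sorted(report.items(), key=lambda kv: -kv[1][1]):
        print(f"{ext:12} {count:8d} files {size / 1e6:12.1f} MB")
```

If the `.eml` row comes out at roughly the 11GB that ended up on the desktop, the rest of the package is something other than mail files; if it is far larger, files exist on disk that the export did not pick up.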

Is there anything in the log? As @Blanc says: double message-ids don’t get imported twice normally. Are you sure all e-mails have unique message-ids?

If you know how to work in a Terminal / shell: what does grep -r -i -h -e "^message-id\:.*$" * | sort | uniq | wc -l tell you when run from inside the directory containing your .eml files? It’s not exact (e.g. with some bounces or forwards) but it will give you a good indication of the number of unique message-ids.
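If grep gives trouble, the same count can be sketched in Python with the standard library's email parser. This is a hedged alternative, not what DEVONthink itself does: `message_id_counts` is a made-up name, and the script assumes the files end in `.eml`. Unlike the grep line, it reads only each file's top-level header block, so Message-ID lines quoted inside bounces or forwards are not counted, and header names are matched case-insensitively.

```python
import os
import sys
from collections import Counter
from email.parser import BytesHeaderParser


def message_id_counts(root):
    """Return (Counter of Message-IDs, number of .eml files without one)."""
    parser = BytesHeaderParser()
    counts = Counter()
    missing = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(".eml"):
                continue
            with open(os.path.join(dirpath, name), "rb") as fh:
                headers = parser.parse(fh)  # stops after the header block
            # Collapse folding whitespace so wrapped headers compare equal.
            msg_id = " ".join(str(headers.get("Message-ID", "")).split())
            if msg_id:
                counts[msg_id] += 1
            else:
                missing += 1
    return counts, missing


if __name__ == "__main__" and len(sys.argv) == 2:
    counts, missing = message_id_counts(sys.argv[1])
    duplicates = {mid: n for mid, n in counts.items() if n > 1}
    print(f"files with a Message-ID:    {sum(counts.values())}")
    print(f"unique Message-IDs:         {len(counts)}")
    print(f"Message-IDs used > once:    {len(duplicates)}")
    print(f"files without a Message-ID: {missing}")
```

If the number of unique Message-IDs is far below the number of files, duplicates being skipped on import would explain at least part of the gap.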

  • Are you filing copies of emails into other mailboxes in Mail?
  • Have you ever looked at DEVONthink’s Window > Log?

  1. Yes, but I looked at one of the smaller folders, as they’re organized by year depending on the creation date. In DEVONthink it’s 87MB, dragged onto the desktop it’s just 8MB. I then drilled down in the Finder and went through every f***ing folder and copied files with a creation date from 2007 into another folder - still 8MB.

  2. Yes, the files are missing because Devon tells me so. See attached.
    Screen Shot 2022-05-10 at 20.31.30

no, no orphans in any folder

Good idea, but I get grep: invalid option–?

no filing of copies of emails. The log is full of “missing files”. ;-)))

As your databases are located in the Documents folder - did you ever synchronize this folder via iCloud Drive? Only the latest releases refuse to open databases located in this folder if it’s synchronized via iCloud Drive but earlier 2.x releases didn’t.

No, I do not sync my files with iCloud, Dropbox, or anything else.

Do you have backups of the database available for restoration?

Sure, but only back to 2018, and they all seem to be missing those files. I am also not even sure anymore that I completely rebuilt the database in 2018, as I said, since all versions are corrupted and files are missing in all years since 2006. Then again, even older DEVONthink databases (not containing .eml files) are fine; I never had an issue with DEVONthink databases before.

I don’t know if there will be any pertinent information but you can hold the Option key in DEVONthink and choose Help > Report bug to start a support ticket. At least we could look at the logs.

done! The logs say a lot about missing headers…
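
Since the logs mention missing headers, it may be worth checking the source .eml files themselves before re-importing. Below is a minimal Python sketch under stated assumptions: the thread does not say which headers the log complains about, so the `REQUIRED` tuple (Message-ID, Date, From) is only a guess, and `files_missing_headers` is a made-up name. It lists every `.eml` file whose top-level header block lacks one of those fields, which also catches empty or truncated files.

```python
import os
import sys
from email.parser import BytesHeaderParser

# Assumption: these are the headers worth checking; adjust to match the log.
REQUIRED = ("Message-ID", "Date", "From")


def files_missing_headers(root, required=REQUIRED):
    """Return {path: [missing header names]} for .eml files under root."""
    parser = BytesHeaderParser()
    problems = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith(".eml"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                headers = parser.parse(fh)  # header block only
            # A header counts as missing if it is absent or blank.
            missing = [h for h in required
                       if not str(headers.get(h, "")).strip()]
            if missing:
                problems[path] = missing
    return problems


if __name__ == "__main__" and len(sys.argv) == 2:
    for path, missing in sorted(files_missing_headers(sys.argv[1]).items()):
        print(f"{path}: missing {', '.join(missing)}")
```

Files flagged here are candidates for the ones DEVONthink struggles with, though that link is a hypothesis to check against the actual log entries.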