Huge email import (120+ GB) crashes — is DT made for this?

Hello there,

I’m new to DT (DT4). My main objective in switching to DT is to have a searchable database of my email archive that works on both mobile and desktop (I’m trying to move away from Gmail in a privacy-respecting way). I understand Keep It might also be able to do the job, but some people mentioned it’s less reliable than DT. I also looked at other options like:

  • Proton and Tuta, but their search options are not ideal;
  • MailArchiver X and MailStore, but neither syncs with mobile;
  • more unusual options like Notebooks App and Joplin, but they’re not really designed for that kind of task.

I understand DT is quite robust, and so I’d like to know if it’s possible in your experience, and what to look out for.

What I’ve tried so far:

Since I cannot test the full import, I’d like to know whether it’s possible to import this much data into DT and still have satisfactory search and a good overall experience. I’d also like to know whether search works on iOS with a partially synced database.

If you have other ideas that would work better, I’d be very grateful for any advice!

Have a nice day!

This was most likely caused by running out of virtual memory (especially if there’s not enough disk space left for sufficient virtual memory on the startup volume).

How many emails does the archive contain? Only plain text emails or also formatted or with attachments?

Thank you for your help, @cgrunenberg!

Thank you very much for this info!

Slightly more than 250 000 according to Gmail.

Formatted, with attachments, the full thing!

The advice in numerous previous posts here is to do the import in batches. Perhaps make your selections by year, or by month, or whatever size allows a successful import. All at once … well, maybe not.

And most attachments are e.g. documents like PDF?

Thank you, that’s what I also imagined afterwards. I wanted to avoid splitting the large mbox, since many of the scripts I found online didn’t work in my setup. But now that I found one, I’m happy with splitting. :slight_smile:

What I meant by asking for advice was also: is it actually possible to do what I want with DT? Since I can’t test my use case before buying, I wanted to ask here.

I have to admit I don’t know the statistics. I suppose lots of documents (PDF), probably also some images, a few zip files, just the random files I got sent over the years. I used to use Gmail like a personal archive (not very neat, but good enough), and am now trying to move away to something more private and, why not, also more organised and efficient.

You do want to import in batches, though: experiment to see how big or small a batch works for you. You can put them all into the same database, with no “splitting” needed, if that’s what you mean.

And you can put a lot of stuff into DEVONthink. That being said, don’t just make it a dumping ground for no reason other than you can do it. Just my view.

Yes, that’s clear. I meant splitting the mbox-file to be able to do an import in batches. Google Takeout left me with one huge file. I could have made labels and exported them separately, but didn’t want to go through that. That’s why I had to split the mbox file (unless there is an option inside DT to import only part of an mbox, but I haven’t seen it).
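
For anyone else stuck with a single huge Takeout file, here is a minimal Python sketch of that kind of split. It is not the script mentioned in this thread, just an illustration. It assumes the mbox uses the conventional separator (a line starting with "From " that follows a blank line) and that body lines starting with "From " are escaped by the exporter; I have not verified either for every Takeout export. The function name `split_mbox`, the `part-0001.mbox` naming, and the 1 GB default are all made up for the example.

```python
import os


def split_mbox(src_path, out_dir, max_bytes=1_000_000_000):
    """Stream src_path and write numbered .mbox chunks of roughly max_bytes each.

    A new chunk is only started at a message boundary, so no message is cut
    in half. Returns the list of chunk paths that were written.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunks = []
    out = None
    written = 0
    prev_blank = True  # treat the start of the file as a boundary
    try:
        with open(src_path, "rb") as src:
            for line in src:  # reads line by line, never the whole file
                # Assumed separator: "From " at line start, after a blank line.
                is_boundary = prev_blank and line.startswith(b"From ")
                if out is None or (is_boundary and written >= max_bytes):
                    if out is not None:
                        out.close()
                    name = f"part-{len(chunks) + 1:04d}.mbox"  # hypothetical naming
                    path = os.path.join(out_dir, name)
                    chunks.append(path)
                    out = open(path, "wb")
                    written = 0
                out.write(line)
                written += len(line)
                prev_blank = line in (b"\n", b"\r\n")
    finally:
        if out is not None:
            out.close()
    return chunks
```

Calling `split_mbox("All mail.mbox", "chunks")` (the file name is only an example) would leave a folder of smaller mbox files to import one at a time. Because the file is streamed, memory use stays small even for a 120 GB archive; the right chunk size for a smooth import is something to find by experiment.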

I understand. I’ll figure out what to put in it and what not as time goes by. For now, I wanted to start with the email archive, and for that specific use case, I don’t want to start sorting, I just want them in there and searchable. And I guess from your reply that should be possible. Thank you.

Yes. And yes, I have an email “dump” which goes unused, pretty much, in a dedicated email database.

Are you saying you only had one massive mailbox? Google Takeout lets you choose which mailboxes to export and exporting many smaller mailboxes would have been the better option.

No, I had some structure, but there was no set of labels where I could be sure that exporting all the sub-elements would, in the end, export the whole thing. That’s why I simply chose to export everything. Now that I have a working PHP script (that is also quite fast), I have my small mboxes. The 20 GB import looked like it would work alright (but I’ll have to wait until I have my licence to be able to test that).