UUID issue with “previously imported” emails

I have been running various tests with email import and archiving from Apple Mail to DT3 lately, to gain familiarity with the different ways to import, archive, and generally deal with email in DT3.

I am in the process of dealing with a multi-year backlog of old emails which I’d like to keep in a searchable database in DT3, while at the same time implementing new workflows to keep Mail.app, DT3 and OmniFocus working nicely as a team. So far so good.

The only blocking issue I have is the pesky UUID issue with email that I already imported sometime in the past, which is preventing me from finalizing the import of a large MBOX file containing the old Archive mailbox.

The Archive is some 18 GB in size, with a little more than 73k messages. I tried to import it in a single shot, but the import failed at around 35%.

Now I have split the MBOX file into 3 chunks (very easy to do on the CLI, thanks to awk!) and, even after creating an entirely new database, the import reaches the end but gives the pesky “10130 previously imported” message in the log.
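
For anyone without awk at hand, a split like the one described above can also be sketched in Python with the standard library's mailbox module. This is only an illustration, not the awk command actually used; the function name split_mbox and the chunk-N.mbox file names are made up for the example.

```python
import mailbox
from pathlib import Path


def split_mbox(source, out_dir, chunks=3):
    """Split the mbox file at `source` into up to `chunks` smaller mbox files.

    Hypothetical helper: returns the list of chunk paths that were written.
    """
    src = mailbox.mbox(str(source), create=False)
    total = len(src)
    # Ceiling division, so that no more than `chunks` files are produced.
    per_chunk = max(1, -(-total // chunks))

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    out = None
    for index, message in enumerate(src):
        if index % per_chunk == 0:
            # Start a new chunk file; close the previous one first.
            if out is not None:
                out.close()
            path = out_dir / f"chunk-{index // per_chunk + 1}.mbox"
            out = mailbox.mbox(str(path))
            paths.append(path)
        out.add(message)

    if out is not None:
        out.close()
    src.close()
    return paths
```

Calling split_mbox("Archive.mbox", "chunks", chunks=3) would write chunks/chunk-1.mbox and so on. Splitting by message count keeps every message whole, but the chunks may still differ a lot in size on disk.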

Effectively, it looks like some 10k messages have their UUIDs already somewhere within the guts of my test DT3 setup.

Why does this happen with a brand new database, to begin with? Shouldn’t it be the case that UUIDs are only kept for documents which are in the open database? Maybe they had been imported in the Global Inbox and deleted, granted, so this might be the answer.

Thus the question now becomes, how can I fix this issue? Can the UUID “database” hidden somewhere in the guts of DT3 be purged altogether?

This is a separate test machine, so I would have no problem clearing out my DT3 installation and starting from a clean slate. Can I do this with some CLI commands, or do I need to completely uninstall and reinstall DT3?

I am quite good with the Unix CLI so I don’t have any issue in “going under the hood” if needed.

I very much look forward to using DT3 as a working repository for email moving forward!

Thanks,

Luca

What exactly failed? Was anything logged to Window > Log?

As DEVONthink looks for the message ID only in this database, and as it’s a new database, it’s more likely that lots of messages unfortunately have the same ID. Buggy mailing lists and other automated systems in particular could have caused this in the past.

Hallo Christian,

No, unfortunately. DT3 just crashed and its window disappeared after being stuck at the same number of imported messages for about half an hour. Some 11k messages had been imported; this was on a 2014 Mac mini with an i5 processor, running the latest version of Catalina with pretty much no application other than DT3.

Hmmm, this is bizarre: in other runs over the previous days on the same machine I did not see this issue, and more messages were imported.

What about imports done in the Global Inbox? Could that be the issue? That database is always open, isn’t it?

How is the UUID for email messages calculated? Is it really only the message-ID coming from the RFC822 headers?

Somewhere I had read that it was a combination of the message-ID, message size and some other data that would be the seed to create the UUID, precisely to avoid the issue with the same message-ID but effectively a different email, given its different size (and possibly some other metadata).

How can I start fresh with a brand new installation of DT3 on my machine while keeping the same license?

Do I just remove DT3 and all its hidden plist files (thanks Hazel for sweeping the dust from under the carpet for me!), and then proceed to a new installation of DT3?

Or do I need to de-license the old DT3 installation before removing it?

Bear in mind that the license will be on the exact same machine, of course…

Keen to test this in detail, as I don’t really want to mess up my email in my new workflows moving forward… thanks for any help!

Bye, Luca

Please choose Help > Report Bug while pressing the Alt modifier key and send the result to cgrunenberg - at - devon-technologies.com - thanks!

All other databases including the inbox don’t matter.

By default the message ID is used; a hash is used only when there is no message ID (again, only due to buggy software sending emails).
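
The rule described above (message ID first, hash only as a fallback) can be illustrated with a small Python sketch. It is not DEVONthink's actual code: the SHA-256 over the raw message is an assumption, since the hash inputs are not stated here, and dedup_key and count_skipped are hypothetical names.

```python
import hashlib
from email.message import Message


def dedup_key(message: Message) -> str:
    """Return the key used to decide whether a message was seen before."""
    message_id = str(message.get("Message-ID", "")).strip()
    if message_id:
        # Normal case: the Message-ID header alone identifies the message.
        return "id:" + message_id
    # Fallback for messages without a Message-ID.
    # Assumption: hash the whole raw message; the real inputs are unknown.
    return "hash:" + hashlib.sha256(message.as_bytes()).hexdigest()


def count_skipped(messages) -> int:
    """Count messages that would be reported as previously imported."""
    seen = set()
    skipped = 0
    for message in messages:
        key = dedup_key(message)
        if key in seen:
            skipped += 1
        else:
            seen.add(key)
    return skipped
```

Under this rule, two messages with the same Message-ID count as the same message even if their bodies differ, which is why duplicated or reused IDs inside a single MBOX file would show up as “previously imported” even in a brand new database.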

Actually I don’t think that this will make a difference. A screenshot of Preferences > Email might be useful.

@cgrunenberg debugging information sent. Vielen Dank! L

Hi @cgrunenberg,

I have never followed up with you on the duplicate UUID issue after all, so here I am.

My bad. My fault entirely. There were real duplicate emails in the MBOX files, and thus DT3 was behaving correctly all the time!

Yet, your suggestions on the settings to have in the email import page made a real difference, and I was able to sort out the full import without issues.

Many thanks for your always excellent support!

Bye, Luca

Something else to note (I’m away from my desktop at the moment): when you do the import, the right-hand pane shows the destination, which I don’t think is necessarily the database you have open.

The upside is that you can do the import with the wrong database selected without anything annoying happening. The downside is that if you want to import to the current database, you may have to adjust the destination.

which I don’t think is necessarily the database you have open.

if you want to import to the current database, you may have to adjust the destination.

Correct on both counts.