First “full” test completed – all 1185 e-mails with attachments out of 3605 in my 2024 archive processed in 35 minutes, resulting in 777 truncated e-mail messages successfully imported from the generated .mbox.
I’ll be able to skip any attachment whose encoded size is smaller than the base64-encoded equivalent of the minimum size I want to leave in the e-mail, so that’s one improvement to implement.
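A rough sketch of how that skip check could work, in Python purely for illustration (it is not the script's actual code). The function names, the 76-character line length and the one-byte line break are all assumptions; adjust them to match what the .mbox really contains.

```python
import math

def base64_encoded_size(raw_bytes: int, line_length: int = 76,
                        newline_bytes: int = 1) -> int:
    """Estimate the size in bytes of raw_bytes of data once base64 encoded.

    Assumes the encoded text is wrapped every line_length characters and
    that each line (including the last) ends with newline_bytes bytes.
    """
    # Every 3 raw bytes become 4 encoded characters, padded to a multiple of 4.
    chars = 4 * math.ceil(raw_bytes / 3)
    lines = math.ceil(chars / line_length)
    return chars + lines * newline_bytes

def should_skip(encoded_size: int, min_keep_raw_bytes: int) -> bool:
    """True if an encoded attachment is small enough to leave in the e-mail.

    min_keep_raw_bytes is a hypothetical setting: the minimum decoded size
    worth extracting.
    """
    return encoded_size < base64_encoded_size(min_keep_raw_bytes)

if __name__ == "__main__":
    threshold = 100 * 1024  # hypothetical 100 KB minimum
    print(base64_encoded_size(threshold))
    print(should_skip(50_000, threshold))
```

Comparing encoded sizes directly means small attachments never need to be decoded just to find out they are below the threshold.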
I want to investigate the progress bar display.
I still haven’t put any try blocks in (happy it made it all the way through the e-mails without them).
And I want to put a “firstRun” function in to set minimum size to keep, etc.
But one thing I’ve realised today is that extracting the attachments from the e-mails has another benefit. It’s not just decreased storage if I make duplicate attachments replicants (I’ll probably only do that selectively): because base64-encoded data takes up 33% more space than the unencoded data, I will save some space that way, too!
Looking at the relative sizes of things, the imported attachments for those 777 e-mails total 732.2MB – base64 encoding would increase that to 976.3MB, so a saving of almost a quarter of a GB of space on one mailbox alone.
The saving would be a little more due to the extra carriage return every 76 (usually) characters of encoded data, but that would make less than 1.5% difference in my calcs.
The .mbox with the truncated e-mail is 74.9MB, and the original untruncated e-mails in DEVONthink report as 1.1GB, so everything’s back-of-the-envelope consistent.
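For anyone who wants to check the back-of-the-envelope numbers, here is the arithmetic as a small Python sketch. The figures are the ones quoted above; the variable names are just illustrative, and the one-byte line break per 76 characters is an assumption.

```python
raw_mb = 732.2    # imported (decoded) attachments
mbox_mb = 74.9    # .mbox holding the truncated e-mails

# base64 turns every 3 bytes into 4 characters, i.e. a 4/3 size ratio.
encoded_mb = raw_mb * 4 / 3
saving_mb = encoded_mb - raw_mb

# Assumed: one extra line-break byte per 76 encoded characters.
line_break_overhead = 1 / 76

# Truncated messages plus the encoded attachments should roughly
# add up to the size of the original untruncated e-mails.
estimated_original_mb = mbox_mb + encoded_mb

print(f"encoded:  {encoded_mb:.1f} MB")              # 976.3 MB
print(f"saving:   {saving_mb:.1f} MB")               # 244.1 MB
print(f"overhead: {line_break_overhead:.2%}")        # 1.32%
print(f"original: {estimated_original_mb:.1f} MB")   # 1051.2 MB
```

The estimate of about 1.05GB sits close to the 1.1GB that DEVONthink reports, which is as consistent as this kind of rough sum can be expected to get.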
A good few days’ work.
A few loose ends to tie up before the first pre-release here in the forums, but after a little more of that tomorrow, I’ll be happy to share what I have for a few others to test.
Good night, y’all!
Sean