So I just had my primary DT research database get corrupted, and have spent some twenty-four hours in a sheer panic, trying to save the three years’ worth of scholarly and teaching work that has been poured into it. I’m still trying to figure out the best way to fix things, but one thing is clear: when one has a package that is some 65 GB and something goes wrong with it, it’s a nightmare, no matter how well things are designed. One hopes that nothing ever goes wrong, and generally I have had no trouble with DEVONthink. But given that the virtue of DT is that it allows one to build HUGE databases, the other side of the system is that when these fail, there need to be multiple levels of support to get one’s data back.
So here are a few thoughts on this, with some suggestions for features and safeguards that might help prevent anyone else having the day I just had…
The new DT Pro 2 file format, which stores everything in regular files, is obviously superior to the old DT format. You now have some hope of getting your data back if DT becomes corrupted. The question is whether your organization of the data and your metadata can be recovered…
The problem is that if you have organized all your data in DTPro, those files (for me, some 80K files) are very disorganized in the package, and it will take months if not years to re-sort a raw dump of your files.
Just because your database passes the “verifying” check at DEVONthink’s startup or when opening databases does not mean that your database is clean. Mine had some 400 errors in it, but I had not noticed them, and it’s very hard to use Time Machine to keep restoring versions to guess when the whole package might have been good.
I had thought, foolishly, that a “rebuild database” command would always work to rebuild even the most corrupted database. I was wrong. I have had the rebuild database command crash on me 20 times in the last 24 hours, running it on different iterations of the recovered file on two different computers.
Given that “rebuild database” can crash, it is really too bad that there is no safeguard built into the rebuild command that would allow it to pick up where it left off after a crash. If it takes three hours on an eight-core machine to rebuild your database, and it crashes halfway through, this can add up to days of lost time. It was unbelievably frustrating to see DTPro create 215 GB of stuff in the “Recover Folder”, start to import it to a new database, and have it crash! I literally couldn’t get the program to get that far again. I would really like to see the rebuild database process be able to recover from its own crash and continue the rebuild once the program is restarted (and if the problem comes from a particular file, it would be so nice to notify the user and go on with the rest of the rebuild).
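To make the request concrete, here is a rough sketch of what I mean by a resumable rebuild. It is not how DTPro works internally (I have no idea how it does its import); `import_one` and the checkpoint file are made-up stand-ins. The idea is simply to write down which file is being handled before touching it, so that a restart can skip the file that killed the last run and carry on with the rest.

```python
import json
import logging
from pathlib import Path
from typing import Callable, Iterable

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")


def _save(state: dict, checkpoint: Path) -> None:
    # Write to a temp file, then rename, so a crash cannot leave
    # a half-written checkpoint behind.
    tmp = checkpoint.with_suffix(".tmp")
    tmp.write_text(json.dumps(state))
    tmp.replace(checkpoint)


def resumable_import(
    files: Iterable[Path],
    import_one: Callable[[Path], None],  # hypothetical per-file import step
    checkpoint: Path,                    # hypothetical progress file
) -> dict:
    """Import files one at a time, recording progress after each one."""
    state = {"done": [], "failed": [], "current": None}
    if checkpoint.exists():
        state = json.loads(checkpoint.read_text())
        if state["current"] is not None:
            # The previous run died while handling this file: treat it as suspect.
            logging.warning("previous run stopped on %s; skipping it", state["current"])
            state["failed"].append(state["current"])
            state["current"] = None
    handled = set(state["done"]) | set(state["failed"])

    for path in files:
        key = str(path)
        if key in handled:
            continue  # already dealt with in an earlier run
        state["current"] = key
        _save(state, checkpoint)  # record the file *before* touching it
        logging.info("importing %s", key)
        try:
            import_one(path)
            state["done"].append(key)
        except Exception as exc:
            # An ordinary error: note the bad file and go on with the rest.
            logging.error("failed on %s: %s", key, exc)
            state["failed"].append(key)
        state["current"] = None
        handled.add(key)
        _save(state, checkpoint)
    return state
```

The `failed` list at the end is exactly the report I would want to see: the handful of files that need attention, instead of a crash and three more hours of waiting.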
Time Machine support? It would be really, really nice if DTPro had some sort of Time Machine support built into it. I actually had a (desperation) dream about being able to just click on Time Machine inside a DT group and have it show me the deleted or missing versions of the file and allow me to recover them from a TM backup. This would make good sense for a program that uses an intentionally uninterpretable filing system.
I found that Back-In-Time from Tri-Edre software is the only thing that stood between me and suicide. It allows one to go through Time Machine backups and pull individual files much faster and more readily than does Apple’s interface, which works very poorly inside DT’s packages and their indecipherable folders. Too bad Back-In-Time is not scriptable, or I’d also be able to use it to recover each “missing file” in DTPro from the backup as I find it, sending it the full path. Instead, I have to copy the path, do a search, and then put the file into place inside the DTPro package. It’s tedious surgery, and kills LOTS of time. If you have 400 missing files and have to hunt through them in DT and then search for each one in Time Machine or BIT, you can drive yourself nuts. I still have a long weekend ahead of me.
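For anyone facing the same surgery, the copy-search-replace loop could in principle be scripted. What follows is only a sketch under some big assumptions: that your backups can be browsed as ordinary folders, that you already have a list of the missing files as paths relative to the database package, and that the package’s internal paths have not changed since the backup. The function name and parameters are my own invention. Try it on a copy of the package, not the real thing.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def restore_missing(
    missing: Iterable[str],  # paths relative to the database package
    package: Path,           # the live database package
    snapshots: List[Path],   # the same package inside each backup, newest first
) -> List[str]:
    """Copy each missing file from the newest snapshot that still has it.

    Returns the relative paths that no snapshot contained.
    """
    unrecovered = []
    for rel in missing:
        target = package / rel
        if target.exists():
            continue  # never overwrite a file that is already in place
        for snap in snapshots:
            source = snap / rel
            if source.is_file():
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(source, target)  # copy2 keeps modification times
                break
        else:
            # No snapshot had the file: it has aged out of the backups.
            unrecovered.append(rel)
    return unrecovered
```

Whatever comes back in the returned list is gone from every backup you pointed it at, which is the part you most need to know.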
I really, really wish that DTPro, when it verifies a database and finds missing files, would tell me at startup. I equally wish that it would create a “group” containing aliases to all missing files when it finds them, so I don’t have to hunt around for them and slowly discover my data rot. This is imperative for those of us who don’t have 50 TB of backup space, because it is important to catch the deleted file before it gets pushed off the back end of a Time Machine expiration cycle. Please, I am begging here. Consider this an urgent feature request, along with resumable rebuilds.
If there were a way to tell DTPro to log the entire rebuild process to the console or a log file, so I knew which file it died on, that would be nice.
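Until something like that exists, a crude outside check is possible, since DT Pro 2 keeps everything as regular files inside the package. This is only a sketch of the idea, with made-up function names: keep a list of every file in the package, and on each run report whatever was in the last list but is no longer on disk. It cannot tell a file you deleted on purpose from one that rotted away, and it knows nothing about DT’s own index, so treat the output as a list of things to look into.

```python
import json
from pathlib import Path
from typing import List


def list_files(package: Path) -> List[str]:
    """Relative paths of every regular file inside the package."""
    return sorted(
        str(p.relative_to(package)) for p in package.rglob("*") if p.is_file()
    )


def vanished_since_last_run(package: Path, manifest: Path) -> List[str]:
    """Report files listed in the saved manifest that are gone now,
    then replace the manifest with the current listing."""
    current = list_files(package)
    previous = json.loads(manifest.read_text()) if manifest.exists() else []
    gone = sorted(set(previous) - set(current))
    manifest.write_text(json.dumps(current))
    return gone
```

Keep the manifest file outside the package, run the check before each backup, and save the output somewhere, because each vanished file is only reported once.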
I do wonder whether there is enough rebuild information inside dt’s file system to actually rebuild everything from just the raw data. I would think that one should at least theoretically be able to destroy all the dtMeta files in the package and that the devontech_storage files would have enough information to rebuild the database from the nodes up. Nope.
I wonder if the problem in the previous point has to do with nested replicants. I have quite a number of folders that have parents nested in children that are in turn replicated in the parent. In other words: a parent contains replicants of children who in turn have replicants of the parent in them. Is this a no-no, and is it what is causing my database to not rebuild? If so, this should be fixed or not permitted (I prefer the former, of course).
Please, please consider a preference to give a “full report” of database checking at startup. I want to know the second my database goes bad, so I can fix it, rather than back up bad over top good.
I notice that my DEVONthink tends to crash a lot when its memory allocation climbs over 2.83 GB. Is this a magical number for malloc failures? I wonder if DTPro 2.0 has been adequately stress tested under low memory conditions. I tend to think my large database might be hitting the limits, but I wish I knew what the limit was, rather than imagining it a dark scary beast that will just kill me when I step off the magic trail.
Okay, that was a rant, but hopefully a helpful one. I wanted to type it up while all this trauma and injury is fresh in my mind. In short, I am hoping for:
- an option for a “full” verify and repair at startup
- automatic aliasing of all “missing files” at said verify
- a resumable rebuild command
- Time Machine support
- a complete node recovery “salvage” command that will rebuild dt’s storage and replicant structure.
- some guidance as to whether there are storage limits to DTPro under 32-bit malloc, or whether I just have “gremlins”. I need to know!
Don’t get me wrong, I love this program. But I’ve just been through a train wreck, and would like to do my part to prevent future disasters for myself and others.
p.s. In order to be maximally constructive, I will post a note below this one giving some tips on how to rebuild when the rebuild command doesn’t work.