Confused about the correct way to back up all my DEVONthink data, advice appreciated :)

We would not recommend you treat this as your primary, and certainly not your only backup.

> partly due to some bugs that I’ve experienced in DT3 related to the unresponsiveness of dialogue windows resulting in the failure to register (repeated) clicks on buttons, and the selection of files and folders.

> Can’t say that DT3 being buggy when performing the aforementioned simple actions has instilled confidence in DT3’s ability to upkeep its databases’ integrity.

You are not describing normal or inherent behavior here, so this should not be your criterion.
Also, have you opened a support ticket regarding said issues?

> There’s also the issue of possible database corruption that I haven’t looked into which would be fatal for a “knowledge worker” like myself.

Where are you getting this idea about “database corruption”?

as well as LaTeX editors that concern both the code and compile the material into a pdf file.

If you are using third-party applications that generate temp or collateral files, indexing would be a likely method for those apps. However, those collateral files will very likely be included in your databases.


There are some pretty advanced search engine tools already out there. What do you hope to gain from DT that you can’t get from something like Houdah Spot?


In addition to all of the other comments above on this very long thread, you’ve ended up with a relatively (compared to what it could be) complex environment. Whatever methods you pick for backup, it would be good to test some restores from those backups to make sure you can get back what you back up. Meantime, @bluefrog makes some excellent points and asks good questions.
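As a rough illustration of what "testing a restore" can look like, here is a minimal Python sketch. It assumes the backup is a plain zip archive, which is only one possibility: Arq and Time Machine use their own formats and have to be tested through their own restore functions. The function name `trial_restore` is made up for this example.

```python
import tempfile
import zipfile
from pathlib import Path


def trial_restore(archive_path):
    """Restore a zip archive into a scratch folder and return the file count.

    Assumption: the backup is an ordinary zip file. Raises ValueError if a
    member fails its checksum or if fewer files come back than were stored.
    """
    with zipfile.ZipFile(archive_path) as zf:
        # testzip() reads every member and returns the first corrupt name, or None.
        bad = zf.testzip()
        if bad is not None:
            raise ValueError(f"corrupt member in archive: {bad}")
        expected = [info for info in zf.infolist() if not info.is_dir()]
        # Extract into a throwaway folder so the live data is never touched.
        with tempfile.TemporaryDirectory() as scratch:
            zf.extractall(scratch)
            restored = [p for p in Path(scratch).rglob("*") if p.is_file()]
    if len(restored) != len(expected):
        raise ValueError(
            f"expected {len(expected)} files, restored {len(restored)}"
        )
    return len(restored)
```

A count match only shows that the archive is readable and complete; opening a few restored documents by hand is still worth doing.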


You might want to think again. If you back up only the paths, where does that leave you when your disk crashes or you inadvertently delete a file/folder/volume?
What you envisage is no backup at all. It only burns CPU cycles.

Backup concepts have been discussed thoroughly here before. No need to repeat all that, please use the search function.


“Database corruption” isn’t a thing. Have you found any posts reporting that it has happened in the last few years where it wasn’t caused by a specific action? I suspect the answer is no, because corruption rarely happens and if it does it’s usually a file or two, not an entire database. In addition, DT has safeguarding measures for your databases that you should be running regularly (I run checks weekly), which would alert you to any potential problems occurring with a file.

I don’t understand why you would do back-ups per app. This seems a huge waste of time and the chance of human error is high. Just back up your whole drive like tech developers have recommended for decades! I’m a fan of doing a second backup just of DT, but there’s no point in doing that if you’re just indexing as the files don’t reside in DT and you’re not backing up much of use. (If anything that’s another tick in favour of importing as far as I’m concerned, since you can then back up your database and be confident you’ve not missed any important files.)

I’m also not convinced that indexing all your locations is a good idea, though it’s your choice. There are so many moving parts and it will be so easy to break something. Most of us (as far as I can tell from chats in the forum) index a handful of locations if that (I personally index two). You need to understand exactly what is indexed and how indexing works so that everything works as expected, because actions you take in the native apps (e.g. Obsidian) could break things in your indexed DT database.

Re: Obsidian, you do need to index your vault; Obsidian can’t read it if it’s in DT. There are many people running both apps, and posts in both our forum and theirs about how to use them together and different workflows, but you definitely do want that vault folder indexed. (And backed up!!)


There must be something running on your Mac that’s getting in the way. For instance, I’ve seen Keyboard Maestro mysteriously intercept hot keys in Apple Numbers that I don’t think it should have.

For me, Devonthink is sufficiently reliable. I have experienced data loss with apps most people trust. Word, for example. I can’t remember losing anything in Devonthink and I run the check file integrity function on each of my Devonthink databases every week.

When you import files, they are moved but not modified and they don’t disappear into a database. A quick peek in the contents of a Devonthink database package reveals that files of all kinds are just files contained in regular folders. You could uninstall Devonthink and still get to all your files.

You can also export to web format, which lets a web browser navigate copies of your Devonthink files.


Amongst other more general & system-focused backups, I’ve started using Arq to back up specific databases to an external HDD for extra assurance. Would it be an issue to use one external drive for multiple databases being backed up (all separated by simple Finder folders within the drive) or would it be best to keep one drive per specific database to prevent any possible issues or any kind of possible cross-interference?

There is a whole section “A Word about Backups” in the outstanding “DEVONthink Manual” that you should read and understand. Page 19 of the 3.9.6 version, and in the app’s Help system.

I don’t really understand your concern about multiple disks or “cross-interference”, so I cannot comment. In any event, I recommend you set up a 3-2-1 backup regime. Lots of articles on the “web” about what that means. Also lots of past posts here on how others back up.
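For anyone who has not met the term: 3-2-1 is usually read as at least three copies of the data (counting the original), on at least two different kinds of media, with at least one copy off site. A small Python sketch of that check follows; the `Copy` class and the example entries are invented for illustration and do not describe anyone's actual setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of the data (hypothetical record type for this example)."""
    name: str
    medium: str    # e.g. "internal SSD", "external HDD", "cloud"
    offsite: bool  # True if stored away from the machine's location


def meets_3_2_1(copies):
    """Return True if the copies satisfy the usual reading of 3-2-1."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Illustrative plan: the original, one local backup, one remote backup.
plan = [
    Copy("Mac internal disk", "internal SSD", offsite=False),
    Copy("External drive", "external HDD", offsite=False),
    Copy("Cloud storage", "cloud", offsite=True),
]
print(meets_3_2_1(plan))
```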


I appreciate that there is a manual covering most broad topics, though I did indicate that mine was specifically about backups using Arq, which @BLUEFROG has already indicated is not covered by ordinary Devon support. He mentioned earlier up this thread that @eboehnisch might have some experience in this area.

My more general backup system follows the 3-2-1 principle. My question was about whether it’s okay to tell Arq to back up, say, 3 separate Devon databases to one external hard drive, or would it be best/safest to keep one drive per database? I’ve already backed one database up to a single drive, and I can see that there’s a mass of folders inside (with titles like ‘379674F7C-2876-4640-97FA-01234’, ‘temp’, ‘largeblobpacks’ and so on) - I just look at it and wonder “Is this going to be workable if I back up a different database to this drive as well?”

Hope that’s better explained for you.

There should be no issue with it. In fact, I would think it much less efficient to employ one drive per database.

If you want to commit one drive to machine backups and another to specifically backing up your databases (yes, plural databases), I wouldn’t see any particular problem with that outside of making sure you are diligent in connecting both drives to do your backups.

What does “ok” mean here? Do you mean “Will Arq do it?” Or do you mean “Is that a good idea or might it cause problems?” Or something else?

Again: What does “best” mean? If you have only one drive, you have a single point of failure. If you have one disk per database, the risk of losing all of them due to hardware failure or whatever might be smaller. OTOH, if you keep all disks at the same place and that burns down, it doesn’t matter how many disks you had. And using one drive per database is cumbersome and error-prone.

Why does it matter how Arq names its backup blobs? What does “workable” mean – you’re probably not asking if your disk will be getting dizzy by the weird names? But what are you asking?

Your questions are vague (to me), and whether something is “safe” depends on many factors, not only the number of disks. Personally, I don’t care: I back up all my DT databases to one location, but I have several of those: Outlook, NAS, Backblaze. And, more importantly: A backup of my disk, also to different locations. Having backups of the databases is more a convenience, as it’s supposedly easier and faster to restore those separately.


@BLUEFROG - that’s clear, thank you very much. Agreed re. efficiency with multiple drives, so glad that one drive won’t be an issue.

@chrillek - as you can see, Jim understood my question without issues, and was able to answer it clearly.

I also use Arq backup, with a single external drive backup plan and a single cloud backup plan.
There’s no “cross-interference”.

I don’t see the benefit of separate database backups

> I can see that there’s a mass of folders inside (with titles like ‘379674…
Don’t work with the raw data in your backups (or Devonthink databases)
They should only be used within the apps

He’s brighter.


Actually, I am using Arq in the most simple way: I back up all important folders — mainly my user folder, /Library, and a few others — to one vault on AWS S3.

I also make regular copies of the whole hard drive to an external SSD via Carbon Copy Cloner to have something should my Mac break.

Finally, I archive important data such as development files and DEVONthink databases manually to three places: S3, an NAS, and a USB stick that I keep in an out-of-building location. That’s less of a backup and more an ever-growing archive.


If it helps, here is what I do…

  1. I use Time Machine with its default settings, which backs up my Mac’s hard disk, and connect a USB drive once a week so Time Machine creates an archive on that. The USB drive stays in my house - ‘on site’
  2. I use Arq to back up specific folders, including the database folder (where all my DT databases are). This backs up to a OneDrive folder for now so it is ‘off site’. I am happy to live with the risk of using OneDrive rather than paying for an Arq remote
  3. Each week, I use the AppleScript that DT provides to create a ‘zip’ archive of all my databases in a Dropbox folder (it takes a few seconds to zip a 1GB database). This is a different ‘off site’ location so not everything is on OneDrive. I can live with this data being potentially a week old if I needed to restore from here.
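For readers who want to see the shape of step 3 without the bundled script, here is a minimal Python sketch of the same idea. It is not the AppleScript that ships with DEVONthink. It assumes the databases are `.dtBase2` packages sitting together in one folder and that they are closed while the archive is made; the function name, the date-stamped file names and the folder paths are all made up for this example.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_databases(src_dir, dest_dir, stamp=None):
    """Zip every *.dtBase2 package in src_dir into dest_dir.

    Assumptions: databases are closed, and each one is a .dtBase2 package
    (a folder) directly inside src_dir. Returns the archive paths created.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = stamp or date.today().isoformat()
    created = []
    for package in sorted(src.glob("*.dtBase2")):
        if not package.is_dir():
            continue  # a database package is a folder, skip anything else
        base_name = dest / f"{package.stem}-{stamp}"
        # make_archive appends ".zip" and stores paths relative to root_dir,
        # so the archive contains "<Name>.dtBase2/..." at its top level.
        archive = shutil.make_archive(
            str(base_name), "zip", root_dir=str(src), base_dir=package.name
        )
        created.append(Path(archive))
    return created
```

It could be run weekly with something like `archive_databases(Path.home() / "Databases", Path.home() / "Dropbox" / "DT-archives")`, where both paths are placeholders for wherever the databases and the synced folder actually live.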

The real questions for backing up are:

  • How vital is the data? (Surprisingly often not a lot - more a pain to recover from other sources. E.g. Bank statements can be got from the bank again if needed. Would it really be a disaster if you lost a load of old holiday photos?)
  • How quickly would you need the data if a disaster happened? (Choice of method to use)
  • Does it need to be up to the minute/hour/day/week recovered?

It is easy (on computers) to hoard lots of stuff without thinking ‘Do I really need this?’ I do an annual review to stop the pile becoming too large.


At least one of my banks keeps them online for a year only. After that, it’s on paper and for a hefty fee.

Thanks @eboehnisch and @saltlane

May as well share my own approach, in case anybody else passing by and reading finds it a useful contribution (this describes my setup as it was before my question earlier in this thread). I do a full-system backup to Time Machine and a RAID drive (both in home) and then export all individual DEVONthink databases twice to various drives, once in the form of the database itself, and once as exported files/folders. Depending on the nature and importance of the file content, I then send them up to Dropbox and/or Backblaze to ensure at least one is off site. For Backblaze, it’s mostly as @eboehnisch says - more of an ever-growing archive rather than a fixed backup.

A lot of this comes from losing a lot of photos of my late grandparents (to whom I was close) a few years ago because I made the classic mistake of keeping everything on one drive - it was tough, letting that mistake sink in. Since then, I’ve taken a ‘belt-and-braces’ approach: if my system is a little cumbersome, that’s still okay with me. It’s just a matter of finding ways of refining the backup process and making it quicker and more efficient, including automation. This is the side I’m not so good at.

I also try to vary storage drives between different manufacturers, as I figure that if drives from one manufacturer are in any way predisposed to certain errors, then drives from other manufacturers shouldn’t have that particular predisposition. I need to take care to research which company owns which drive brands, but that’s how I’ve been doing it.