Automatic renaming of dtSparse to sparseimage and unintended consequences

With DT 3.5.2, DEVONtech implemented a workaround for a problem which caused automatic unmounting of databases by TimeMachine. This workaround consists of renaming encrypted databases on opening - from .dtSparse to .sparseimage. This has unintended consequences, the least severe of which is that if DT terminates irregularly, it is necessary to rename the databases manually before they can be used by DT again. This is the least severe because DT is very stable and - in my case - occurrences which crash macOS are rare. However, anybody who has specifically selected databases for backup by a third-party program (such as Arq) may be in for a surprise: if - as I did - you select your closed database (let’s call it database.dtSparse) for backup, then the database will only be backed up while it is closed. If you selected it for backup whilst it was open (and therefore called database.sparseimage), it will only ever be backed up again while it is open - which could mean that whilst you thought you always had up-to-date backups, you might not have. There are workarounds for this problem (adding the whole folder containing the databases to the backup, or adding each database twice [that is, once as .dtSparse and once as .sparseimage]), but both of these solutions will take up twice as much backup space and may have other disadvantages. If the database is renamed automatically during a backup, an error message is the best of the possible outcomes.
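To make the selection problem concrete, here is a minimal sketch that takes a list of paths chosen for backup and reports which alternate database names are not covered. The two extensions come from the description above; the function names, the example paths, and the idea that a selected folder covers everything beneath it are assumptions for illustration, not how Arq or any other backup program actually stores its selection.

```python
from __future__ import annotations

from pathlib import PurePosixPath

# Extensions as described in the post: a closed database ends in .dtSparse,
# and DT renames it to .sparseimage while it is open.
CLOSED_EXT = ".dtSparse"
OPEN_EXT = ".sparseimage"


def alternate_name(path: PurePosixPath) -> PurePosixPath | None:
    """Return the other name the same database can carry, or None."""
    if path.suffix == CLOSED_EXT:
        return path.with_suffix(OPEN_EXT)
    if path.suffix == OPEN_EXT:
        return path.with_suffix(CLOSED_EXT)
    return None


def uncovered_names(selected: list[str]) -> list[str]:
    """List alternate database names that the selection does not include.

    `selected` is a hypothetical list of paths chosen for backup. A selected
    folder is assumed to cover everything beneath it.
    """
    chosen = [PurePosixPath(p) for p in selected]
    missing = []
    for path in chosen:
        other = alternate_name(path)
        if other is None:
            continue  # not an encrypted database under either name
        covered = any(other == c or c in other.parents for c in chosen)
        if not covered:
            missing.append(str(other))
    return missing
```

Passing a selection that contains only `/Users/me/Databases/database.dtSparse` (a made-up path) reports `/Users/me/Databases/database.sparseimage` as uncovered; adding the containing folder or the second name makes the list empty. The comparison is case-sensitive, which is a simplification.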

Whilst I trust that this approach was the only available solution open to DEVONtech, I am reluctant to condone software changing file names without user input - and would add my voice to any others asking you to walk this change back; I understand that a solution was required for a subsection of users, but to me this looks like putting other users (and/or their backups) at risk.

The change was made to combat a more egregious error: Catalina spontaneously unmounting encrypted databases when Time Machine was doing backups. Short of Apple addressing the issue (i.e., not going to happen), this has been the only way we have found to reliably resolve this issue.

I feel that the gravity of each error is open to personal interpretation - whilst at the same time I acknowledge your right to make a choice based on what is likely to be a good deal more information than is available to me. My current interpretation is that this decision is a case of “Verschlimmbesserung” - a wonderful German word meaning to enhance something and in doing so make it worse. The reason for this interpretation is that for the common user the change is not transparent - i.e. it has hidden consequences (I leave some of my databases open most of the time - had I not discovered this problem, I would have forgone weeks of backups). As such, an apparent problem for some has been turned into an occult problem for others; I cannot currently judge the likelihood of corruption or failed backups (e.g. caused by the name changing during backup), which to me is a very serious issue.

I would ask you to make the potential problem associated with automatic name changes more obvious to users - although, I admit, I’m not sure what the best way to do that is; a message when setting up encrypted databases could be an option, but would not help the current user base.

I would also welcome any input anybody has into what happens to the backups of my databases when they change their name whilst they are being backed up by TimeMachine, CCC and Arq.

Afaik, Apple uses snapshots. If you change a file inside a bundle during a backup, the file that gets backed up is the previous version, not the new one.

Related issue which I never truly have gotten a handle on…

If I run my Mac continuously (as I use the DT3 server) and use both Time Machine and Arq for backups, are my databases regularly backed up? Or do I need to periodically quit DT3 and reopen or verify the databases or in some other way force DT3 to update the database file?

If I run my Mac continuously (as I use the DT3 server) and use both Time Machine and Arq for backups, are my databases regularly backed up?

Yes, as long as the databases are in a location being backed up.

Or do I need to periodically quit DT3 and reopen

No, this shouldn’t be necessary

or verify the databases

You should be doing a routine Verify & Repair on your databases regardless.

or in some other way force DT3 to update the database file?

I don’t know what you’re referring to here.

Thanks

Yes - I do the Verify/Repair about weekly - but wanted to be sure it is backed up ongoing through the day when Arq and Time Machine do their backups

The last-modified date stamp on the databases does not change unless I shut down or Verify the database - you are saying that Arq/Time Machine are nonetheless getting the most recent changes even though the date stamp of the database file does not reflect ongoing changes?

Though I am not running Arq here, Time Machine takes a snapshot of the system at the beginning of the backup, so it should back up things in their current state. I can’t tell you if TM is only using modification dates as deltas since the database size is also changing. But as @cgrunenberg points out, it’s not a bad idea to close DEVONthink for backups.
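One way to see what changes on disk while a database stays open is to record the file's size and modification time at two points in the day and compare the readings. The sketch below does only that. It says nothing about how Time Machine or Arq decide what to copy, and the names `Fingerprint`, `fingerprint` and `describe_change` are made up for illustration.

```python
import os
from typing import NamedTuple


class Fingerprint(NamedTuple):
    size: int      # file size in bytes, from st_size
    mtime_ns: int  # last-modified time in nanoseconds, from st_mtime_ns


def fingerprint(path: str) -> Fingerprint:
    """Read size and modification time without opening the file."""
    st = os.stat(path)
    return Fingerprint(st.st_size, st.st_mtime_ns)


def describe_change(old: Fingerprint, new: Fingerprint) -> str:
    """Summarise which of the two attributes differ between two readings."""
    size = "changed" if old.size != new.size else "unchanged"
    mtime = "changed" if old.mtime_ns != new.mtime_ns else "unchanged"
    return f"size {size}, modification time {mtime}"
```

Taking one reading of the image file in the morning and another in the afternoon, then passing both to `describe_change()`, would show whether the date stamp really stays put while the size moves.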

ok thanks - probably best to close them just to be sure

Isn’t this an issue for any software that runs full-time on a Mac using Time Machine? Is it possible that Apple just doesn’t think of their machines as servers in the same context that Windows very meticulously addresses this sort of issue and thus it is an unrecognized flaw in the backup routine of many Mac users?

As I understood earlier, this issue only affects backups of encrypted databases that are written by TM to a network volume, not to local volumes.

Possible? I suppose so, however I don’t think it’s the case as many people use Time Machine without incident. I have never had an issue with it myself. The only issue I’ve ever had was with external drives finally getting old and dying - but that’s easily resolved by staying on top of hardware.

That is correct.

Would a TM backup of a sync store on a network drive be possible?

I’m not sure if I understand your phrasing.

You want to back up a local sync store to a networked drive?

Haha :slight_smile: The number of permutations for backup constructions is getting quite high, so I agree I should have explained it in more detail.

If TM has trouble backing up encrypted databases to a network volume, another workaround might be to back up a sync store of that encrypted database.

macOS — DT
|       |
TM      |
|       |
NAS (sync store)

If that works, a user might safely exclude the encrypted database from TM. But perhaps TM doesn’t back up data on mounted network drives.

TM will back up to a mounted network drive, e.g., a Time Capsule, NAS volume, or even an external via AirDisk (though that is slow).

Almost there :slight_smile:

I mean it the other way around. Backup with TM using data originating from a network drive (i.e. containing the sync store data). But I wouldn’t be surprised if that isn’t possible with TM.

(Unless of course you’d set up another macOS device as a Mac server (which I think few people will do), somehow enable WebDAV and have DT on another device sync to the WebDAV server running there).

Thanks for that - both TimeMachine and Arq work with snapshots as far as I know. That is about the extent of my knowledge, however. The Arq-logs would suggest to me that the snapshots are created immediately prior to performing the backup - is that correct? I remain unsure whether that mitigates the problem of name changes during backup - the question being whether the snapshot remains static (I guess) or whether files renamed before their snapshot is backed up end up actually having their snapshot removed.

As far as a path-based backup is concerned, the act of renaming a file is equivalent to deleting one file and producing another. Initial experiments with both CCC (with SafetyNet on) and Arq (with “keep deleted files in subsequent snapshots” on) suggest that both programs keep the appropriate snapshots; about TimeMachine I’m not yet so sure - I’ve come across different modification times which I cannot yet explain (e.g. the snapshot at 14:00 today shows database.dtSparse to have been last modified today at 11:00, but the snapshot at 15:00 shows the last modification today at 09:00). Any program which deletes backups (snapshots) of a file when that file is itself deleted would effectively not keep useful snapshots of databases if DT were regularly opened and closed. I don’t yet know whether this is a merely theoretical consideration (the programs with which I work seem to treat modification and deletion of a file in an identical fashion) or whether some backup systems work that way, or can be set to.
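The "delete one file, produce another" view can be sketched by comparing two listings of paths, for example the contents of two consecutive snapshots. This is an illustration under stated assumptions, not how CCC, Arq or TimeMachine work internally: the function name and the example paths are hypothetical, and files are matched purely by path.

```python
from __future__ import annotations

from pathlib import PurePosixPath

# The two names one database alternates between, per the posts above.
PARTNER_EXT = {".dtSparse": ".sparseimage", ".sparseimage": ".dtSparse"}


def diff_listings(before: set[str], after: set[str]) -> dict[str, list]:
    """Compare two path listings the way a purely path-based tool would."""
    deleted = sorted(before - after)  # paths that vanished
    created = sorted(after - before)  # paths that appeared
    created_set = set(created)

    # Pair a vanished database with an appeared one that differs only
    # in the extension swap; these are probably the same database.
    likely_renames = []
    for old in deleted:
        old_path = PurePosixPath(old)
        partner = PARTNER_EXT.get(old_path.suffix)
        if partner is None:
            continue
        new = str(old_path.with_suffix(partner))
        if new in created_set:
            likely_renames.append((old, new))

    return {"deleted": deleted, "created": created, "likely_renames": likely_renames}
```

A tool that only looks at `deleted` would treat the closed database as gone each time DT opens it; `likely_renames` is the extra pairing one would need to recognise that nothing was actually lost.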

I’m going to reiterate what I said earlier on: I entrust DT with really important data - and secure it by using 4 different backup modes in addition to syncing across several devices. I was quite happy that I had a safe and secure backup set, but this change actually has me worried. Whilst I think it is always advisable to test backups by restoring and checking files, none of my routines to date would have detected files being deleted due to being renamed. I’d find it hard to fault a user who lost data because they hadn’t realised the potential consequences of an effectively hidden repeat name change.

I’d like to point out that my worries may be unfounded; I just can’t tell yet. There are certainly theoretical considerations which worry me - I cannot yet evaluate whether or not these considerations (can) translate to practical problems. @BLUEFROG was this question (“can repeat name changes influence backup strategies?”) something which DT looked into (I ask this without even a trace of malice - I’m looking for reassurance here), and if so, what were the results?

Might I ask what is the purpose of using an encrypted database if it’s constantly or mostly decrypted? Isn’t that device using whole disk encryption?

And can’t CCC create a versioned backup of a sync store on a mounted network drive? Similar to my suggestion with TM, which probably won’t work (easily).

I can’t answer your second question; it’s an interesting proposition, which I need to think about.

As for your first question: I’m paranoid. Yes, the disk is encrypted too. On a more serious note (no, really, the idea of losing clients’ data scares me), when I use Arq to back up an encrypted database the data never leaves my device in an unencrypted form (I assume this is a function of the snapshot made immediately before the backup - the snapshot is locked, i.e. encrypted). That means that whilst I use encryption in Arq too, I don’t have to trust its AES implementation blindly; Arq never actually “sees” an unencrypted file. The same goes for backing up to Blu-ray.

(I’m not qualified to evaluate whether or not encryption has been properly implemented - which is why I love DT using a non-proprietary implementation; it seems reasonable to assume Apple’s implementation of encrypted sparseimages is safe, or at least industry standard, which is what I would be measured against if I lost data).