Correct rsync syntax for (unidirectional) database syncing

After a few recent scares with DEVONthink sync, I decided to turn to rsync for transferring my database between my laptop and desktop. (Also, as Christian recently confirmed, DEVONthink sync does not currently synchronize the order of items within a group, which for me is significant.)

rsync is more than adequate for my workflow, given my longstanding habit of closing the database at the end of a work session, and never updating the database on both machines between syncs. All I need, in short, is a speedy way to update the old version of the database on the target machine, by transferring only its diff from the updated version on the source machine. So rsync seems to me the obvious tool for the job.

I understand that metadata/xattrs/resource forks and possibly permissions are crucial for the integrity of the process, so I decided to show you the parameters that, based on a reading of the man pages, seem to me appropriate. If anyone has tried this experiment before, I’d be thankful to know if this looks correct.

It’s quite possible that some of these parameters are unnecessary, but I thought I’d play it safe.

rsync --verbose --recursive --links --perms --executability --specials --times --dry-run --progress --extended-attributes [source] [destination]

Thanks in advance.

I really think this will prove to be a very inefficient thing to do. (Plus, a mistake with rsync can destroy data. I once deleted three years’ worth of files because of a misplaced forward slash.)

Inefficient in what way, other than being prone to human error (a point well taken otherwise)?

rsync used to be an “official” recommendation:

That’s a pretty old post.

rsync doesn’t magically know what to sync or not. It has to determine what has changed. This means it has to go poking around asking, “Is this changed? Is this changed?”

I am doing an rsync -aE on a relatively small database. Of course, the first rsync is slow, but even just moving groups in my sort order (the case you mentioned) leads to another long rsync process (easily 10+ minutes).
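
To see how much of that "poking around" a given run involves, a dry run that lists what rsync considers changed can help. This is only a sketch: the function name `preview_changes` and the `RSYNC` override variable are made up for illustration, while `--dry-run`, `--itemize-changes` and `--stats` are ordinary rsync options.

```shell
# Hypothetical helper: report what rsync WOULD transfer, without copying anything.
# RSYNC can be set to point at a different rsync binary (assumption, not from the thread).
preview_changes() {  # usage: preview_changes SOURCE_DIR DEST_DIR
    # "${1%/}/" drops one trailing slash if present and adds exactly one back,
    # so the contents of SOURCE_DIR are compared with the contents of DEST_DIR.
    "${RSYNC:-rsync}" -a --dry-run --itemize-changes --stats "${1%/}/" "${2%/}/"
}
```

The `--stats` summary at the end shows the total number of files examined next to the number that would actually be transferred, which gives a rough feel for the scanning overhead on a particular database.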

True, that was a dirty trick, pulling out such an old post. :smiley:

10+ minutes for such a tiny diff is exorbitant! I just tried a local sync (on a single machine) using your syntax and with the same sort of diff—a single item nudged to another position in the same group. It took less than 90 seconds for a 2.3GB database.

Assuming that I’d script or alias the full command, including the -aE arguments and database filenames, thus avoiding catastrophic typos such as the one you mentioned, so far I am not convinced that this is inefficient. :question:
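
A wrapper along those lines might look like the sketch below. The database paths, the host name `desktop.local`, and the names `with_slash`, `sync_db` and `RSYNC` are all hypothetical. The `-aE` flags are the ones used in this thread: on Apple's bundled rsync `-E` copies extended attributes, whereas on upstream rsync 3.x that job belongs to `-X`, so check `man rsync` for your version.

```shell
#!/bin/sh
# Hypothetical locations; replace them with your own database paths.
SRC_DB="${SRC_DB:-$HOME/Databases/Research.dtBase2}"
DEST_DB="${DEST_DB:-desktop.local:Databases/Research.dtBase2}"

# Drop one trailing slash if present, then add exactly one back.
# A source path WITHOUT a trailing slash makes rsync copy the directory
# itself into the destination, which would nest the database inside itself.
with_slash() {
    printf '%s/\n' "${1%/}"
}

# Sync the contents of SRC_DB into DEST_DB.
# Extra rsync options (for example --dry-run) can be passed as arguments.
sync_db() {
    if [ ! -d "$SRC_DB" ]; then
        echo "source database not found: $SRC_DB" >&2
        return 1
    fi
    "${RSYNC:-rsync}" -aE --progress "$@" \
        "$(with_slash "$SRC_DB")" "$(with_slash "$DEST_DB")"
}
```

Running `sync_db --dry-run` first and only then `sync_db` keeps the paths out of the daily typing altogether, which is the point of scripting it.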

PS: But to be sure, I need to try via the LAN. :confused:

Syncing locally on a single machine is (1) going to be faster, (2) kinda not the point of rsyncing.
Network and connected drives will definitely perform worse. (My testing was to a USB3 drive - directly connected, so no USB2 drop from a hub.)

You’re right, Jim, it’s unbearably slow. Not a viable option. Thanks.

No problem. I love me some rsync :mrgreen: but for this purpose it just isn’t really effective.

No time at the moment for implementing this, but I imagine a scriptable scheme in which both machines would have two copies of the database, the “original” on which one starts working and the “current” one (which departs from the “original”).

The idea is to circumvent the part of rsyncing with the greatest overhead, i.e. diffing between different hosts. To do this, rsync can be used locally to produce a patch between the “original” (previous) and “current” (modified) versions. Only that patch would then be copied (via scp, not rsync) to the other host, where it would be applied (again using rsync locally) to that host’s “original” to produce its own local “modified” version. And so on. I am attaching a diagram to make this clear.

The rsync arguments for producing and applying patches are, respectively, --only-write-batch and --read-batch.
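
As a rough, untested sketch of how those two options could fit together: the function names, the `RSYNC` override variable, and the paths and host name in the comments are hypothetical, and `-aE` is simply carried over from earlier in the thread.

```shell
# On the machine where the work was done: describe the changes needed to turn
# "original" into "current" and write them to PATCH_FILE.
# --only-write-batch leaves the "original" directory itself untouched.
make_patch() {  # usage: make_patch CURRENT_DIR ORIGINAL_DIR PATCH_FILE
    "${RSYNC:-rsync}" -aE --only-write-batch="$3" "${1%/}/" "${2%/}/"
}

# On the other machine (after copying PATCH_FILE over): replay the recorded
# changes onto that machine's own "original".
apply_patch() {  # usage: apply_patch PATCH_FILE ORIGINAL_DIR
    "${RSYNC:-rsync}" -aE --read-batch="$1" "${2%/}/"
}

# Example round trip (hypothetical paths and host name):
#   make_patch  ~/db/current ~/db/original /tmp/db.patch
#   scp /tmp/db.patch desktop.local:/tmp/db.patch
#   apply_patch /tmp/db.patch ~/db/original     # run on desktop.local
```

One caveat: a batch only applies cleanly to a tree identical to the one it was generated against. The two “original” copies therefore have to start out the same, and the source machine's “original” would also need to be brought up to date (for instance by applying the same batch locally) before the next round.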

[Updated with image scaled down to fit.]
File 01-10-15 12 16 19-1.jpeg