DT, Indexing, Dropbox, Symlinks

parezcoydigo · June 23, 2010, 2:32pm

I’m watching USA-Algeria, and doing a little work to keep my blood pressure down while the game is on.

I know there have been a bunch of threads on using Dropbox, and the very real pitfalls of keeping databases in its folder. I corrupted a database for one of my courses (luckily after the semester I was teaching it) by accidentally opening it on a second machine. I won’t make that mistake again.

My thinking now is that I could keep my research in the file system (finder), create a symlink between that folder and a folder in Dropbox, and then index the folder with DTPO. This way, I would have individual indexed DT databases on all of my computers (currently 4 - work laptop and imac, home laptop and imac).
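For anyone who wants to script the linking step, here is a minimal sketch rather than a tested recipe. The function name and the example paths are hypothetical, and it assumes the Dropbox client follows a symlink placed inside its folder and syncs the target's contents, which is worth checking against the client version in use.

```python
import os
from pathlib import Path


def link_into_dropbox(research_dir, dropbox_dir):
    """Create a symlink inside dropbox_dir that points at research_dir.

    Returns the path of the link. An existing, correct link is left alone.
    """
    source = Path(research_dir).expanduser().resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"research folder not found: {source}")

    link = Path(dropbox_dir).expanduser() / source.name
    if link.is_symlink():
        # A link is already there: accept it only if it points at the same folder.
        if link.resolve() == source:
            return link
        raise FileExistsError(f"{link} already links elsewhere")
    if link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")

    os.symlink(source, link, target_is_directory=True)
    return link


# Hypothetical usage; adjust both paths to the real folders:
# link_into_dropbox("~/Research/Dissertation", "~/Dropbox")
```

The originals stay where they are in the file system; only the link lives in the Dropbox folder, and DTPO would index the research folder itself.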

Are there any pitfalls using this method that could corrupt the data? Are there any problems I’m not foreseeing? Anyone using Dropbox in this manner?

Prion · June 24, 2010, 9:28am

Not sure what you mean. You can certainly put your research material in a Dropbox folder and index that on all the machines you are working with. However, any work done inside DTPO itself would not be synched through Dropbox, so your databases would gradually drift apart over time. Your research material itself would stay in sync, though, so depending on how much work you are planning to do in the DEVONthink environment it may still be a viable alternative.

For this you would not need a symlink. On the other hand, using a symlink will not synch changes you introduce in one of your Devonthink databases to the other three.
I probably misunderstood you anyway.
Prion

Prion · June 24, 2010, 9:49am

With so many computers to work on, have you considered putting your mission critical data on a portable hard drive?
There are two ways to do that:

1. Copy all your data, including the DTPO databases, to the external hard drive and sync between it and the computer before and after work.
   I use Chronosync (not free, but very generous update policy) to do that. You can instruct it to “dissect packages” to only sync the files that actually changed inside your database rather than the entire database. This way, syncing around 200 gigabytes of data (my home directory in fact) before and after work takes between 3 and 5 minutes, the time I need to fetch a coffee or get dressed for the bike ride home.
2. You can move your entire home folder to the external drive. This way you will be working on exactly the same data everywhere, and the time needed before and after work is reduced to the time it takes to log in to your user account, which uses the home directory on the portable drive.
   I have worked with this for more than a year and only switched back to solution 1 because my MBP got a solid state drive and the techie in me cannot let this speed demon do nothing.
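To illustrate why syncing inside the package is so much faster than copying the whole database, here is a rough sketch of the idea. It is not how Chronosync is implemented: the function name is made up, it only copies one way, it treats a file as changed when the size differs or the source is newer, and it does not handle deletions.

```python
import shutil
from pathlib import Path


def sync_package(src_pkg, dst_pkg):
    """One-way copy of files inside src_pkg that are new or changed.

    Returns the relative paths that were copied.
    """
    src_pkg, dst_pkg = Path(src_pkg), Path(dst_pkg)
    copied = []
    for src in sorted(src_pkg.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_pkg)
        dst = dst_pkg / rel
        if dst.exists():
            s, d = src.stat(), dst.stat()
            if s.st_size == d.st_size and s.st_mtime <= d.st_mtime:
                continue  # same size and not newer: treat as unchanged
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also carries over the modification time
        copied.append(rel.as_posix())
    return copied


# Hypothetical usage with made-up paths:
# sync_package("~/Databases/Research.dtBase2", "/Volumes/Portable/Research.dtBase2")
```

On a second run with nothing changed the returned list is empty, which is the whole point: only the handful of files touched since the last sync get copied.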

I know none of this has the sex appeal of your data in the cloud but personally I think the cloud is not ready for research work.

Prion

parezcoydigo · June 24, 2010, 6:32pm

Prion–

Thanks for your reply. I won’t ever use an external drive for more than back up- or one that is permanently attached to a desktop for extra storage. I just don’t like that work flow. It’s an extra thing to keep up with when traveling (a weekly occurrence). I’ve also had external drives fail.

The issue is, there is no easy way (yet) to sync databases across computers. I love Dropbox, but am fully aware of the dangers of potential data loss with cloud-based services. Keeping a database there has the added problem of potential corruption if it is inadvertently opened on two machines at once. The current approach does two things: it keeps the originals on my local HD, and it also allows for cloud syncing. I have one machine that I use primarily. Any changes to documents that I make inside DT (assuming the original document was saved to my project folder) change the original file. And in fact, I make almost no changes inside DT anymore because of the ability to use external editors now. One other note-- I don’t like tagging files. I’m not a keyword tagger, so it doesn’t matter to me if the tags don’t mirror between the two databases.

If I need to access my database from my other machines, I can without fear of corrupting the original by having two instances open at once.

In the end, it seems to me that this approach (or the approach of keeping one’s project folder in dropbox and indexing that) overcomes the limitations of both indexing and importing.

macula · June 24, 2010, 8:26pm

All of which reminds us how essential it is gradually becoming to endow DT with built-in syncing capabilities. I realize this is a tall order in a database system of this kind, but I am sure these developers can live up to expectations if they decide to take on the challenge. Dropbox has a public API now, so in principle it should be possible.

I wonder, is this on the developers’ roadmap at all?

The feature would be ever more relevant now that DevonThink is about to transcend platform barriers (Mac, iPhone, iPad) and users will be accessing their databases anywhere and anytime.

sluman · July 17, 2010, 6:34pm

I’m wondering if anyone has used Devonsync and had success with that? I’ve tried, but have encountered some problems (I’m currently emailing with the developer to figure that out). The program claims to be able to sync either through your home LAN between computers or through iDisk.

Also, what about the command line program, rsync? Anyone tried that? I saw a recommendation from the DT developers on this website that this is a syncing solution? I’m a little nervous though about losing metadata in that syncing process.
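For reference, a cautious way to try rsync is a dry run first. The sketch below only assembles the command and runs nothing; the helper name and the paths are hypothetical. `--archive`, `--itemize-changes` and `--dry-run` are standard rsync options, but the flag for carrying extended attributes differs between rsync builds, so it is left to the caller via `extra_flags` after checking the local man page.

```python
def build_rsync_command(source, destination, dry_run=True, extra_flags=()):
    """Assemble an rsync argument list that mirrors source into destination.

    The trailing slash on the source makes rsync copy the folder's contents
    rather than the folder itself.
    """
    cmd = ["rsync", "--archive", "--itemize-changes"]
    if dry_run:
        cmd.append("--dry-run")  # report what would change, copy nothing
    cmd.extend(extra_flags)  # e.g. the extended-attributes flag of the local build
    cmd.append(source.rstrip("/") + "/")
    cmd.append(destination)
    return cmd


# Hypothetical usage with made-up paths:
# cmd = build_rsync_command("/Users/me/Research.dtBase2", "/Volumes/Backup/Research.dtBase2")
# subprocess.run(cmd, check=True)  # needs "import subprocess"
```

Reading the itemized dry-run output before a real run is one way to see whether anything unexpected would be touched.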

I’m traveling more these days, using a mac mini at home and then an older powerbook on the road. I am using DT Pro (not office) and it’s really getting annoying trying to keep the latest changes straight between computers.

If anyone has suggestions on Devonsync or can tell me about their experiences using rsync, I’d appreciate it. Also, someone on this thread recommended Chronosync, I’m checking it out now, but how does that stack up compared to the two previous ones I mentioned?

Prion · July 21, 2010, 10:12am

I have used unison but returned to Chronosync. It easily handles databases of 10+ GB but remember to select “dissect packages” when setting up. While the very first sync may well take longer, it typically runs 2 minutes or so for synching around 23 GB in Devonthink in total.
unison ran far longer and proved less flexible in handling sync conflicts on a case by case basis.

HTH
Prion

sjk · July 21, 2010, 8:34pm

Hard not to first think of Panic’s Usenet newsreader when I notice “Unison” mentioned in OS X related discussions.

michaelnau · July 22, 2010, 8:30pm

@sluman I haven’t used the current build of Devonsync, but the developer just put some more work in, so it is worth a second try since DT 2 (try him at @gajahduduk on twitter).
I can also suggest Chronosync: it is reliable and robust and can sync two machines over a network (adjust your sharing settings) as well as workstation to external drive. From experience, Chronosync and multiple externals is the better option next to cloud servers, and with scheduled syncs there really is no drag on the workflow.

historydoll · July 28, 2010, 8:33pm

Is it possible to sync using SuperDuper?

sjk · July 29, 2010, 1:50am

SuperDuper! doesn’t support two-way synching of data modified in multiple locations; it’s uni-directional.

parezcoydigo · July 29, 2010, 4:34pm

I’ve tried every one of the synching methods discussed above, and found all of them to have shortcomings for me personally.

After a month or so of using the symlink/dropbox/index method, I find I really like it. This has been accompanied by a few other changes in my workflow that are likely specific to the goals of my research. I still like DT as an amalgamator of my project information, largely because I find its search, see also, and concordance features really useful. But, with the switch to DT2.0, I found myself increasingly moving towards always using external applications for editing and document generation, which in 1.x I did in DT. It’s only a short step from external editing to indexing the external file structure. I mean, I was very happy to see DT2.0 move towards external storage of the files, but if something corrupts a database it is a serious pain to rescue the externally stored documents, which exist in a file structure that makes sense to the application, not to the human user. In the one case I’ve had of corruption, it was my own fault-- but the backups I maintained internal to DT got corrupted as well, and the only way out was to dig into the folders.

Indexing removes that risk. Indexing a symlinked folder in Dropbox also gives me the synching I need. My computers exist in two different states, and thus cause problems for the chronosynch-type method. And, as I’ve said before, I’m not a fan of external drives for anything other than SuperDuper carbon copies.

Finally, the index/symlink/dropbox method also works with other aspects important to my current project-- using Oxygen to write validated TEI documents, TextMate for all my other note and transcription files (using the multimarkdown bundle), and their project management tools. And also, using svn for version control. In fact, I’m increasingly moving towards either git or svn as a repository, and indexing the checked-out working copy on each machine.

At this stage in the project, the search/concordance/see-also functions aren’t all that necessary (though I do use them to find names of individuals who show up in various places in my documents).

At any rate, soon I’m going to write up a description of the workflow as it currently exists as an update to my DT/Scrivener/Mellel series. Those posts still drive a ton of the traffic on my blog.

Xenophon · September 14, 2010, 9:44pm

I’ve been using Devonsync to keep my laptop and desktop in sync. It’s been working fine for me. My experience is, of course, limited to my own setup so here’s a bit about what I do (and don’t do):
- I sync only over my local LAN (and hence only when I’m at home).
- I sync only one-way at a time, never bi-directional. I don’t actually know whether this matters, it’s just an artifact of how I do things.
- I always do a compare first, and look over the list of changes to see whether it appears sane.
- I always empty the DevonThink trash on both machines before synching. Once again, I can’t say whether it matters, just that it works for me.
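The compare-first habit can be approximated for plain folders (an indexed folder, say) with a few lines of Python. This is only a sketch using the standard library's `filecmp` module, not a description of what Devonsync does internally, and the function name is made up.

```python
import filecmp


def list_differences(left, right):
    """Return (only_left, only_right, differing) as sorted relative paths."""
    only_left, only_right, differing = [], [], []

    def walk(cmp, prefix):
        only_left.extend(prefix + name for name in cmp.left_only)
        only_right.extend(prefix + name for name in cmp.right_only)
        differing.extend(prefix + name for name in cmp.diff_files)
        # Recurse into folders that exist on both sides.
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix + name + "/")

    walk(filecmp.dircmp(left, right), "")
    return sorted(only_left), sorted(only_right), sorted(differing)


# Hypothetical usage with made-up paths:
# only_here, only_there, changed = list_differences("/Users/me/Research", "/Volumes/Laptop/Research")
```

Looking over the three lists before syncing in either direction gives the same kind of sanity check as the compare step described above.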

One small gotcha with Devonsync is that it syncs between the selected database on each side. That means that you’ll be thoroughly messed up if you’ve selected Database A on one machine and Database B on the other! This is clearly documented in the instructions, but it still took me by surprise the first time I got it wrong!

radii0 · October 5, 2010, 12:10am

Have you found a way to synchronize changes done inside DEVONthink (creating a file, editing a file, etc.) with the indexed folder structure?

Thanks