Syncing with a big database


I’m evaluating DEVONthink as a replacement for Together on both macOS and iOS, but I have a question related to database size.

My current DEVONthink database sync export is located in iCloud Drive and is about 130 GB in size. It indexes a lot of external documents located in iCloud Drive, so I assume going from Mac to Mac (I currently manage 3 Macs) won’t be a problem with a good internet connection, and since the external files are in the same location, the different Macs will be able to open them. (This is the way Together works. Well, mostly: it seems to be a lot of data for that app, which crashes, does other strange things, and very frequently moves documents inside the library instead of keeping them as linked, i.e. indexed, files.)

However, 130 GB is a lot for an iPhone/iPad, even if they are 256GB/512GB respectively.

My question, before purchasing the iOS app, is whether it will sync or not, and whether it will be able to manage so many external documents when trying to open them, as they are stored outside the application but are available in iCloud Drive on iOS.

It should NOT be. You should never put your databases in any cloud-synced folder (including the Documents or Desktop folders if you are using the disk management feature in macOS 10.12 Sierra). Never. If you have, you must relocate them immediately or you could irreparably damage them. The safest location is a folder in your Home Directory, like ~/Databases

And no, Syncing to DTTG2 will not access the files in iCloud.

Jim, I could be wrong but I understood the post to mean that they are using the iCloud Drive sync store location to sync multiple Macs, and the databases have indexed documents from iCloud Drive. The databases, again as I would interpret the post, are not located on iCloud Drive. There shouldn’t be anything wrong with this setup, except, most importantly, that DEVONthink to Go does not support iCloud Drive as a sync location.

Sorry, I didn’t express myself well. What is in iCloud Drive is the export database, to be synced with other instances. The program database is in the default folder and is only touched by the program itself.

Having read more, I now know the iOS version does not sync from the export database (that is only for Mac to Mac), so now I have other questions.

  1. My local Mac database is about 130 GB in size. Will it be able to sync with iOS via network/WebDAV/<other_supported_way> without trouble? I mean, it is 130 GB!!!

  2. Is a “manual” sync the only way to get modifications/additions made on iOS back into the Mac program? What I would like to know, and at this point I’ve read both manuals (Mac and iOS) and still don’t understand it well, is the rationale behind the syncing process. As I understand the process now, you go out with your iThings and make modifications or additions on them. Then you arrive at the office, run the desktop program and manually sync those modifications?

Thanks in advance.

Yes, that was my intention. What I didn’t know is that iCloud Drive is not an iOS sync option.

PS, having to wait for messages to be approved is a bit of a pain. :unamused:

Moderating new users is unfortunately the only reliable way to avoid spam (no one wants to see prOn here).

You know, it doesn’t have to be 130GB (nor do I advocate having singular giant databases like this). And as you are suggesting a potential problem with Sync, I’ll repeat my point. It wouldn’t matter whether it’s our Sync or Dropbox, etc.: unless you are using a local Sync Store, this is a ton of data to be pushed over a network. Perhaps if you had two Macs on a gigabit or fibre channel network, both connected via Ethernet, but WiFi (or worse, the Internet)… you are right to be concerned. Will it finish? Eventually. Will it happen quickly? Likely not, unless everything just happens to work perfectly for hours.
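To put rough numbers on that, here is a minimal back-of-the-envelope sketch in Python. The link speeds are assumed round figures picked for illustration, not measurements, and the result is a best case: real Sync throughput will be lower because of protocol overhead and per-file handling.

```python
# Best-case time to move a given amount of data at the full link rate.
# All link speeds below are ASSUMED round numbers, not measurements.

def transfer_hours(size_gb: float, link_mbit_per_s: float) -> float:
    """Hours needed to move size_gb (decimal gigabytes) over the link."""
    bits = size_gb * 1e9 * 8
    seconds = bits / (link_mbit_per_s * 1e6)
    return seconds / 3600


ASSUMED_LINKS = {
    "Gigabit Ethernet (1000 Mbit/s)": 1000,
    "WiFi (assumed 100 Mbit/s effective)": 100,
    "Internet upload (assumed 20 Mbit/s)": 20,
}


def report(size_gb: float = 130) -> dict:
    """Best-case hours per assumed link, rounded to one decimal place."""
    return {
        name: round(transfer_hours(size_gb, rate), 1)
        for name, rate in ASSUMED_LINKS.items()
    }


if __name__ == "__main__":
    for name, hours in report().items():
        print(f"{name}: about {hours} h")
```

Even in the ideal case, the gap between a wired gigabit link and a typical upload connection is a factor of 50 under these assumptions, which is why a local Sync Store or smaller databases make such a difference.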

No, manual Syncing is not the general operation. Sync works on an interval and will Sync as you work (if the Sync location is available and operational).

I’m surprised it could theoretically do it. But as you say, it is an incredible amount of data. I will use the Pro Mac version, break things up into different databases and sync them selectively.

Got the “Aha” moment! :laughing: I think I understand now. I will give the iOS version a try as well.

Thank you very much!!

No problem. Here’s my view on database size…

Size in gigabytes isn’t the critical number. If you check out File > Database Properties > … for a given database, the numbers of words / unique words are more critical. On a modern machine with 8GB RAM, a comfortable limit is 40,000,000 words and 4,000,000 unique words in a database. (Note: This does not scale in a linear way, so a machine with 16GB wouldn’t necessarily have a comfortable limit of 80,000,000 words / 8,000,000 unique words.) So text content in a database is far more important.
If you have a database of images, it will have very few words but be large in gigabytes.
If you have a database of emails, it will have many words, but may be smaller in gigabytes.
The second one may perform more poorly as the number of words increases beyond the comfortable limit.
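As a small illustration of that rule of thumb, here is a hedged Python sketch that compares a database’s word counts with the comfortable limits quoted above for a machine with 8GB RAM. The function name and the sample counts are made up for the example; the real numbers come from Database Properties.

```python
# Compare word counts with the "comfortable" limits quoted above for a
# machine with 8GB RAM. load_report and the sample counts are hypothetical;
# read the real figures from Database Properties.

COMFORTABLE_WORDS = 40_000_000
COMFORTABLE_UNIQUE_WORDS = 4_000_000


def load_report(words: int, unique_words: int) -> dict:
    """Return each count as a fraction of its limit, plus an overall flag."""
    words_ratio = words / COMFORTABLE_WORDS
    unique_ratio = unique_words / COMFORTABLE_UNIQUE_WORDS
    return {
        "words_ratio": words_ratio,
        "unique_ratio": unique_ratio,
        "within_comfortable_limit": words_ratio <= 1 and unique_ratio <= 1,
    }


if __name__ == "__main__":
    # Made-up examples: an image-heavy database and an email-heavy one.
    print("images:", load_report(words=200_000, unique_words=30_000))
    print("emails:", load_report(words=55_000_000, unique_words=3_500_000))
```

Note that gigabytes never enter the check: a huge image database can pass easily while a much smaller email archive does not.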

Smaller, more focused databases will generally perform better, Sync faster, and be more data-safe in the event of a catastrophe (avoiding the “all your eggs in one basket” problem). They also give you the opportunity to close unused databases when you’re not using them. This frees up resources, not only for DEVONthink, but the rest of the system. There is no benefit to having a bunch of unused databases open all the time.

Hi all!

In the end I went with the full package: DEVONthink to Go with the in-app purchase, Pro Office + Agent!!

Because I like it a lot. It is the thing I’ve been looking for for years.

After some tests, my current setup is 6 different databases, one for each “interest”, synced via WebDAV on my Synology with a single store and two WebDAV accounts pointing to that same store: one internal to the intranet and one external for when I’m out of the office. This way I can sync fast at home and more slowly away from home, since at home the external sync account does not work and vice versa.

Why WebDAV instead of Bonjour when inside the network? Easy answer: this way I don’t need a “server” Mac that is more or less always on with the program running to act as the home sync, especially now that my main Mac has changed from my old iMac to a new 2017 MBP.

I think it is a win-win scenario, and the bulk of the storage sits on the Synology, which is well suited to that.

My next step is a painless but slow process: converting a zillion scanned PDFs to Scanned + Text using the OCR option in Pro Office.

Sounds like a good setup. Have fun with your OCR project! :wink:

Just curious, is there a way in DEVONthink to get a word count within a database?

Hi there,

just wanted to share my experience with large databases on iPhone/iPad:

Frankly, I’ve found that the “on demand” downloads are so fast, even for larger files, that there’s no need to download everything. It’s fabulous to be able to access any of my 500+ GB of files from anywhere with a 1-2 min download (that is for very large files; most are way quicker).

Select the database, then choose “File → Database Properties…”
