DEVONthink Database Size

A DEVONthink database that holds Imported or Indexed content requires more memory than Spotlight does, because it stores more information about groups, documents, and their metadata than Spotlight keeps. But that additional information is what makes the DEVONthink environment so much richer than the Finder environment. Ergo, I don’t want DEVONthink to do what Spotlight is doing. :slight_smile:

Artificial intelligence algorithms are at the kernel level of a DEVONthink database.

The most important measure of database size is not the database file storage size. The most important measure is the total number of words contained in open databases. The next most important measure is the number of items contained in open databases. For example, a database with a file size of 1 GB that holds plain text documents will likely be larger in total word count than a database with a file size of 200 GB that holds PDFs with a high percentage of images in their content.
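To make the word-count point concrete, here is a minimal sketch (the sample strings are invented for illustration): a page of plain text is nearly all indexable words, while an image-heavy scanned PDF of far greater storage size may contribute only a caption or two to the index.

```python
import re

def word_count(text: str) -> int:
    """Count the word tokens that a text index would have to hold."""
    return len(re.findall(r"\w+", text))

# A plain text document: nearly every byte is indexable words.
plain_text = "word " * 5000                       # roughly 5,000 words
# An image-heavy scanned PDF: large on disk, almost no extractable text.
scanned_pdf_text = "Figure 1. Annual report cover."

print(word_count(plain_text))        # 5000
print(word_count(scanned_pdf_text))  # 5
```

Storage size and index size are thus only loosely related: the scanned PDF could be a thousand times larger on disk yet add almost nothing to the total word count that matters for memory use.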

As DEVONthink is a 64-bit application, it can address a large memory space. Problems begin to emerge when there’s not enough free RAM left and the computer falls back on Virtual Memory to keep processes running. In Virtual Memory mode, data is moved back and forth between RAM and swap files on disk. Because disk is orders of magnitude slower than RAM, especially on a conventional hard drive, this paging produces slowdowns. In the worst case, Virtual Memory swap files grow large and, if they consume the remaining free disk space, can put data on the drive at risk.

Typically, a computer will have the largest amount of free RAM right after a restart. Apple’s memory management is good, but not perfect. “Inactive RAM” is data that is temporarily retained in RAM because it has been frequently called or is necessary for a given application, and inactive RAM is subject to being replaced when free RAM is needed by an application. However, over time “crud” inactive RAM accumulates and sticks, tending to reduce available free RAM.

After a restart, my MacBook Pro with 16 GB RAM has about 10 GB free RAM (exclusive of inactive RAM) with my current suite of 7 open DEVONthink Pro Office databases, DEVONagent Pro, Mail, ScanSnap Manager, Messages and several other apps open. At this moment, there’s a bit more than 5 GB free RAM available, so all is well. The computer is operating at full speed, with no pageouts. Day by day, as I continue work, the amount of free RAM will continue to diminish and, if I took no action, would eventually drop to the point that pageouts occur and Virtual Memory swap files start growing, at which point the spinning-ball slowdown indicator would appear.

I monitor free RAM. When it drops to about 1 GB, I take action. I can recover most of the free RAM by quitting and relaunching apps, or by restarting. As my MacBook Pro has a 500 GB SSD, restarts take less than a minute, so they are no longer to be feared. Remember, too, that any errors that have accumulated in the computer’s memory will be cleared by a restart.
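One way to monitor free RAM from the command line on macOS is `vm_stat`, which reports the page size and the number of free pages. Here is a small sketch that parses that output; the sample text is hardcoded for illustration (in practice you would capture the live output with `subprocess.run(["vm_stat"], capture_output=True, text=True).stdout`, and the figures below are invented, not from my machine).

```python
import re

# Invented sample of macOS `vm_stat` output, for illustration only.
SAMPLE = """\
Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free:                             1310720.
Pages active:                           2000000.
Pages inactive:                          500000.
Pageouts:                                     0.
"""

def free_ram_bytes(vm_stat_output: str) -> int:
    """Multiply the reported page size by the free-page count."""
    page_size = int(
        re.search(r"page size of (\d+) bytes", vm_stat_output).group(1)
    )
    pages_free = int(
        re.search(r"Pages free:\s+(\d+)\.", vm_stat_output).group(1)
    )
    return page_size * pages_free

print(free_ram_bytes(SAMPLE) / 2**30)  # 5.0 (GiB free in this sample)
```

Activity Monitor shows the same numbers graphically; the nonzero "Pageouts" counter is the sign that swap files have started growing.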

RAM is good. More (free) RAM is better!

I really don’t need a Mac Pro with the maximum currently possible 128 GB RAM! But I sometimes wonder just how large a DEVONthink database set could be run on such a beast.

So Bill, what’s a good upper number of words? I’m currently working with a 4.7 GB database with 3.5 million words on an iMac with 8 GB of RAM. It seems okay; I just wondered how big to let it grow.

3.5 million total words is a small database relative to memory requirements. You must have a lot of PDFs, which often hog storage space relative to the density of their text content.

The amount of free RAM on your computer depends, of course, on how many other databases and apps are open and on their memory requirements. As I noted, over time inactive RAM may accumulate that “sticks” and reduces available free RAM. Quitting open apps can temporarily increase available RAM, and restarting the computer will always free up such “sticky” inactive RAM.

You could probably grow the database up to tenfold before hitting persistent slowdowns caused by running out of free RAM.