How big is TOO big?

I use DT Pro with 9 databases. My main database is 6.5 GB and holds more than 77,000 documents. Transferring something from the Inbox to it takes minutes every time. (The other databases are more specialized and considerably smaller, so the transfer goes quicker, but it is still far from quick!) Within a database, however, moving items to other folders is smooth and fast.

I have an iMac from 2007 with 3 GB RAM, running OS X 10.6.8.

Have I touched capacity limits here? What should I do in order to speed up things?

Thanks for any hints.

In terms of performance on your Mac, the most important measure of database size is the total number of words among your open databases (see File > Database Properties), and the most important resource databases use is available free RAM. When all free RAM has been exhausted, your computer begins using Virtual Memory to allow procedures to continue, and that involves heavy swapping of data back and forth between RAM and Virtual Memory swap files on disk. Because read/write access on your disk is orders of magnitude slower than in RAM, the computer slows down.
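If you want a rough look at free RAM from the command line, here is a minimal sketch. It assumes Python 3 is installed and that the macOS `vm_stat` tool prints its usual "page size of N bytes" header and "Pages free:" line. The function names and the sample text are made up for illustration, and "Pages free" is only a rough proxy for what is really available.

```python
import re
import subprocess


def free_bytes_from_vm_stat(text):
    """Return the free RAM in bytes reported in vm_stat output text."""
    page_size = re.search(r"page size of (\d+) bytes", text)
    pages_free = re.search(r"Pages free:\s+(\d+)\.", text)
    if not page_size or not pages_free:
        raise ValueError("unrecognized vm_stat output")
    return int(page_size.group(1)) * int(pages_free.group(1))


def current_free_bytes():
    """Run vm_stat (macOS only) and return free RAM in bytes."""
    result = subprocess.run(
        ["vm_stat"], capture_output=True, text=True, check=True
    )
    return free_bytes_from_vm_stat(result.stdout)


# Illustrative sample in the shape vm_stat prints; not real measurements.
SAMPLE = """Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free:                          25600.
Pages active:                       300000.
"""
```

Calling `free_bytes_from_vm_stat(SAMPLE)` multiplies the page count by the page size; on a real Mac you would call `current_free_bytes()` before and after opening a large database to see how much it takes.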

When you open a database, the index information and metadata are loaded into memory. The document files themselves are only loaded when displayed. Your main database when opened would require more physical free RAM than is available on your computer, so you will see slowdowns. In fact, it’s larger than any database I’m running on my MacBook Pro Retina with 16 GB RAM.

I assume you are running in 64-bit mode. If you can live with the performance of your main database on your computer, fine. If not, the alternatives would be to move to a computer with more RAM, or to split the database into topical segments, each of which can be opened or closed like information Lego blocks.

My own main database that I use for most research and writing contains about 40,000,000 total words, which is comparable in size to the Encyclopedia Britannica. I normally have several other databases open, each of which meets a need or interest, and together they bring the aggregate of total words to about 50,000,000. That’s a lot of information! But DEVONthink Pro Office runs quickly on my laptop.

Quick question – why would having a lot of PDFs in a database impact performance if they’re not previewed or displayed?

The files themselves, PDF or other filetype, are not loaded into memory unless displayed.

But what is loaded into memory is metadata about those files, including text index information, the Path to each file (whether Indexed or Imported), group location, tags, etc.

So more memory resources are required for a database that holds many documents, especially those with many total words.

The PDF file format is inefficient in data storage density for text content, compared to plain text files, rich text files or HTML files. Especially in the case of PDFs of scanned documents where an image of the original copy is retained, the file size of a PDF will be orders of magnitude larger than a plain text file that contains the same text information. To illustrate this, select a PDF in your database and choose Data > Convert > to plain Text. Compare the file size of the two filetypes, both of which contain the same text information.

For that reason, a database containing only plain text files and with a file storage size of, say, 50 MB may be larger in database “size” (memory resources needed) than a database comprised of PDFs with a storage size of 1 GB. That’s why we emphasize total number of words and number of documents as the most important measures of “size” for DEVONthink’s usage of computer resources.
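To put a number on that comparison, here is a small sketch that computes storage bytes per word. The helper names are hypothetical, and the word counter just splits on whitespace, which will not match DEVONthink's own count exactly. Feed it the file size and the word count shown for a document or database.

```python
def count_words(text):
    """Count whitespace-separated words in a plain text string."""
    return len(text.split())


def bytes_per_word(file_size_bytes, word_count):
    """Return how many bytes of storage each word costs on average."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return file_size_bytes / word_count


def denser_in_words(size_a, words_a, size_b, words_b):
    """Return 'a' or 'b' for whichever stores words in fewer bytes each."""
    if bytes_per_word(size_a, words_a) <= bytes_per_word(size_b, words_b):
        return "a"
    return "b"
```

A plain text file and a scanned PDF of the same document have the same word count but very different file sizes, so the plain text version always comes out with far fewer bytes per word.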

Of course, if one is using Sync via the cloud, the storage space of the database’s files does become important, as uploading/downloading the 1 GB database in the example above will take longer than the 50 MB database.

Thanks for the clarification.

Is the solution to this problem just throwing more resources at it (e.g. more RAM)? In that case it’s not an issue, as I’m aiming to get 8 GB/16 GB in the near future.

There’s an old saying: RAM is good; more RAM is better.

Your main database with about 77,000 documents might strain 8 GB, and perhaps even 16 GB, unless you monitor free RAM carefully. It’s always possible to enlarge databases beyond the ability of your Mac to run them at full speed.

In which case, database design (optimizing the sizes of databases to run at full speed within the available resources) should be considered a practical measure.
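One way to think about splitting into topical databases is as a packing problem. The sketch below is only an illustration: the topic names, the word counts, and the 40,000,000-word budget are all made-up assumptions, not DEVONthink limits. It uses a simple first-fit-decreasing heuristic, which gives a reasonable grouping but not necessarily the best one.

```python
def plan_databases(topic_words, max_words_per_db):
    """Group topics into databases so each stays within a word budget.

    First-fit decreasing: place the largest topics first, each into the
    first database that still has room. A single topic larger than the
    budget gets a database of its own (and exceeds the budget).
    """
    databases = []
    ordered = sorted(topic_words.items(), key=lambda kv: kv[1], reverse=True)
    for name, words in ordered:
        for db in databases:
            if db["words"] + words <= max_words_per_db:
                db["topics"].append(name)
                db["words"] += words
                break
        else:
            # No existing database has room, so start a new one.
            databases.append({"topics": [name], "words": words})
    return databases


# Hypothetical topics and word counts, for illustration only.
topics = {
    "research": 30_000_000,
    "archive": 25_000_000,
    "clippings": 12_000_000,
    "manuals": 10_000_000,
}
plan = plan_databases(topics, max_words_per_db=40_000_000)
for number, db in enumerate(plan, start=1):
    print(number, db["topics"], db["words"])
```

With these invented numbers the plan comes out as two databases, so you could keep open only the one you need at the moment.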

Thanks, Bill.

I looked it up: The Big One has 77 million words (1.3 million different words).

I’ve put all RAM in that my iMac allows, and I don’t want to buy a new one, because I’m happy with the thing and don’t fancy the newer OS Xs (Lion and beyond) too much. So, I guess I will start a new standard catch-it-all database and move in from the old one the things that might become useful.

BTW, I’m still running DT Pro 2.4.2 - will updating increase the memory problem? Soften it? Or make no difference?

The quick answer to that question is that updates to DEVONthink don’t really address memory resource issues; a database that’s too large to leave some available free RAM will have performance problems.

There’s a longer answer, noting that there can be reasons NOT to upgrade to the most recent version of DEVONthink.

There have been a number of free maintenance updates since 2.4.2. The current release is 2.7.4. And of course Apple has released three upgrades of OS X since the version you are using, and we have to issue updates of the DEVONthink apps to keep them current with new releases of OS X, as well as to correct bugs and introduce new features from time to time.

At some point in updates/upgrades of DEVONthink, in order to keep up with the most recent versions of OS X, the application may require a version of OS X that is newer than the one you are running, and/or a CPU that is newer than the one in an older Mac. At that point, stay with the most recent version of DEVONthink that runs properly under your version of OS X and on your Mac hardware.

In that case, the user should stop installing new updates to the DEVONthink app, either because they no longer work properly under the older version of OS X, or because an older CPU cannot take advantage of DEVONthink in 64-bit mode, or because the updates will not run on a non-Intel CPU. That’s why we continue to offer “legacy” versions of the app that can run on older hardware and earlier versions of OS X.

Thank you!