VERY slow database opening and unresponsive db in DT2.0

I’m having a problem with a VERY slow-opening DEVONthink 2.0 database: 1.6 GB, containing about 120,000 items, on a 2.16 GHz MacBook running OS X 10.5.6 with 1 GB of RAM. It can easily take five minutes to open and become fully usable. This was already a problem when the database was in DEVONthink 1.0, but I’ve verified, repaired, rebuilt, and optimized it, then exported from DEVONthink 1 and imported into DEVONthink 2, and ran all the same maintenance on the new database. Still very slow.

One odd thing that happens: The item count beside several high-level folders continues to increase for several minutes after opening. At first I thought this was RSS feeds gathering new items, but now I don’t think so. Once the database has been open for a while it does become somewhat more responsive, but not very.

What’s the problem here? Limitations of DT2 or database size? Hardware/RAM limitations? Something else? Thanks for any clues. –John

Update: Running Verify & Repair recently, DT2 did find some errors (approx. 24). The repair process seemed to drop out after fixing only one or two files per attempt, so I had to run Verify & Repair several times to get down to 8 errors. Now I have 8 errors in the database that can’t be repaired, and I have no idea where to find them or whether they are (partly) the cause of the slowdown. It still takes a LONG time to open, but DT does seem a little more responsive when navigating the database (or maybe I’m just hoping).

The amount of free physical RAM is the ultimate limiting factor in performance as databases grow large. Procedures such as search, See Also and Classify involve memory-intensive operations. These days, 1 GB of RAM is “only” 1 GB of RAM. Some current Mac laptops can hold up to 8 GB RAM. (Gee, I remember spending hundreds of dollars to upgrade my Mac Portable to 4 MB RAM, in preparation for an Egyptian project.)

The reason memory-intensive operations slow down when they run out of available RAM is that Virtual Memory comes into play. VM swaps data back and forth between RAM and disk. Of course, disk reads and writes are very much slower than reads and writes in RAM, so there can be perceptible pauses.
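
On Mac OS X you can watch this paging activity directly with the `vm_stat` command in Terminal. A quick sketch (the exact labels in the output can vary between OS versions):

```shell
# Print the cumulative pagein/pageout counters. If the "Pageouts" figure
# climbs rapidly while DEVONthink is busy, the system is short on free
# RAM and is swapping to disk.
vm_stat | grep -E '^Page(ins|outs)'
```

Run it once, work in DEVONthink for a minute or two, then run it again and compare the Pageouts figure.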

The same database created in DT 1 should be much more responsive in DT 2, because of differences in the database structure, which reduce memory requirements.

But I suspect much of the problem you have relates to database errors. If non-correctable errors are found by the Verify & Repair procedure, the database needs maintenance, as matters will only get worse otherwise.

Before working with a damaged database, I often recommend making a compressed copy of it, as a resource that can come in handy if repair attempts really mess up the original. Caution: Always quit DEVONthink before making a Finder copy of a database, as the copy may otherwise be incomplete or contain errors.

Now that you have a zipped copy of the database, launch DT 2 and choose Tools > Rebuild Database. The idea behind this is that the database will first export all its contents (groups and documents), then bring them back into the database. Any files that may have errors will likely fail the export/import procedure, and they will be listed in the Log. Note: If there is a list of ‘failed’ files, save the list for future reference.

After the Rebuild, does the database seem OK? Inspect it for any obvious problems such as lost groups, etc. If it seems right, run Verify & Repair. If there are no problems, run Tools > Backup & Optimize.

Remember that zipped copy of your database? It may come in handy if a number of files failed to be included in the rebuilt database. They should be in the Files.noindex folder inside that database. Copy that Files.noindex folder to, e.g., the Desktop and start mining it for your missing files. I would suggest that you capture them into the rebuilt database in small batches. Look at the Log after each batch to see if one or more failed to import, perhaps because of damage. Periodically, run Tools > Verify & Repair to make sure you haven’t reintroduced the cause of the problems.
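
If you’d rather not unpack the whole archive, you can extract just the Files.noindex folder from the zipped copy (again, “Research” is a stand-in for your own database’s name, and the archive is assumed to contain the package by its relative name):

```shell
# Extract only the Files.noindex folder from the safety copy into a
# "salvage" folder on the Desktop, leaving the rest of the archive alone.
cd "$HOME/Desktop"
unzip Research-backup.zip 'Research.dtBase2/Files.noindex/*' -d salvage
```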

Note that when new data is added to a database, if you have turned on Spotlight indexing, Spotlight will likely go to work indexing the new content.

I doubt that the slowdown is related to RSS feeds, unless you have a great deal of stuff being dumped into the database, triggering Spotlight indexing as well. As a check, turn off Internet access (Wi-Fi) for a while and watch performance.

Wow. Thanks for your thorough and helpful response. I hope it does others some good as well. Thanks!

I’m really frustrated. I rebuilt my large database (described below) and now DT is showing some very strange behavior. Honestly, DT has never been very stable for me, and I’m on the edge of getting rid of it because I MUST have a stable data platform for work purposes. I hate to do that because of what DT is capable of. Here’s the current problem…

I rebuilt the large database. DT exported and then imported. Now the large database opens EVERY time DT opens, EVEN THOUGH a separate, smaller database is set as the default opening database. Stranger still, it opens in the background; that is, the verifying dialog comes up with the name of the larger database, and then the database may or may not be visible.

When I then open the database intentionally, it becomes visible.

Worst of all, DT has slowed to a crawl, with the spinning color wheel coming up for almost every activity.

Should I reinstall DT? Rebuild yet again? Other?

The preference “Startup > Open windows that were open on quit” opens the last used databases too. Therefore either use a different option or close the large database before quitting, e.g. via the sidebar’s contextual menu.

But the only solutions for your performance problems are…

  • add more RAM (1 GB is not much on Leopard)
  • quit as many other applications/processes as possible
  • split the databases, use only one concurrently

In public beta 4, the DT Pro/Office Preferences > General > Startup option “Open windows that were open on quit” results in opening the suite of databases that were open when the application was last quit.

There have been many user requests for this behavior.

That database was already open. File > New window > [database name] would have revealed it.

Assuming that your databases are in good shape (which can be checked using Tools > Verify & Repair), slow performance indicates that your computer has run out of free RAM and is heavily using Virtual Memory.

There are several measures that can be taken in such circumstances.

  1. Install more RAM. That old saw is true: “RAM is good; more RAM is better.”

  2. Adjust the computer environment by reducing the number of open applications. Restarting the computer will free up RAM. Quitting, then relaunching DEVONthink can free up RAM after a series of memory-intensive operations.

  3. Use topically designed databases to keep each one small enough for optimum performance on your computer.

My ModBook (a custom Mac tablet based on a MacBook) has 4 GB RAM. That amount of RAM helps responsiveness not only for my DTPO2 databases, but also for photo or video editing and so on.

Even so, I would experience poor performance were I to merge all my DTPO2 databases into a single very large one. I would also lose the improved focus of Search and See Also operations that results from good topical design of databases.

My main DTPO2 database holds more than 25 thousand documents and more than 35 million total words. It is a topical database reflecting my professional interests in environmental science and engineering, policy issues and laws and regulations.

This is, of course, a pretty broad topic. Contents range across a number of scientific disciplines, from physics and chemistry to geophysics, hydrology, atmospheric sciences, ecology and toxicology and health effects. Policy issues often involve political science, sociology and economics analyses. I often compare legal and regulatory differences in the USA, European Union and some less-developed areas.

I have a second environmental database that deals with specific topics such as environmental sampling methodologies, chemical analytical methods, environmental data evaluation, quality assurance, risk assessment methods and so on.

I’m spoiled. I expect most searches to take milliseconds, and See Also or Classify suggestions to pop up immediately. Were I to merge those two environmental databases, I would encounter performance lags.

But performance isn’t the most important justification for splitting my environmentally-related content into two databases. When I’m researching health and regulatory issues related, for example, to mercury contamination in fish, I don’t want to be deluged with hundreds of references to sampling, analytical and data evaluation procedures.

Topical design of my databases makes my use of them more efficient and productive.