Slow database initialization

Hi,

I am running the latest version of DEVONthink Pro Office on a MacBook with a 2.4GHz Intel Core Duo and 8GB RAM.

When launching Devonthink to open a 22GB database consisting mainly of PDFs and images, it takes, after a reboot (with all memory supposedly fresh and nice), about 4-5 minutes for the initialization of the database to finish.

Is this considered normal and ok?

Is there anything I can do to optimize launch/initialization speeds?

Sincerely,
Frode

22 GB is a lot of data. Over here, when DEVONthink Pro Office starts up I normally launch 5 or 6 databases with a combined total of 13 - 14 GB (about 60% of your load). On a MacBook Air with 8GB and an SSD, it takes 15 seconds for DEVONthink to come up and load those databases. Even adjusting for SSD vs. a standard drive, I would expect launches faster than 4 minutes with your data load.

Are you running tight on space on your machine overall?
Is there anything else loading at startup that might be a hog?
Have you checked Activity Monitor right after launch to see what’s taking up RAM and disk resources?
Is it feasible to trim your database size by splitting it into 2 or 3 standalone databases?

It’s easy to create a test database with, say, 1 GB of data and another with more data, and test the launch times with DEVONthink opening just those databases. Though your database is very large, and that could be the main factor, the ultimate answer will depend on what else is going on with your machine.

While this has not helped in all cases similar to yours, it helps sometimes: try deleting DEVONthink’s cache (Menu bar > DEVONthink Pro Office > “Empty cache”).

And: more than the pure size of the database, there is a maximum number of documents and words that a single database should hold, at least as a recommendation. I think it was around 200 thousand files and 3 million words.
You can see the information about these values in the “database properties”. Maybe you want to report your word and document count here, so that the experts here can give you advice, if splitting your database should be considered.

Thanks for the replies!

The database consists of only 15 000 files, but OCRed images.

It contains 450 000 unique words and 3 991 814 words total.

I’ll try going through the suggestions you guys have and see if it helps.

f.

In my case, my database is 12.2gb with 4400 files (half of which are PDFs - many OCRed and indexed). It has 1.3 million unique words and 61.5 million total, running on a 2009 macbook pro with 4gb ram.
I noticed a substantial slowdown in initialisation when I updated to mountain lion (or perhaps to osx lion…). I don’t think I updated Devonthink at the time so my conclusion was that Devonthink simply initialised more slowly in the new OS. I also noticed, not anymore though, that certain devonthink processes remained active after quitting (don’t use the sorter).
I’ve just timed how long it takes to start: 5 sec.
This is after having recently quit the application however. When I start it up for the first time after restarting the computer, it takes something like 30 sec. I didn’t notice any difference in start-up times before the OS upgrade.

joao.

The most important measure of database size is usually the total word count, not the file storage size.

Your database with 61.5 million total words is larger than I would usually try to run on a Mac with 4 GB RAM. (My rule of thumb with 4 GB RAM is to usually hold the total word count of all open databases to 40 million words or less.) I suspect you encounter slowdowns and the beach ball during operation. That’s because when free RAM is used up, the computer begins swapping data back and forth between RAM and Virtual Memory swap files. Because disk read/write accesses are orders of magnitude slower than in RAM, processes slow down.
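To make the rule of thumb above concrete, here is a minimal Python sketch that compares total word counts against a budget. The 40-million-word figure for 4 GB RAM comes from this post; the function name `words_over_budget` is made up for illustration and has nothing to do with DEVONthink itself.

```python
# Rule of thumb from this thread: with 4 GB RAM, keep the total word count
# of all open databases at 40 million words or less.
WORD_BUDGET_4GB = 40_000_000


def words_over_budget(word_counts, budget=WORD_BUDGET_4GB):
    """Return by how many words the open databases exceed the budget (0 if within it)."""
    total = sum(word_counts)
    return max(0, total - budget)


if __name__ == "__main__":
    # Total word counts reported by posters in this thread.
    print(words_over_budget([61_500_000]))  # the 61.5 million word database
    print(words_over_budget([3_991_814]))   # the 3,991,814 word database
```

Pass one word count per open database (the figures are shown in the database properties). A result above zero suggests closing or splitting databases is worth considering on a 4 GB machine.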

You can check the status of RAM and swap files by launching Activity Monitor (Applications > Utilities > Activity Monitor.app). Click on the System Memory tab. Low free RAM and lots of pageouts would indicate current or impending slowdown.
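If you prefer the Terminal to Activity Monitor, the same figures can be read from the `vm_stat` command that ships with OS X. The following is only a rough sketch, assuming the usual `vm_stat` text layout (a header line with the page size, then `Name: number.` lines); the helper names `parse_vm_stat`, `free_megabytes` and `read_vm_stat` are my own, not part of any tool.

```python
import re
import subprocess


def parse_vm_stat(text):
    """Turn vm_stat text output into a dict of counter name -> integer value."""
    stats = {}
    for line in text.splitlines():
        # Lines look like 'Pageouts:      120.' (some names are quoted).
        match = re.match(r'^"?([^":]+)"?:\s+(\d+)\.?$', line.strip())
        if match:
            stats[match.group(1).strip()] = int(match.group(2))
    return stats


def free_megabytes(text):
    """Estimate free RAM in MB from 'Pages free' and the page size in the header."""
    page_size = int(re.search(r"page size of (\d+) bytes", text).group(1))
    return parse_vm_stat(text)["Pages free"] * page_size / (1024 * 1024)


def read_vm_stat():
    """Run vm_stat and return its output (works on a Mac only)."""
    return subprocess.run(
        ["vm_stat"], capture_output=True, text=True, check=True
    ).stdout
```

On a Mac, `parse_vm_stat(read_vm_stat())["Pageouts"]` gives the pageout count since the last restart. If that number keeps climbing while you work, the machine is swapping.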

When performance becomes unsatisfactory a Restart of the computer can speed it up again, at least temporarily. There are utilities that can remove “crud” inactive data from RAM, and so provide more available free RAM – rather like removing a blockage from an artery. I use Cocktail for OS X maintenance, and it has a routine that can do that.

You might consider splitting the database into two or more topically designed databases, as each would probably fit more comfortably into the RAM space on your Mac and reduce the potential for slowdown. I treat such topical databases like information Lego blocks that can be opened or closed as needed.

If your memory is already strained, the safest approach to splitting a large database would be to select the content that is to be moved, choose File > Export > Files & Folders, and create a new folder to hold the exported items. When the export is complete, delete the still selected items that had been exported.

Next, create and name a new, empty database and Import (File > Import > Files & Folders) ALL the contents of the folder that holds the exported content.

Note that when DEVONthink opens a database, it must load into memory information about the text index and other metadata such as groups and tags. But it does not load documents into memory unless there were documents open in their own windows when the database was last closed. If large documents are loaded into memory when the database is opened, that would increase initialization time.

Thanks for the reply about Devonthink’s memory requirements Bill.
I never actually thought DT required so much ram, but I think it helps explain a few things.
I tend to do a lot of work with memory-intensive apps (CAD, GIS, 3D and GFX). I noticed when I upgraded to Mountain Lion, all these apps seemed to be more memory hungry than in Snow Leopard. I am assuming that Mountain Lion (whilst much more stable than Lion) is much more memory hungry an OS than SL. This probably drove a few apps (including DT) just over the edge.
On the other hand, I still think that there might be some issues with DT running under Mountain Lion. This is because, whilst the initialisation tends to be much slower when opening DT for the first time, once it opens it runs at full speed - I don’t see any lags or beachballs at all.
I’m actually quite happy with its speed in a 4GB machine. Just a bit unhappy with how long it takes to start up - and this is a reflection of the type of program DT is. It’s an acceptable delay in startup for specialised apps, less acceptable for what is in effect a Finder replacement which you want to start quickly when you need to clip something, etc… Perhaps this can be fixed by having the sorter on all the time? Maybe it takes up less memory in the background? - if so, I wish I could have the sorter on to clip documents to devonthink quickly, without it actually being open on the screen (the tab gets in the way of virtually every app but the finder).

On a side note, I’ve been eyeing with interest the new Devonsphere. Would you say it’s a good replacement for DT (fixing the issue of it being a jack of all trades and master of none (but search)), and would you say it can handle such a database on a 4gb machine (or am I better off just having DT open most of the time)?

joao.

My own practice is to keep DEVONthink Pro Office and the set of databases on which I’m currently working open all the time.

I like to work on laptops. Until fairly recently 4 GB RAM was the maximum available, and I had to tailor my habits to that limitation.

I’m in hog’s heaven with my MacBook Pro Retina, with 16 GB RAM, i7 CPU and 500 GB SSD. Fast, fast, fast and currently 9 GB free RAM headroom and 0 pageouts.

Back in the day, adding RAM was expensive, especially at Apple’s prices. The base RAM of the MacBook Pro Retina I ordered was 8 GB. Kicking that up to 16 GB added $200, which I chose to do. I remember paying that much to add 2 MB RAM to a PowerBook! And I paid more (in 1993 dollars) for that PowerBook than for the new MacBook Pro.

After a few days of trying this and that and then some … I’ve simply decided to keep my database open most of the time. I still have a very long initialization time, but since I experience no noticeable lag while actually using the application, I’ll finish this project without stressing any more about this.

And thus, life is good.

Thanks for the responses!

But … I shall return … with an array of frustrations regarding image-rotation, OCR-whatnots and viewing-garble. So rest now while you can …

I have recently started running DTPO on my MacBook Air with 4 gig RAM but with the databases on a USB 3.0 thumb drive.

I have started with just my small databases (2 to 4 gigs). They run blindingly fast. I am amazed. But why? Am I to understand from Bill’s answer above, that DTPO loads the DB into RAM?

And, can anyone guess what will happen if I start running my 22 gig DB from a USB drive? I assume it will be slow. I’m planning on buying a 64 gig Kingston HyperX, the fastest USB drive I can find. But whether it will be painfully slow, I don’t know. I’m already looking at how to split it up into smaller, more focused databases.

Thanks. Just trying to understand better.

cheers!!!

Ryan Nagy