How big can a DEVONthink database be?

Can I use DEVONthink to manage files on my Mac?

Basically, dump all my movies/songs in there. Is this an advisable usage?

Since the files are stored in the database and not accessible outside of DEVONthink (is this correct?), is it safe?

Contrary to v1.x, DEVONthink 2 doesn’t modify files/documents. All files/documents are stored inside the database package (DEVONthink Pro) or in the database folder (DEVONthink Personal). This is as safe as the filesystem, of course.

In addition, file size doesn’t matter, but more than 100,000 items or 100 million words per database is only recommended with lots of RAM and a fast computer.

Thank you. I think I will use Together for file organization and DEVONthink for research work :slight_smile:

What’s generally not considered “lots” and “fast” nowadays? Maybe under 4GB RAM? But every Intel Mac seems fast relative to my older G4/G5 systems. :slight_smile:

As a practical matter, “lots” and “fast” boil down to what one considers acceptable performance when working with a database on a given computer.

I often say that I’m spoiled, as I want most of my single-term searches to complete in 50 milliseconds or less, and I want See Also suggestions to pop up instantly. I never want to see a spinning ball, which would indicate that free physical RAM is no longer available and Virtual Memory has started swapping data between RAM and swap files on disk. That slows things down, as read/write to disk is orders of magnitude slower than RAM. So available free RAM is much more important than CPU speed in providing acceptable database performance.

I often switch between two computers, a ModBook (based on a late 2007 MacBook) with 2.4 GHz Core 2 Duo CPU and 4 GB RAM, and a 27-inch iMac with 2.8 GHz quad-core i7 CPU and 8 GB RAM. The ModBook has a 256 GB solid state drive instead of an ordinary hard drive, which somewhat reduces slowdowns caused when RAM gets exhausted. I want database operations to be fast on both computers.

Topically designed databases help keep the size of my databases within a range that gives acceptable (to me) performance. The ModBook holds its own so long as I keep the total word content of the open databases under about 40 million words.

I’ve got two databases that reflect my professional interests in environmental science, technology, policy issues and laws and regulations. I spend a lot of time working with them. If I were to combine them, I would get poor performance on the ModBook, and possibly even on the iMac.

My main database contains about 25,000 references including papers, books and reports in many areas of science and engineering (chemistry, toxicology, ecology, etc.), environmental policy issues and laws and regulations (primarily in the U.S. and European Union) as well as thousands of my own notes.

Another environmental database contains many thousands of references that cover methodological matters such as sampling methodologies, chemical analytical procedures, quality assurance manuals, data evaluation procedures (statistical methodologies, etc.), risk assessment, cost-benefit analysis and so on. Again, these references include those in common use in the U.S. and the European Union. (I’ll need to update this database soon, as it has been more than a year since the last update.)

Aside from the performance benefits of subdividing my environmentally related materials, there’s another important benefit. The example I often cite is that if I’m researching the health effects of human consumption of fish that contain mercury contamination, I want to see references dealing with toxicology, case studies, sources and pathways of mercury pollution, standards and regulatory approaches. But I don’t want to see hundreds of references about how to collect samples, prepare them, analyze them and evaluate the resulting data. By subdividing my environmental materials I’ve improved the focus of searches and See Also. The topically subdivided databases work more efficiently for me, and that would be true even if memory constraints did not exist.

Once in a while I may find it useful to open both databases and search across them. I can create, as it were, informational Lego blocks by assembling open databases for a purpose. In this case I may see a spinning ball, but I can achieve an objective if I’m patient. :slight_smile:

If I do see a spinning ball, I’ll quit and relaunch DT Pro Office to clear some memory. If I’m pushing the RAM hard, I’ll restart the computer to clear RAM and the Virtual Memory swap files for a fresh start.

In 2009, cgrunenberg suggested that 100 million words would require “lots of RAM and a fast computer.” I wonder if this recommendation still stands today (2016). My database holds 501,376,086 words. Would that explain the slow speed and why it crashes so easily? If so, then it is time to split databases. Bummer! I love keeping everything together.



Yes, the suggestion still stands. (I actually suggest an even more moderate 40,000,000 words / 4,000,000 unique words per database.)

Smaller, more focused databases will generally perform better, Sync faster, and be more data-safe in the event of a catastrophe (avoiding the “all your eggs in one basket” problem).
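For anyone wondering whether an existing folder of material would blow past those comfort limits before importing it, a rough estimate can be scripted. This is a minimal sketch, not anything DEVONthink-specific: it only looks at plain-text files, `word_stats` is a hypothetical helper name, and DEVONthink’s own indexer will tokenize differently, so treat the numbers as ballpark figures.

```python
import pathlib
import re

# Suggested comfort limits per database (from the reply above).
WORD_LIMIT = 40_000_000
UNIQUE_WORD_LIMIT = 4_000_000

def word_stats(folder):
    """Return (total words, unique words) across .txt files in a folder.

    Rough estimate only: skips non-text formats (PDF, RTF, etc.) and
    uses a simple \\w+ tokenizer, unlike DEVONthink's real indexer.
    """
    total = 0
    unique = set()
    for path in pathlib.Path(folder).rglob("*.txt"):
        words = re.findall(r"\w+", path.read_text(errors="ignore").lower())
        total += len(words)
        unique.update(words)
    return total, len(unique)
```

Running `word_stats` on a candidate folder and comparing the results against `WORD_LIMIT` and `UNIQUE_WORD_LIMIT` gives a quick sense of whether the material should be split across databases.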

Thanks, Bluefrog, for such sobering advice. What if some databases have a higher number? How do you recommend we manage them to keep them from malfunctioning? I am looking at one that is double the size you recommend, about 80,000,000 words.


Nothing to be sobered by. It is a comfortable limit, not a hard one. Can you exceed it? Sure. By how much? Other than available resources, there is no exact limit (nor is there a 1-to-1 correlation between RAM, words, etc.).

I work in Support, so I will err on the side of best performance and data safety. I would rather have a user stick to smaller, more focused databases for the reasons I mentioned. If he chooses to go beyond that, that’s his choice. It does not guarantee problems, but if problems arise, it is one factor to consider.