I’m looking for limits on how many words, documents, tags and such I can use in a single database with DTPO, as well as its web server. I’m setting up a machine exclusively for my DTPO use and want to have the ideal setup. Cost is important, but since this is my livelihood, I have a relatively large budget.

I’ve found mention of 100 million items and 10 million words unless I have a “fast” computer with a lot of RAM. I’m purchasing a new Mac Pro with 128GB RAM (Apple’s BTO max is 64GB, but OWC/MacSales sells a 128GB upgrade).

Can DTPO take advantage of a large number of cores/threads? If it can, I’ll purchase the 12 core/24 thread version. If not, I’ll get the quad-core.

Does DTPO or the OCR feature use OpenCL? If so, I’ll get the D700 GPUs; if not, I won’t.

I plan to store the database on a Thunderbolt2-attached SSD RAID external NAS. Does DTPO require much/any space on the system/boot drive? I’m planning to get the 256GB internal drive. DTPO will be the only application installed, so 200GB will be sitting there idle. If DTPO can make use of it, I’ll upgrade to the 1TB. Are there temporary files or swap files that can make use of the boot drive?

Is there any way to calculate the impact of a large PDF and/or items in a database? For example, for each 1000 unique words in a PDF, you should have 1MB of RAM; for each 1000 items or tags, you should have another 1 MB of RAM.

Are there any options I can tweak or suggestions on how I can best configure DTPO for maximum performance on large databases?

Finally, how many connections does the DTPO web server handle? I’d like to put the Mac Pro in our server room and access it from my desk, or over screen sharing. Is it safe to share my password with my coworkers so they can access my database too, or is that pushing it? I know it’s not multi-user, but if they try to access it while I’m doing something, will their login be rejected or will I lose data/get disconnected?

Usually up to 300 thousand items and up to 300 million words per database should be okay.
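
If it helps with planning, here is a rough sketch of how those two figures could be turned into a minimum number of databases for a larger collection. The function name and the even-split assumption are mine, purely for illustration; the figures are guidelines from this reply, not hard limits enforced by DTPO, and real-world performance will also depend on the hardware.

```python
# Guideline figures quoted in the reply above (not hard limits enforced by DTPO).
MAX_ITEMS_PER_DB = 300_000
MAX_WORDS_PER_DB = 300_000_000


def databases_needed(total_items: int, total_words: int) -> int:
    """Return the smallest number of databases that keeps each one within
    both guideline figures, assuming (hypothetically) that items and words
    can be split evenly across databases."""
    by_items = -(-total_items // MAX_ITEMS_PER_DB)  # ceiling division
    by_words = -(-total_words // MAX_WORDS_PER_DB)  # ceiling division
    return max(1, by_items, by_words)


if __name__ == "__main__":
    # Example: one million items containing 500 million words in total.
    print(databases_needed(1_000_000, 500_000_000))
```

In this example the item count is the tighter constraint, so the split is driven by items rather than words.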

Support for multiple cores is currently quite limited, so the quad-core should be sufficient.


Only the application, the OCR resources and the global inbox have to be located on the startup volume; all other databases can be located anywhere.

Performance should usually be limited only by the network, so several users should be able to access it concurrently.

Everything should be fine as long as they’re not deleting/modifying the file you’re currently working on.

