Welcome @FrodeW
The first batch of data that I am looking at putting into DT3 takes 168 GB of disk space for a total of 130,611 files.
That is a considerable amount of data.
Due to the large amount of data I have decided to split the static reference documents into several databases based on topic and archive provenance.
That is not a bad idea at all, especially if DEVONthink To Go is, or will be, involved at some point.
Basically I would have liked to replicate documents across these databases and my project and research database; however, I understand this is impossible.
Correct. You can’t replicate across/between databases.
You advise users to duplicate the files into the project or research databases, which of course is a sound and valid solution.
This is possible, but it’s situational. It may be a good option in some cases, but not necessarily in others.
However, I have found that I can do a pseudo-replicate by adding a reference document’s item link to a bookmark in the project database. This seems to work fine, but are there hidden problems with this approach that are going to hit me later down the line?
Pseudo indeed, but if it serves your purpose, I see nothing of concern.
How many words and unique words there are in those files I have no idea, so I was worried that this amount of data would exceed the DT3 recommended limits of 200,000,000 words and 4,000,000 unique words in a database. I have no experience in judging this.
Select the open database in DEVONthink and choose File > Database Properties to view the statistics about a database.
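If you want a rough idea before importing anything, you could estimate the counts outside DEVONthink. The following is only a minimal sketch under some loud assumptions: it reads plain-text files only (the `.txt` and `.md` suffixes are just an example), it splits words with a simple regular expression that will not match how DEVONthink itself counts words, and the names `count_words` and `within_limits` are hypothetical helpers made up for this illustration. Treat the result as a ballpark figure, not as what Database Properties will report.

```python
import re
from pathlib import Path

# Simple tokenizer: runs of letters, digits and underscores.
# Assumption: this is NOT DEVONthink's own word-splitting logic.
WORD_RE = re.compile(r"\w+")

# Limits as quoted in the question above.
MAX_WORDS = 200_000_000
MAX_UNIQUE_WORDS = 4_000_000


def count_words(root, suffixes=(".txt", ".md")):
    """Return (total_words, unique_words) for plain-text files under root."""
    total = 0
    unique = set()
    for path in Path(root).rglob("*"):
        # Skip folders and anything that is not a plain-text file.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        words = WORD_RE.findall(text.lower())
        total += len(words)
        unique.update(words)
    return total, len(unique)


def within_limits(total, unique):
    """True if both counts are at or below the quoted limits."""
    return total <= MAX_WORDS and unique <= MAX_UNIQUE_WORDS
```

You would call `count_words` with the folder you plan to import and pass the two numbers to `within_limits`. PDFs, Word documents and scanned images are skipped entirely, so for an archive made up mostly of those the estimate will be far too low.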