DB's structure

I am using a 2015 MacBook Pro.
I have a main DB and about five topical DBs that I use intensively.
My questions are about my main DB.
I currently work in three languages (40% French, 50% English, 10% German).

I make intensive use of groups, smart groups and replicants.
My structure evolved as I went along.

My statistics are as follows:

Groups: 23,715 (746 replicants)
Smart groups: 1,110 (8 replicants)
Web archives: 79
Plain text: 619
HTML pages: 556
RTF: 59,218 (2,709 replicants)
Images: 16,699 (38 replicants)
PDF: 11,246 (1,392 replicants)
Bookmarks: 1,093 (57 replicants)

Total: 125,102 (4,952 replicants), 18.3 GB

Words: 1,578,585 unique; 248,097,700 in total

Remarks:

  • I have no speed problems whatsoever
  • Everything is running smoothly
  • I run a verify and repair every two weeks

Questions:

  1. I have no idea whether this DB would be considered big or average. I would like to anticipate any growth-related issues: do you think it can double or triple in size without problems?

  2. I do not use the AI, only Boolean search. In effect I am the AI myself: I classify everything into the appropriate group and then use replicants.
    Now I am wondering whether I could use DT in another way. If my main DB is considered too big, I intend to split it in two and reorganize everything. Or should I do that anyway?
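To put question 1 in numbers, here is a rough sketch that scales the totals listed above by two and three. It assumes purely linear growth, which is only an approximation: the unique-word count in particular would not scale this way, so it is left out. The function name `project` is hypothetical and has nothing to do with DEVONthink itself.

```python
# Totals taken from the statistics above.
ITEMS = 125_102
SIZE_GB = 18.3
TOTAL_WORDS = 248_097_700


def project(factor: int) -> dict:
    """Scale the reported totals linearly by `factor` (an assumption, not a forecast)."""
    return {
        "items": ITEMS * factor,
        "size_gb": round(SIZE_GB * factor, 1),
        "total_words": TOTAL_WORDS * factor,
    }


if __name__ == "__main__":
    for factor in (1, 2, 3):
        p = project(factor)
        print(
            f"x{factor}: {p['items']:,} items, "
            f"{p['size_gb']} GB, {p['total_words']:,} words"
        )
```

At triple the size this would mean roughly 375,000 items and about 744 million words in a single database.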

Thank you for your informed comments.

I would consider this database “big”. If it were mine, I would look at strategies to segment it into two or three smaller databases. I would not plan to keep growing it indefinitely. See Bill’s article here for some related thoughts.

As databases grow, maintenance and backup become increasingly important: 1/ remove unneeded data or rarely used files by moving them to an archive. 2/ frequent Verify & Repair. 3/ occasional rebuild. 4/ occasional export of all contents to an external drive. 5/ regular backup to multiple destinations, especially off-site.

This is definitely a large database at 248 million words. I don’t know how much RAM you have in your machine, but it sounds like it is running well. It’s also nice to see you doing regular health checks. (I hope you’re just as diligent with your primary backups :smiley: )

As korm pointed out, splitting would not be a bad idea. Smaller, more focused databases will generally perform better, Sync faster, and be more data-safe in the event of a catastrophe (avoiding the “all your eggs in one basket” problem).

That being said, I am also curious, in an academic sense, what would happen if you continued on the path you’re on. I’m not advocating that you keep growing your database for my curiosity’s sake. It is interesting that you feel it’s performing well with such a large index, and I’m wondering what it would take for performance to degrade on your machine. Just idle chit-chat. :mrgreen:

Thank you for your answers.
I am running the standard (?) MacBook Pro with 16 GB RAM, El Capitan and a 500 GB SSD.

Yes, I am very serious about my backups, but after reading your posts I am going to segment the database ASAP …

Take your time … breaking apart a database, especially with replicants, takes careful thought.

Yes, absolutely. I am thinking about how best to organise things. It is not so simple, actually, and I keep changing my mind (LOL) …