Average size of databases you are using

What is the average size of the database(s) you’re using (particularly if you import your info)?

I am trying to get an idea of the typical size at which I should start to be concerned about restructuring or splitting.

I have opted to “import” rather than index, so I know the database is naturally larger, but I wanted it to be portable, etc.

Thanks in advance.

Cletus

My main database is 482 MB, about 1.8 million words. (That includes the automatic backups, which are stored within the database package.) Most items are imported, though the majority are PDFs and are therefore stored as separate files within the database package. Performance is fine (on a 2006-vintage Intel-based iMac), so I haven’t considered splitting it.

Katherine

No need to split it at a mere 482 MB, including backups, and 1.8 million words. It isn’t even straining at that point.

My database is over 8 GB, including backups, and currently contains 64.5 million words in over 77,000 documents, mostly RTFs. I fully expect to exceed 100 million words without any difficulty.

I use my database all day long, every day, and only ever quit it when I need to restart my system.

Hope that helps

Rollo

Holy cow… I guess I don’t need to worry about the size then. I probably will never hit that much, but it is nice to know one has room to breathe…

Thanks for the info.

Cletus

I’m about halfway through importing and organising. My ‘Reference’ database is at 26.7 million words (1.2 GB). It runs exceptionally well.

Mine is just under 1 GB, with just over 17 million words (400K unique).

Rollo: do you get any beachballing with a database that size?

I sometimes find that navigating from folder to folder (three-pane view) tends to produce beachballs, but running ‘Backup & Optimize’ can make things better for a while (until I’ve imported quite a number of items again).

Yes, I can get beachballing if I’m not careful, but nothing too ridiculous. It seems to be more a function of how many apps are running, and which ones they are. If I’m not careful I will have up to 20 apps open at one time, and if those are apps like Safari that build up massive caches, the entire computer can start slowing down.

I also try to back up DT once or twice a day when I’m off to have a meal etc., as I find the backup process itself consumes a lot of processing power and slows the whole computer down to an unacceptable degree. Sometimes after a DT backup the whole computer slows down dramatically if I have a lot of other apps open, and the only answer is to quit everything and restart. If I do that, or close most of the currently running apps, DT speeds up again.

As long as I manage this juggling act, the beachballing is usually not too bad.

Rollo

Interesting. I suspect the reason I see the pinwheel is similar to yours: several other apps open. Maybe not 20, but resource-hungry ones like NetNewsWire, Firefox, Entourage, and PathFinder. With an 8 GB database, I assume you’re doing off-site backups?

Part of the fun in having huge databases is finding stuff you never thought you had. Happens to me every once in a while…

Yes indeed … happens to me all the time … but that probably doesn’t surprise you! DT has truly become my supplementary brain. I throw loads into it and try to organise it all in a logical manner, which means I can go straight to what I’m looking for in a matter of seconds, and that matters to me.

I use many of the same apps as you, but prefer Mail to Entourage. Yes, I do offsite incremental backups of all my data; in theory that happens every day, but in practice it is less often. I also use DT’s internal backup two to three times a day, and I’ve had reason to be glad of that a couple of times. It helps reduce file bloat, and it makes sure I can’t lose everything I’ve been working on.
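
For anyone who wants to automate the offsite part, here is a minimal sketch of one way to do incremental snapshots with rsync. It is an illustration rather than my exact setup, and all the paths are placeholders:

[code]
#!/usr/bin/env python3
"""Minimal incremental-snapshot backup using rsync --link-dest.
All paths are placeholders: adjust them for your own machine."""
import datetime
import pathlib
import subprocess

SOURCE = pathlib.Path.home() / "Databases"          # where the database packages live (placeholder)
DEST = pathlib.Path("/Volumes/Offsite/DT-Backups")  # offsite/external volume (placeholder)

def backup() -> None:
    DEST.mkdir(parents=True, exist_ok=True)
    stamp = datetime.datetime.now().strftime("%Y-%m-%d_%H%M%S")
    target = DEST / stamp
    latest = DEST / "latest"
    cmd = ["rsync", "-a", "--delete"]
    if latest.exists():
        # Hard-link files that haven't changed since the last snapshot,
        # so each run only stores what is new or modified.
        cmd.append(f"--link-dest={latest.resolve()}")
    cmd += [f"{SOURCE}/", str(target)]
    subprocess.run(cmd, check=True)
    if latest.is_symlink():
        latest.unlink()          # repoint "latest" at the snapshot we just made
    latest.symlink_to(target)

if __name__ == "__main__":
    backup()
[/code]

Run something like that from cron or launchd once a day, and every dated folder looks like a full copy but only costs the disk space of whatever actually changed.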

Rollo

Hmmm… maybe I should start making regular backups part of my routine as well. I normally do a daily clone of my entire hard drive at both my home office and at work, and have been lucky thus far.

I can’t wait until version 2.0 provides search parameters not unlike Agent, which is the rumour. That will be truly excellent.

[quote="Part of the fun in having huge databases is finding stuff you never thought you had. Happens to me every once in a while…[/quote]

That is an interesting observation. Another question I was going to ask is how everyone goes about their folder structure. Since I typically import, I try to keep the folders much like my hard-drive folders and then synchronize.

I do try to avoid piling subfolders upon subfolders, though; I keep my parent folders right around 10. I have five more that I don’t sync, which typically contain videos or PPT/Keynote presentations that I would never look at in DTPro anyway, so I avoid bringing them in.

It would indeed be interesting to see screenshots of how others have DTP set up.
Cletus

Seeing a screenshot of my top level would tell you little about the underlying structure in which I store my 77,000 documents. I also use quite a few replicants where I need to access the same material under different headings. Suffice it to say I have 32 top-level folders, about half of which have a major structure of subfolders 3-4 levels deep. The others are at the top level only for convenience, because I’m always accessing their content.

Rollo

I have 11 top-level folders, but only 4 or 5 get regular and heavy use. Depth is usually fairly minimal in most of them, but a few drill down to 6 or 7 subfolders. For me, anything more gets a little confusing, but that’s just a personal preference.

I don’t think I have any replicants, but I do have my entire “research” folder (I’m an academic) indexed in DTPO, which means all journal articles, reports, etc. are indexed. This accounts for the bulk of my word count, I’m sure.

FWIW, the blog AcademHack has a recent post on tracking down errant references that is interesting: http://academhack.outsidethetext.com/home/?p=190

David

It is around 1.2 GB with its single backup: 1.6 billion words.

The screenshot at pictures.tmttlt.com/main.php?g2_itemId=23261 is a few weeks old now, but has the basic info.

You can’t be serious. Do you mean 1.6 MILLION or 1.6 BILLION? A billion is one-thousand-million, i.e. 1,600,000,000 words.

Rollo

Must be a typo, judging from the screenshot he posted.

I have 973 groups, 33 rich texts, and 0.5 MB total size :smiley:

I’m rebuilding the database from scratch, which I do fairly often. I typically get up to about 5 GB and then restart so that I can incorporate new organizational ideas, eliminate unnecessary or only temporarily interesting information, and so forth, which cuts the database size down to about 2.5-3 GB. What I’m working with has to make sense in my head, not just in the database, so I often have to restructure massively to keep the system coherent. Most of the time it’s easier just to start over and import from a file-and-folder export.
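
If you want to sanity-check an export before reimporting it, a rough sketch like this can tally the size and word count of the exported folder tree (the export path is a placeholder, and the counts for marked-up formats such as RTF are only approximate):

[code]
#!/usr/bin/env python3
"""Rough size/word tally for a file-and-folder export (placeholder path)."""
import os
import pathlib

EXPORT = pathlib.Path.home() / "DT-Export"      # placeholder export location
TEXT_EXTS = {".txt", ".md", ".rtf", ".html"}    # only these get word-counted

total_bytes = 0
total_words = 0
for root, _dirs, files in os.walk(EXPORT):
    for name in files:
        path = pathlib.Path(root) / name
        total_bytes += path.stat().st_size
        if path.suffix.lower() in TEXT_EXTS:
            # Crude count: markup in RTF/HTML inflates the number somewhat.
            total_words += len(path.read_text(errors="ignore").split())

print(f"{total_bytes / 1e9:.2f} GB on disk, roughly {total_words:,} words")
[/code]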

I have to admit that my database does get sluggish fairly quickly, but I’m fairly certain that my 1.25 GB of RAM is to blame. I’m on a MacBook, so I can get another 1.75 GB in, but I don’t know when I’ll have the money.

There are 1.4 million unique words and 1.6 billion total words; the screenshot says 1.4 billion total words because it is a few weeks old.

Good catch. Thanks for correcting me.