DT Pro 1 database of 3 GB imported to DTP 2, now 0.5 GB

My apologies if this has been asked and answered already.

I’m one of those late adopters. Today I upgraded to DT Pro Office 2 and opened my old database. It was imported with amazing speed, but comparing the old and new databases, there’s a huge size difference.

Scanning such a large DB looking for missing things is not a human forte, but nothing leaps out at me as not being there.

I’m hoping there is some sort of space-saving going on here…

How exactly are you checking the database size?

I’m checking the size in the Finder.

One reason a Finder “Get Info” window for a v1 database might show the size being significantly larger is that it contains Backup* subfolders that aren’t in its corresponding converted v2 database.
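If you want to see how much of the v1 size is down to those subfolders, a rough sketch along these lines could help. It assumes the database is a package (a folder the Finder shows as a single file) and that the Backup* subfolders sit directly at its top level. The function name `package_sizes` and the example path are made up for illustration and are not part of DEVONthink.

```python
import os

def package_sizes(package_path):
    """Return (total_bytes, backup_bytes) for a database package folder.

    Assumption: backup data lives in top-level subfolders whose names
    start with "Backup". Adjust the prefix if your layout differs.
    """
    total = 0
    backup = 0
    for root, _dirs, files in os.walk(package_path):
        # First path component relative to the package root ("." for the root itself)
        rel = os.path.relpath(root, package_path)
        top = rel.split(os.sep)[0]
        in_backup = top.startswith("Backup")
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path):
                continue  # skip symlinks so nothing is counted twice
            size = os.path.getsize(path)
            total += size
            if in_backup:
                backup += size
    return total, backup

# Example use (hypothetical path):
# total, backup = package_sizes("/Users/me/Documents/Research.dtBase")
# print(total - backup, "bytes outside the Backup folders")
```

The totals will not match the Finder figure exactly, since the Finder reports space used on disk rather than the sum of file lengths, but the proportion taken by the Backup folders should be clear.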

Comparing Finder file sizes of a 1.x and 2.0 database can be very deceptive.

Let’s assume that your database holds a lot of text (including rich text with images), HTML and WebArchive documents.

In a version 1 database those files are stored in the monolithic database and have to be loaded into memory when the database is opened. Now suppose you are using the three default internal Backups. That means that you have 4 copies of each of those files and their associated images, and that will ‘bloat’ the Finder file size.

Example: Capture a big Web page with lots of images as WebArchive. Now capture it as plain text. The WebArchive is much larger than the text file. In version 1, that WebArchive would be stored 4 times - in the active database, and in each of the internal Backup folders.
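To put rough numbers on that example, here is a small sketch of the arithmetic. The document sizes are invented for illustration, and the helper `v1_footprint` is hypothetical; the only figure taken from the explanation above is the three default internal Backups.

```python
def v1_footprint(doc_bytes, internal_backups=3):
    """Approximate space one document occupies in a version 1 database:
    one copy in the active database plus one per internal Backup."""
    return doc_bytes * (1 + internal_backups)

# Hypothetical sizes for the same Web page captured two ways
webarchive_bytes = 5_000_000   # assumed: page with lots of images
plain_text_bytes = 50_000      # assumed: text only

print(v1_footprint(webarchive_bytes))  # 20000000
print(v1_footprint(plain_text_bytes))  # 200000
```

With these made-up sizes, the WebArchive accounts for about 20 MB of the version 1 package on its own, which is the kind of multiplication that inflates the Finder size.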

By contrast, the monolithic database in version 2 contains only the indexed text content of your documents plus metadata, so none of the formatting/layout code or images takes up space there or gets repeated in the internal Backups as in version 1. The files themselves are stored in the Finder in their native file types.

Although versions 1 and 2 differ somewhat in how Database Properties (File > Database Properties) reports the number of files and word content, that’s a better way to compare your databases than their raw storage size.

Thanks, Bill, for your clear and helpful reply.

I compared the same database in each version using the Database Properties… and this is what I found:
DTPro version 1

DTPro version 2

The new version has half a dozen files in it that the old one does not, so I don’t expect exact correlations, but what is up with the counts of “unique words/total words”?

The old db has more unique words, but fewer total words, than the new db.

Just to be on the safe side, is it okay to have both versions of DTPro running at the same time? It seems like it would be okay, since they are using different files. Then I can look at them side by side before I delete the old one. I have also found and read your post on the thread Re: Changing my way of using DTPO for research, which reminded me of the backup to archive script, so I have done that rather than relying on my Time Machine backups.

Your posts are so much appreciated, Bill! Thank you.

It’s OK to run both applications briefly in order to compare database content.

Afterwards, remove the older DEVONthink application from the Applications folder. Otherwise, scripts and Services can become confused when adding new content.