Inbox vs databases

martinrice1 · March 19, 2017, 11:58am

I’ve just imported about 1600 items into DTOP. They’re all in my inbox. I don’t have any databases set up. The 1600 items are distributed into 49 folders. Here are some questions I have:

Are these folders groups?

Is the inbox itself a database?

Is it effective to just use the inbox with folders (groups?) as my organizing practice, or is it better to create databases and move these folders into them?

TIA

scottlougheed · March 19, 2017, 3:06pm

In DEVONthink, Folders = Groups. Groups/Folders in the Global Inbox (that’s the “Inbox” you’re referring to, since individual databases have their own “Inboxes” as well) are the same as anywhere else in DEVONthink.

The Global Inbox is a database and within DEVONthink, it functions much like any other database. However, the Global Inbox functions in some different ways from user-created databases. It is not transportable† like user-created databases. User-created databases are represented on your computer by a .dtBase2 file. This file can be stored anywhere* including external drives.

In theory you could get away with using only the Global Inbox as your one and only database, but there’s really no benefit to this. (I’d argue that if you wanted to attempt a 1-database workflow, you’d be best off creating a single user-created database and attempting it that way.) But this ultimately gets into an active debate regarding how many databases is appropriate.

Personally, I see the global inbox as simply a staging area – incoming files go to the global inbox so I can capture things throughout the day without having to devote a lot of time to thinking about where it belongs while I’m in the moment. Then I take time to move things to the appropriate database’s inbox, and from there I sort it to the appropriate group (or let the AI sort it for me, depending on the situation).

I generally like to have NOTHING live permanently in the Global Inbox, and this is a strategy you can use regardless of how many databases you have, whether you choose to create only one single database or many more focused databases.

The advantage to not living entirely in the Global Inbox is that it is relatively easy to duplicate a database and try out new organizational schemes in a non-destructive way, or to retrieve a database from a backup should something go awry. While the Global Inbox database is backed up along with the rest of your system (assuming you have a backup!), it’s nestled way in the library/application support nether regions and is thus a bit more cumbersome to track down and restore. A user-created database lives where you want it to.

Hope this helps!

†Not transportable as a database file – it CAN, however, be synced using DEVONthink’s built-in sync functionality.

*Except for directly in cloud-synced folders. If you want to sync database contents between computers or with iOS you use the Sync function built into DEVONthink. You cannot sync a database by storing the .dtBase2 file in a cloud folder.

ˆa common workflow for me is to scan a document with my phone using Scanner Pro, and import it into DEVONthink To Go, and in many cases this import process defaults to the global inbox

martinrice1 · March 19, 2017, 3:46pm

Scott, thanks for this in-depth statement about the global inbox vs individual databases. It explains a lot for me and makes a great deal of sense.

In a certain sense, it will be easy for me to create a database organizational scheme, given that the 1600+ items that I imported from Evernote are already nicely organized into groups and I can divide these groups up logically into some different databases.

But before I begin, I’ll take a look at those links you included about database organization.

I understand exactly what you mean when you explain how you use the global inbox. Both in OneNote and Evernote I had a notebook named inbox that I’d throw everything into over the course of the day and then later distribute the items to the appropriate notebooks.

Again, thanks for this clarification about the global inbox. I appreciate the time you took to put it together.

scottlougheed · March 20, 2017, 12:50pm

Glad my post helped!

Picking how many databases to use is a very personal choice that depends a great deal on how you work and the type of data you work with.

One major consideration is that the AI/See Also/Auto Classify features only work within the same database, and not across databases.

I think organization and retrieval is really the major consideration when thinking about number of databases. There are a couple main considerations related to this:

False positives in the See Also feature (that is, items that the AI thinks are related but aren’t related according to you. This can happen if you have a lot of material that is unrelated but which might use the same words.)
Mis-classification in auto-classify (auto-classify either puts something in the wrong place according to you, or is unable to classify because it cannot identify a clear-cut group to send the file to. This is both an issue of the overall relatedness of your database and of your chosen classification scheme – that is, this issue can just as easily arise in a small but badly organized database.)
False positives or difficult-to-parse search results (while the number and relevance of search results is, of course, also a product of your query, having a lot of files of questionable relatedness together can muddy results, especially if you are using a fairly vague query). It is also worth noting that the big ⌘-Shift-F search window will search all open databases if you so choose (this is the default), so you CAN perform manual searches across all databases.

But all of the above considerations depend a great deal on how you see your data and how you use it. Two people might choose to treat the same set of data very differently and see merit in consolidating it into one database, or separating it into several focused databases.

Database SIZE (both file size and number of files) seems to have a very high upper bound before any performance degradation takes place. Database performance has been, for me, rather stable regardless of the size. For example, I have a database of 1800 files, predominantly PDFs, totaling 35,000,000 words with a size on my disk of 3.1 GB. This database opens rather quickly and navigating through it is effortless. I know this is far from the largest database that users on here have, but this library of files has given some other applications issues, though not DEVONthink. I feel confident that this database can grow to probably several times its current size before I begin to see any issues at all.

The beautiful thing is, moving things into and out of databases, and splitting up or consolidating databases, is actually really easy. I recently went from about 9 databases to 4. Part of this was simply deleting unneeded files, but a large part of it was consolidating databases that worked just fine together – where their being in the same database enhanced retrieval and search, because the data had originally been split across databases. (related: Tips for splitting a database)

martinrice1 · March 21, 2017, 1:30pm

Thanks so much for your thoughts on database organization.

As it turns out, my needs are quite modest. I don’t do any type of academic research anymore. Basically, I now use DT primarily for filing a bunch of stuff that pertains to the family and my current activities and interests, three of which I’m deeply involved with.

After thinking about it a bit, it looks as though I’m just going to need about 4 databases to keep things organized. I won’t really have any need for features such as See Also or auto-classify or even searching all databases, though the ability to search all open databases manually could come in handy.

I expect that having a box for each database in the sorter will work well for most of my filing needs along with my bookmarklets for when I’m on the web.

Fortunately, as you point out, manipulating the databases is easy; if I find out later that my current assumptions about my usage are wrong, I’ll always be able to make necessary adjustments.

Thanks again for the information.