New Database Strategies in 2.0?

Michael_Jennings · December 19, 2008, 2:05am

I’m sure I’m not the only user who, confronted with the possibility of multiple open databases in DT 2.0, isn’t sure how to proceed. I’m eager to hear how others are breaking down their single, presumably enormous, databases!

ndouglas · December 19, 2008, 2:22am

I split my primary project into two “layers” (and I also downloaded DEVONnote and made it part of my workflow). And I’m now creating databases for things for which I didn’t use DTP at all… but other than that, the new strategies have been more based around smart groups and the other features of DTP 2 than around the multiple database capability.

chatoyer · December 19, 2008, 2:38am

Have been thinking about this very question all day. For now, I’m sticking with one large database. It has served my needs thus far, so I don’t see a pressing need to split databases. The new search features, which I’ve yet to fully try, are MUCH more appealing.

KeithKendrick · December 20, 2008, 6:22pm

I am very interested in this question also, but I don’t have much experience yet. My suspicion is that there isn’t much of a downside to having a few databases - although Bill would probably have some good thoughts on that. The upsides to splitting things up have largely to do with efficient searching, and with moving or backing up to different drives, I think. So, I am probably going to have separate databases for collections of information that I would typically search separately - e.g. personal vs. work; or even have a separate personal database having to do with particular not-for-profit stuff that I’m interested in.

A related difficult question, I think, is how much (folder) structure I should use within a particular database. It seems to me that version 2 might enable a much flatter structure because of the ability to attach specific metadata to each file as it is created or accessed. Am I right about that?

I would be very interested to know the thoughts of others on both of these design questions.

Bill_DeVille · December 20, 2008, 7:59pm

I’ve talked before about how I set up topical databases. For example, I have a main database dealing with environmental science, regulatory and policy matters, and most of the searches and See Also operations I do within that database are pretty well focussed and useful. But I split out into a separate database all those specific technical references such as analytical procedure manuals, data evaluation and quality assurance protocols, etc. as that improves the utility of searches and See Also in both databases.

Once in a while, though, I have to look in both databases. Now, in DEVONthink 2, I can have both open and search across them, or limit searches to just one at a time. That’s a distinct improvement over DEVONthink 1 for me.

At the moment, I’m working in a new open database that deals with DEVONthink 2 pb1 issues that I come across in the forum or in Support messages. It’s a convenience to do that, and to be able to limit searches just to that database. It really facilitates handling Support issues.

I suspect I’ll end up building more topically “granular” databases, and take advantage of concurrently open databases to assemble them for specific purposes, like Lego blocks. For example, I’ve got a financial database that holds, grouped by year, banking and investment records, tax records, etc. Instead, I could have small, very quick databases designed differently, that can be searched individually, or in a collection for a specific purpose.

And I’ve already started handling projects as new databases. I can quickly pull in or refer to reference materials as needed from other databases.

sgmiller · December 21, 2008, 12:12am

I have a basic collection of about 200,000 items/9 gigs that I store in normal folders along with my DT databases. When I have a project, I use this store of information together with Internet/LEXIS searching as my “world.” I have been using FoxTrot for searching my own info store and never even considered transferring it to DT until now. Since 2.0 has new search features that are interesting to me (spelling in particular since FT has no “fuzzy” search), I am intrigued by the idea of dumping my files into DT but wonder, along with others, how well a single database that big would perform.

I assume that indexing is a better strategy for something like this? Not having the actual docs in the database must really cut down on its size and help its speed.

luix · December 21, 2008, 11:45am

kalisphoenix: I read in the forum that you are a frequent and heavy user of DTP and that you have now decided to use DEVONnote as well. I tested DEVONnote. It’s a nice little app. But I have no idea what to use DN for when I can open a whole bunch of DT databases.

Please tell me what you are using DEVONnote for.
Thank you.
Lutz

kewms · December 21, 2008, 5:00pm

I use DevonNote as a desktop scratchpad. It gets ideas, interesting clippings from the web, etc. I use it instead of DTP because its system footprint is much smaller, and also because the items don’t always fit into my existing databases. For instance, quick notes from a conversation with a potential client might go into DN. Some of these functions may be obsolete once the new Sorter is fully implemented, but for now DN is very useful.

Katherine

ndouglas · December 21, 2008, 10:43pm

I have several DTPO databases, two of which are concerned with a massive project that I’ll be working on for a couple decades. In the past, I had just one big database (since that was all I could have open). Unlike most people (from what I gather), 70-80% of the information I store is written by me. That leads to some problems, chief of which are Unfinished Documents.

A lot of the time I’ll start writing a document and go through and realize that half the terms I used need some explanation. So I’ll create new documents for those terms. By the time I have 1000-1500 words in a document (about an hour), I may have created 50 or more new documents. I might go through each of those documents and write a little (250-500 words) description just to be able to later remind myself of what I should write in that document when I have more time. That might mean 5-15 more new documents (did I mention that this is a very complex project?), so I create those documents.

The end result is that I’ll have one document that’s “complete” (1000-1500 words) and 100-200 fragments that are only notes for documents that need to be written later. This is an entire hierarchy that is just To Do stuff. I could have this information in a separate DTP database, but why? I won’t be importing any PDFs, pictures, or any other format requiring the built-in formats or QuickLook formats of DTP. Also, I frequently haven’t decided where the permanent home of this information might be. I might combine topics, split them apart, rewrite them, change names, and so on. The bottom line is that this information is Not Ready for Primetime, so I don’t want these things screwing up my concordance, making my lists of documents longer for no real reason, and so on.

I guess the bottom line is:

I use my DTPO databases as “final drafts,” in a way. I know that every document I have in those databases is in the right place, has a lot of useful information, is well-formatted, and so on. I know that if a word turns into a WikiLink, that it’s linking to a useful document.
I use my DEVONnote database for “notes” and “rough drafts.” It doesn’t matter how sloppily or stupidly I input information. I don’t have to worry about where it’s going to go. I can create thousands of documents and later delete half of them. I don’t have to worry about backing up the database because it doesn’t contain anything critical.

Another reason, if you want something logical and not based on my (completely irrational) workflow, is that I frequently do a lot of massive information management with DTPO. I’ll OCR a folder full of literally hundreds of PDFs, or import a large website, or perform some other action that ties it up completely, for hours at a time.

But I don’t want to stop working. Thankfully, DEVONnote is completely unaffected by DEVONthink’s rampaging processes, and I can type smoothly and swiftly in five or ten different documents, or in fullscreen, without having to wait impatiently for DEVONthink to finish importing the Encyclopedia Judaica or whatever.

(There are a couple of other reasons – for instance, two DEVONapps means two independent sets of labels [I’ve complained about having only seven labels and having them tied to the app rather than the database, but to no avail] – but these are probably minor for most people)

So using DN might not make sense at all for your workflow. I expect it would make more sense for someone using DTPO than someone using DT or even DTP, since the OCR functions can frequently take hours (the Oxford Latin Dictionary being a painful example). Simply using multiple databases might be fine for anyone who doesn’t have to screw around with OCR a lot.

luix · December 22, 2008, 8:38am

Thank you for your extensive explanations. I’m interested in how other people work with dtpo (and other tools I use).

I use a »note«-db too, but since the beta I use a DTPO DB for that. It’s cute and small, and I like to use as few apps as possible. But of course I don’t do much OCRing, so I never felt cramped by that.

JRPars · January 1, 2009, 7:35am

I use DTPO 2x for book research, fiction and nonfiction. Although I maintain a general writing database, in which I store research for shorter works, final drafts of shorter works, scans or PDFs of published short works, scans or PDFs of reviews of my books, jottings about plots, characters, settings, etc., income/royalty statements, etc., I use a separate database for each new book project, unless the book is part of a series, in which case the research for all books in the series is stored in a single database.

Prior to the release of DTPO 2x, I also used Together because I found it easier to grab short pieces of research, articles, etc., that I came across on the web and to jot short notes. I think the developer did an excellent job with the Together shelf. I would then export the gathered information (which automatically was imported into a DTPro smart folder in Together) to DTPO 1x on a regular basis. With the addition of a similar drop mechanism in DTPO 2x, I may move away from Together.

For writing, I haven’t found DTPO 1x or 2x to really work for me–UI preference vs. capability–and use Scrivener for its UI, adaptability, and its capability of storing relevant research at hand. (Prior to writing in Scrivener, I cull all primary research I’ll be using from what I’ve gathered in DTPO and import it into the research area of Scrivener’s binder.)

I know it’s a multi-step process, but it’s the one that works best for me. If I can cut out the Together mid-step now that DTPO 2x is here, it will be aces.

dylan · January 2, 2009, 12:58am

It seems that one of the big reasons users use more than one database is that it helps with searches, presumably by making them more accurate.

But since you can search within a certain group, is that really necessary? Wouldn’t a financial group, and the ability to search through just that group, accomplish much of what a financial database would do?

Bill_DeVille · January 2, 2009, 5:26am

Yes and no. Yes, one can restrict a search to a specific group within a database, so one could just search the financial group alone. But the choice is limited to either a single group or the entire database.

In DEVONthink 2, the Info panel allows one to exempt items from Classify, See Also, Search and/or Tagging. But as such exclusions are not inherited by children, and involve manually checking and un-checking exclusions for various purposes, that gets pretty cumbersome for trying to “isolate” some portion(s) of a database.

As a practical matter, I find value in being able to disaggregate (separate) various kinds of information, yet also find value for other purposes in aggregating (combining) information.

For about three years, in the early versions of DEVONthink, I was limited to a single database to hold everything. I was delighted by the ability to disaggregate information into separate databases in DEVONthink Pro 1.x. But then it was difficult to aggregate information from different databases.

The ability of DEVONthink Pro/Office 2 to manage multiple concurrently open databases and to search across one or the entire open collection means that I can enjoy the benefits of improved search and See Also focus, as well as database responsiveness, that come from segregating content into topically-designed databases. At the same time, aggregation of the information contained in several databases is now possible by opening multiple databases. If necessary, on the fly, I can easily merge information from several databases into a new one, perhaps for a temporary purpose, perhaps as a working redesign based on need.

Bottom line: If you are satisfied with a single database, great. If at any time you find it useful to develop more than one database, that can be done with DT Pro/Office. If you need to work with more than one database open, that’s now possible with DT Pro/Office 2.0.

dylan · January 2, 2009, 7:09am

Thanks Bill. That explanation helped a lot (I copied it and put it in DT for future reference). I know a lot of users have thousands of entries in their db. I’ve just started using DT so I have far fewer. I think I’ll do what you suggested: start out with Personal and, if the need arises, switch to Pro.

Thanks again.