A clarification: I’m managing more than 150,000 documents among various DT Pro Office databases, but none of them holds more than 30,000 documents. Each database holds a “topical” collection that represents my needs and interests. My main database in fact contains many topics – many different scientific disciplines as well as legal and policy topics – that reflect my interest in environmental science, technology, law and policy matters.
Filing documents into specific groups in a database, the process of organizing the content, is something we humans like to do. It was necessary when we used physical file cabinets and categorized a sheet of paper into a specific file folder so that it could be retrieved quickly in the future.
But it’s quite feasible to compile a large DT Pro database that has no hierarchical file structure, no groups and subgroups. Some users have such databases. I tend to use some organizational structure for my own convenience, but at any time I’ve probably got thousands of documents that haven’t been “filed” and I’ll probably never bother to catch up.
DT Pro searches and AI features such as See Also and See Selected Text do not require hierarchical organization; they are effectively independent of organizational structure. In a few minutes I can compile lists of all the articles by a particular author that are in my database and all references to that material. If I’m viewing an article about how the populations of native species were affected by introduction of invasive species into an ecosystem, See Also will point out to me a “similar” article about how factors such as temperature, pressure or catalysts can affect chemical reaction equilibria. (Yes, there’s a wonderful relationship between those articles!)
DEVONthink can also help one manage and build on an organizational structure of groups and subgroups for classifying documents. Once one has developed structure in the database, the Classify AI routine can suggest one or more appropriate “filing” locations for new content. The larger the database and the more consistently one has maintained the organization, the better Classify will perform. There’s even an auto-group routine that will suggest groupings for currently ungrouped items, based on contextual relationships. (I tend to view this kind of assistance as coddling the human craving to impose structure on things. The database doesn’t care.)
Other than some attention to organization, I rarely tag the content of an existing database, or tag new content when it is added. There is one activity, however, in which I often use keyword, hyperlink or annotation tags: a new research or writing project.
The first thing I do when starting a project is to create a group for it, with subgroups for notes, drafts and reference materials. That helps me organize the project and gives me cues as to where the various elements I’ve pigeonholed can be found.
I’ll replicate or duplicate interesting references into this project. If I want to add keywords to the content or the Comment field of a document, I will do so on a duplicate (not on a replicant) of a reference document, as I don’t want to mess up the original. Anything that would alter the original, such as highlighting in a rich text document or PDF, is done on a duplicate in the project. I may use temporary tags such as Label or State markings during the project. These are searchable and can be very convenient. For example, when I’m searching the database for potentially useful references, or reviewing See Also suggestions, I may assign a label color to the items that look promising. When I’m drafting sections of a report I may assign a State that indicates whether the section is finished or not. But when I’m finished with the project I’ll erase all those Label and State markings so that I can start the next project with a clean slate. When the project is completed I’ll generally spin off the project files as a separate database for historical use, leaving only the finished article or report in my main database.
But these are just my personal habits (or eccentricities); DEVONthink is flexible enough to accommodate other habits by other users.