Semi-organized chaos

DT Pro user here. I have a database containing nearly 5,000 documents and 4,000,000 words. This is where I drop any articles that relate to my interests or fields. Because I teach several subjects, that encompasses a lot of topics. The majority of these are PDF documents.

Currently I have this database semi-organized into folders by topic. The problem is that many documents are hard to categorize and may relate to several areas. I don’t even remember all that is stored here; when I am looking for something I check the folders first, and if that doesn’t turn up what I need, I search.

I am not a consistent tagger. Nor am I very diligent about getting things into folders. I do not often use the See Also function. I would call this database a semi-organized mess.

Question: is it better to folder-ize documents and replicate them to any other relevant folders, or to dump them all in a few common folders and use the search function to pull up what I’m looking for? Or should I train myself to tag everything? Or should I be using AI to group things better?

I realize everyone’s needs and preferences are different, but I would appreciate any opinions on whether the search function alone is good enough to find items in a large, varied database. I am tempted to ‘ungroup’ a lot of items, but this would be hard to undo, so I’m asking for advice before I take this step.

If there is a thread that answers this, could someone point it out to me? I did search through the forums first. Forum contributors here are so helpful and passionate about DT that I can’t help but believe that someone here can offer me sound advice.

Thanking you in advance,

If you haven’t read Joe Kissell’s “Take Control” e-books, Getting Started with DEVONthink 2 and Your Paperless Office, I’d recommend one or both of them. Many opinions expressed on the forum are reflected there. Another good read is Eric’s blog. Finally, read anything that Bill DeVille has posted in the forum about workflows using DEVONthink (here, and even more).

That said, I’ll give you my own opinions about some things. I regularly use a dozen or so databases the size of yours, or larger. Like Bill DeVille, I use subject- or project-oriented databases, except for one, Daybook, which is my everything-that-doesn’t-fit-elsewhere repository. Other users keep only a few databases, or just one. It depends on operating style more than anything, though size does matter in some key respects.

I prefer more folders, rather than fewer, and a “meaningful” hierarchy of groups, rather than a jumble. But I didn’t get there overnight, and my structure isn’t static. I’m regularly pruning, moving, regrouping, and ungrouping documents, because database structure for document repositories is usually emergent: it develops as your collection grows. So, don’t be concerned about getting the structure right from the get-go. It will never be “right” :wink:

Tag only if it helps you organize and locate data. Tagging can be an enormous investment, and there’s always the nagging thought about whether we’re doing enough, or the right thing, or too much or… I tend to tag things, sometimes, when they go into the database. But over time I’ve done less of this, because I’ve realized that all the hours I’ve spent tagging have never saved a second when it comes to searching and finding. That is partly because DT’s tag searching tools are just not very good, and mainly because, in my opinion, tagging isn’t a practical way to do taxonomy.

As for using the AI: absolutely. You’ll find the AI works better and better with more, rather than fewer, groups, and smaller, rather than larger, documents, because its pattern matching works better with more granular data. The AI does nothing with tags, which is another reason not to invest in that practice, IMO.

Search is your friend 8) I’m into the search tools all the time. First, the global Tools > Search panel for multi-database searches with a wide range of options. Combined with the previews, and using the View > Widescreen option, this panel operates almost like a database-of-databases. Second, the Find box on the toolbar, for searching inside a single database. Third, Smartgroups. In any search, I often find that the “fuzzy” option gives me better results than having the option off, especially when I’m not sure what the thing I’m looking for is named.

Ungroup when your ideas about structure change, then regroup. Keep at it when it makes sense, but not compulsively: never try to reorganize a whole database in one sitting unless you’re absolutely sure the investment in time has a definite payback :exclamation:

A final thought: Learn about DT’s scripts, and maybe even learn a little scripting. The scripts are useful for working with your data, or between DT and other apps, such as OmniFocus. For example, browse houthakker’s excellent library of scripted tools.

  • Work with the annotation templates.
  • Use replicants liberally.
  • Use duplicates only when you’re going to modify one instance (e.g., use PDF annotations) and you don’t want to affect all instances.
  • Use indexing when you have a need to maintain a folder/document structure for use outside of DT.
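
For anyone who thinks in code, the replicant/duplicate distinction in the list above can be pictured as a shared reference versus an independent copy. This is only a rough analogy in Python: the Document class and the group lists are made up for illustration and are not DEVONthink's scripting interface.

```python
import copy


class Document:
    """Hypothetical stand-in for a database item; not a DEVONthink API."""

    def __init__(self, name):
        self.name = name
        self.annotations = []


# Groups are modelled as plain lists (an assumption for illustration only).
teaching = []
research = []

article = Document("article.pdf")
teaching.append(article)

# "Replicant": the very same item filed in a second group.
research.append(article)

# "Duplicate": an independent copy that is free to diverge.
marked_up = copy.deepcopy(article)
marked_up.annotations.append("highlight on p. 3")

# An edit to the original shows up in every group that holds a replicant...
article.annotations.append("note: cite in lecture 2")
replicant_shares_edit = research[0].annotations == teaching[0].annotations

# ...but the duplicate keeps its own, separate annotations.
duplicate_is_independent = marked_up.annotations != article.annotations
```

In this picture, filing one article under several topics costs nothing and never goes out of sync, which is why replicants suit documents that belong in more than one group.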


Thank you - this is exactly what I needed to understand! I always have the feeling that I’m not using DT as well as I could; your reply gives me an idea how to organize things so I can find them more quickly. :slight_smile:

Much obliged.

This thread should be a “sticky” at the start of this sub-forum :smiley: