Automating Going Paperless with DevonThink

Although, in a technical sense, groups and tags may be the same kind of entity, DevonThink treats them in different ways, so from the user’s point of view groups and tags behave very differently!

However, groups don’t store files.

DevonThink is very different from what you might think initially: the entries in a DevonThink database aren’t files, groups aren’t folders and replicants aren’t aliases.

The entries in DevonThink are references to files. The files themselves are (by default) stored in the Files.noindex folder inside the database (you can move them outside the database if you put their references in an indexed group, but this is rather tricky; try it only when you know what you are doing).

It is the references that are organized into groups, not the files themselves! For that reason, it is important to make a clear distinction between what happens to the file and what happens to the reference.

  • When you move an entry you move the reference, not the file.

  • When you replicate an entry (that is, a reference to a file) you create another reference to that file (not a reference to the original entry, as happens when aliasing!).

  • When you delete an entry you delete the reference; when you delete the last reference to a file in the group hierarchy (that is: when you delete an entry that has no replicants), the file is deleted too.

  • When you duplicate an entry you duplicate the file to which it refers and create a reference to that duplicate in the group hierarchy of the database.
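
The four cases above can be sketched as a small toy model. This is only an illustration of the reference idea described in this post; the class and method names (ToyDatabase, replicate and so on) are made up for the example and say nothing about how DevonThink is actually implemented.

```python
import itertools


class ToyDatabase:
    """Toy model (hypothetical): groups hold references, files live in a separate store."""

    def __init__(self):
        self.files = {}    # file id -> content (stands in for the stored files)
        self.groups = {}   # group name -> list of file ids (the entries/references)
        self._ids = itertools.count(1)

    def add(self, group, content):
        """Store a new file and put one reference to it in `group`."""
        file_id = next(self._ids)
        self.files[file_id] = content
        self.groups.setdefault(group, []).append(file_id)
        return file_id

    def move(self, file_id, src, dst):
        """Move the reference; the stored file is untouched."""
        self.groups[src].remove(file_id)
        self.groups.setdefault(dst, []).append(file_id)

    def replicate(self, file_id, dst):
        """Add a second reference to the same file."""
        self.groups.setdefault(dst, []).append(file_id)

    def duplicate(self, file_id, dst):
        """Copy the file itself and reference the copy from `dst`."""
        return self.add(dst, self.files[file_id])

    def delete(self, file_id, group):
        """Remove one reference; drop the file when no reference is left."""
        self.groups[group].remove(file_id)
        if not any(file_id in refs for refs in self.groups.values()):
            del self.files[file_id]


db = ToyDatabase()
article = db.add("birds", "Birds in Ireland")
db.replicate(article, "Ireland")       # two references, one file
copy = db.duplicate(article, "birds")  # a second, independent file

db.delete(article, "birds")
still_there = article in db.files      # a replicant remains in "Ireland"
db.delete(article, "Ireland")
gone = article not in db.files         # last reference deleted, file goes too
```

In this sketch the duplicate survives the deletion of the original, because it refers to its own copy of the file.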

From the user’s point of view, groups are collections of references: if you delete a group, those references will be removed from your database (and if an entry in that group has no replicants, the corresponding file will be removed too!).

Tags, on the other hand, are, from the user’s point of view, properties of documents: if you delete a tag, this tag will be removed from all the documents but nothing happens to their entries (references) in the hierarchy.

Replicants provide a way to refer to a file at more than one place in your group hierarchy.

Tags provide a way to characterize files in a manner that does not fit with your group hierarchy.

For example, my group hierarchy sorts my journal articles according to their topic. When I design a course (a course typically covers different topics, but does not use all the literature I have about each topic), I tag the literature I want to use with the tag for that course. In this way, I can use my group hierarchy and the see also function to decide which articles to use and quickly access the articles I selected by means of the tag without making any changes in my group hierarchy.
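
A minimal sketch of this setup, assuming made-up group names, article titles and a made-up course tag. It treats a tag as a property of the document and shows that deleting the tag leaves the group hierarchy as it was. It is a toy model, not DevonThink’s own data structure.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    title: str
    tags: set = field(default_factory=set)  # tags are a property of the document


def with_tag(groups, tag):
    """Titles of all articles carrying `tag`, regardless of their group."""
    return [a.title for articles in groups.values() for a in articles if tag in a.tags]


def delete_tag(groups, tag):
    """Strip the tag everywhere; group membership is not touched."""
    for articles in groups.values():
        for a in articles:
            a.tags.discard(tag)


def layout(groups):
    """Snapshot of which titles sit in which group."""
    return {name: [a.title for a in articles] for name, articles in groups.items()}


# Hypothetical subject hierarchy and course tag
groups = {
    "Topic A": [Article("Paper 1"), Article("Paper 2")],
    "Topic B": [Article("Paper 3")],
}
groups["Topic A"][0].tags.add("Course X")
groups["Topic B"][0].tags.add("Course X")

reading_list = with_tag(groups, "Course X")  # cuts across the subject groups
before = layout(groups)
delete_tag(groups, "Course X")
after = layout(groups)                       # same hierarchy as before
```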


Korm, I completely agree with the quote above. Actually, it expresses the main reason why I stress the difference between groups and tags. In my experience (when helping colleagues and friends) the developers’ view that groups are tags and tags are groups is very confusing, precisely because groups and tags behave very differently (this is also clear from the irony in edgley’s nice “Tags and groups are the same except …”).

I have no idea whether or not I describe a deeper reality. I don’t care. In my experience, however, my description is more useful to my colleagues and friends (and hopefully to some users of this forum) than the standard description suggesting that tags are groups, entries files, groups folders and replicants aliases. The latter idea provokes all sorts of unnecessary worries (‘am I deleting the original or the alias?’, ‘help, the hierarchy isn’t backed up’) and leads to badly designed databases (I have colleagues who can’t use DevonThink’s AI because they group according to the courses they teach, while their subject classification is done by tags) and to serious problems and file loss when indexing (remember the case of lprint174?).

Thanks to all for the time taken to help me understand the basics.

The problem that I have with this ‘use the group’ thinking is this:

I come across things I want to store, but don’t know what I might want to use them for some day in The Future, so I put them in a folder called Things for Some Use Some Day.

Things I come across that have an immediate use, go direct into the appropriate project folder (group).

I start a new project, and wonder what things I have collected might be of use for it.
So I either search using terms relevant to the new project, or have to look through the Things for Some Use Some Day folder, or look through all the other project folders to see if I can find anything of use.

But if I then have a use for something, and move it from the Things for Some Use Some Day folder, it won’t be there for me to see if I start another project that might need it.

Am I making this a lot harder than it needs to be?
Lol, wouldn’t be the first time.

When you don’t index and as long as your database isn’t corrupted, it is not relevant where exactly the files are stored, but it is very important to know 1) that they are stored somewhere in the database, 2) that the group hierarchy contains references to the files rather than the files themselves, and 3) that further details concerning the location of the files are irrelevant.

A quick look at this forum will show that when someone asks where DevonThink’s files are stored, one of the people from DevonTechnologies will quickly answer that they are in the Files.noindex folder. Apparently they don’t agree with you and me that this is irrelevant.

It’s worth pointing out that this is accurate from the point of view of some, perhaps even the majority, of users. However, over the years here I have observed many users using tags and groups in very ‘creative’ ways that don’t necessarily conform to this model. One way in particular: if tagging is turned on for all groups in the database, tags become a semi-automated system to replicate documents in the database. Tags are the collections of references for these users, and they file their documents accordingly using the keyboard instead of a mouse or trackpad.

What are the ‘things’ you’re talking about? Articles that might be of use the next time you buy a new computer? Reviews of books you might want to read? An article that sounds interesting for a paper you might want to write when all the papers you’re writing now are ready?

There are many ways to solve your problem. Here is one:

Suppose you collect all kinds of information concerning environmental education. Then you might group the articles according to their subject. You might have groups for different kinds of landscape, animals, plants, teaching methods and so on. Some articles are relevant to more than one group, in which case you put replicants in all the relevant groups (an article about Birds in Ireland will have references (replicants) in the ‘birds’ and in the ‘Ireland’ group).

New articles arrive in the inbox and are sorted with the help of the classify pane.

If you can’t see a structure in what you collect, leave everything in the inbox or in a ‘Things for Some Use Some Day’ group (if you do the latter, be sure to exclude it from classify in the info panel) and wait until you have at least a hundred things, after which you let DevonThink autogroup them.

When you start a new project, you decide which things you want to use and tag them ‘project 1’. You can find the things you need either by searching the database, or perhaps you have one or two groups in your hierarchy that contain most of the relevant entries (for example, if your project is to develop a program for a birding trip to Ireland, it makes sense to look for articles that are both in the Ireland and in the birds group). When you have found one or more very important articles, use ‘see also’ to find more.

For a second project you tag everything ‘project 2’ and so on.

No problem if there are articles that are tagged for more than one project.
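
As a rough sketch of the last few steps, assuming made-up article titles: groups are modelled as sets of references, the candidates for the birding trip are the entries found in both groups, and one article can carry several project tags. This only illustrates the idea and is not something DevonThink exposes in this form.

```python
# Hypothetical group contents: each group is a set of article titles (references).
groups = {
    "birds": {"Birds in Ireland", "Garden birds"},
    "Ireland": {"Birds in Ireland", "Irish bogs"},
}

# Articles referenced in both groups are the obvious candidates for the trip.
candidates = groups["birds"] & groups["Ireland"]

# Project tags: one article may carry several of them.
tags = {}
for title in candidates:
    tags.setdefault(title, set()).add("project 1")
tags.setdefault("Birds in Ireland", set()).add("project 2")


def in_project(project):
    """Sorted titles of all articles tagged with `project`."""
    return sorted(title for title, names in tags.items() if project in names)
```

Calling in_project("project 1") and in_project("project 2") both return the same article here, and the group contents are never changed by the tagging.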

If you like to work with project groups, rather than with subject groups and project tags, I recommend replicating (rather than moving) the items to project groups.

I myself strongly prefer subject groups and project tags over project groups because projects are usually too heterogeneous in content to make efficient use of DevonThink’s AI.

Thanks for the addition! I completely agree (and I have experimented myself with such deviant uses)! However, pointing to the possibility of creative uses is quite different from setting inexperienced users on the wrong footing by routinely claiming that groups are tags and tags are groups.

If you agree that groups are not necessarily collections of references and tags are not necessarily properties of documents, explain why it would put “inexperienced users on the wrong footing by routinely claiming that groups are tags and tags are groups”. I find it more helpful to teach new users by using the same terminology that they are going to find in the DEVONthink manual, even if there are some subtle differences in group/tag behavior. These small differences will not make any difference initially to a new user that is struggling to grasp how to use all the power that DEVONthink has. Contradicting the manual just adds more confusion.

I think I am over guessing what the AI can do to help.
I did write out a long post as to how I think I should now do things based on all your help, but I still didn’t quite get it.

Here is what I currently have:

RSS Stories
From various feeds, about lots of different things. Am converting stories I want to keep to pdf, adding tags like I would keyword a photo, then moving to a group called RSS Keep

Scanned Paperwork
Not very much. Was just going to tag and dump in a group called Scanned

Manuals
From software to things like my camera and telescope, pdf

Tutorials
For the same range as manuals, but video and pdf

Things I need to
Watch / Read / Listen / Examine / Download / Buy
These are either found directly, or are a variation on something already in DT

Things I might need
stories from RSS / webpages / images / video / anything I want to dump
No idea what they are for, or what they might be, just that I might want to use it one day

And here is what I am trying to achieve:

A place to store my paperwork on the slim chance I need it
A place to store manuals // tutorials so I can browse through them all
A place to store things that I like, no real use for them
A place to store things that I might need, for what is not known yet

And here is what I want to use all this stuff for:

I am developing a video game.
This needs a huge mass of information, ranging from stuff I find, to stuff I create.
When I find stuff, I will either need the whole thing (like an image) or a reference to one part of it, a single bit of information (but I might refer to it again somewhere else).
There will be notes I create detailing the specifics of the game as well.

There are currently two game concepts I am investigating, so there will be two of these mass information pots.

However, as well as information specific to the game, there is information that is specific to designing a game; creating levels, learning software, etc

I am planning on using Curio as the creative space for each concept and DT as the black hole, with built in spot-light :slight_smile:

Thanks again for helping me get all this, tis really appreciated.


I see the point about not contradicting the manual, and I guess that one of the things I wanted to say when I objected to routinely putting inexperienced users on the wrong footing is that the ‘groups are tags and tags are groups’ mantra shouldn’t be in the manual.

In my experience, and in that of the friends and colleagues I have helped with DevonThink, the differences in behavior between tags and groups are not minor (think also of all the group-or-tag discussions; they would not make sense if the differences were really minor), and the decision about what to do with groups and what with tags is crucial to the efficiency of a database (the colleague who used project groups and subject tags reorganized his database when he discovered how efficient DevonThink’s AI can be when the references are grouped according to subject rather than project).

So explaining that for most purposes groups should be seen as collections of references and tags as properties of documents (perhaps with a footnote saying that technically groups and tags are the same) would be much more useful for them than the standard ‘groups are tags and tags are groups’ tune.

  • you cannot use DevonThink’s AI to tag documents
  • you cannot move/duplicate references by typing
  • you run the risk of accidentally deleting files when deleting a group but not when deleting a tag
  • if a document has more than one tag these are not treated as replicants, neither is the reference of a document in a tag group treated as a replicant of that document in a non-tag group (for instance, if you delete a reference to a document that has no replicants it will be deleted, no matter how many tags it has!)
  • tags cannot be indexed

These are important behavioral differences and these differences make sense if and only if groups are seen as collections of references and tags as properties of documents. That the developer chose to implement groups and tags by means of the same kind of entity means good luck for creative users, but it is no reason to confuse inexperienced users by saying that groups and tags behave the same.

Tags and groups can behave the same, and in fact the default behavior for a new database is that they are the same. If groups have tagging enabled, everything that you just posted above about tag behavior is wrong. I can have a thousand tags in a database, and have none of them appear under the ‘Tags’ group.

The argument can be twisted around and around.
The fact seems to be that there are differences between groups and tags, so saying they are the same is wrong; similar would be better.

However, as The New User, boy, it’s confusing! Even experienced users cannot seem to agree. If there wasn’t a chance of losing a document (by removing a group) I don’t think it would matter so much. This seems so important to me that it’s now become the central thing to base all other decisions on.

Is it your point that groups (or rather group names) can be made to behave like tags by disabling ‘exclude groups from tagging’ in the database properties?

If so, I would like to point out that

  • In the current version of DevonThink (2.3) the ‘exclude groups from tagging’ option is by default enabled, meaning that new users will experience the differences I describe.
  • Disabling this option does not eliminate all differences between groups and tags. For example, as you say, the groups (perhaps they should be called ‘group tags’ if ‘exclude groups from tagging’ is disabled) do not appear in the Tags group, unlike the non-group tags, and, more importantly, you still cannot use non-group tags for AI.

Disabling ‘exclude groups from tagging’ is useful in several cases, including:

  • you prefer tags over groups and want to avail yourself of DevonThink’s AI for tagging
  • you are looking for a semi-automated system to replicate documents

There are, however, many setups in which it is convenient to have groups and tags behave differently (IMHO having tags and groups behave differently is more intuitive than having them behave in the same way), and in this case it is helpful to see groups as collections of references and tags as properties of documents.

Perhaps we can agree that to new users it is best to explain that in the default case groups should be seen as collections of references and tags as properties of documents, but that this can be changed by disabling ‘exclude groups from tagging’, instead of unhelpfully repeating that contrary to what they experience, groups are tags and tags are groups?

You have adopted a paradigm that “groups should be seen as collections of references and tags as properties of documents” and personally I agree with the concept 100%. However, that doesn’t change how tags and groups work, nor does it necessitate that group tags be turned off (or on) for a particular use. The user can still assign a document to a collection with a group (gray) tag, and the user can still assign a property to a document via a tag (blue) tag. The tag (blue) tags will appear under the Tags group, and the group (gray) tags will appear under their respective groups. However all tags (blue and gray) will be displayed in the Tags view (command-6) and all tags can be assigned to a document via the tag bar. This is why Christian states that groups are tags and tags are groups.

Again, it may very well be useful to think of them differently (collections/properties) for the configuration and optimization of the database. I believe we are in complete agreement on that. But with respect to the underlying mechanics of tags and groups in DEVONthink, they are more similar than different. And with that thought, I should move on and get some of my own work done.

I am not sure that I have understood the first sentence (English is my second language and I have trouble understanding colloquial expressions).

Are you saying that you do not need DevonThink’s AI? If so, consider not using DevonThink at all. Consider using a single folder on your hard disk as your black hole with DevonSphere Express as a better spotlight.

For me it is important that the stuff I currently don’t use is organized into a subject hierarchy because that makes it easier for me to find the relevant stuff when I start a new project. I want DevonThink to do the grouping and I want to determine what belongs to which project myself (with help of DevonThink), and for that reason I use tags for projects (I could also have used groups excluded from classification, search and see also, but using tags seemed easier).

I have the impression that for you the organization of the stuff not yet or no longer in use doesn’t matter (as long as you can find it by searching). In that case you could collect everything in a big ‘My stuff’ group and replicate the things you need to the relevant project group (by replicating rather than moving the items from the ‘My stuff’ group to the project group you prevent accidentally deleting files when you delete a project group, because there is always a replicant left in the ‘My stuff’ group).

If you do want to avail yourself of DevonThink’s AI for tagging consider using groups for tagging the files in the ‘My stuff’ group. This is easiest if you disable the ‘exclude groups from tagging’ option of your database. If you do so, the names of groups can be handled in the same way as tags. When you tag an entry with such a group tag, it is replicated to the corresponding group.

If you use groups as tags it is handy to have a separate group (say ‘group tags’) with subgroups that serve as the tags (separate from the project groups). It might also be handy to exclude the project groups from classification, search and see also.
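
To make the effect of the option concrete, here is a toy model of the behaviour as described in this thread. The class, its flag and the group names are hypothetical and only mirror the ‘exclude groups from tagging’ setting; in particular, what happens in the default case when a tag has the same name as a group is an assumption of the sketch, not a statement about DevonThink’s internals.

```python
class ToyDatabase:
    """Toy model (hypothetical) of the 'exclude groups from tagging' setting."""

    def __init__(self, exclude_groups_from_tagging=True):
        self.exclude_groups_from_tagging = exclude_groups_from_tagging
        self.groups = {}   # group name -> set of entry names (references)
        self.tags = {}     # entry name -> set of ordinary tags

    def add_group(self, name):
        self.groups.setdefault(name, set())

    def add(self, entry, group):
        """Put a reference to `entry` in `group`."""
        self.groups.setdefault(group, set()).add(entry)
        self.tags.setdefault(entry, set())

    def tag(self, entry, name):
        if name in self.groups and not self.exclude_groups_from_tagging:
            # Group name used as a tag: the entry gets a replicant in that group.
            self.groups[name].add(entry)
        else:
            # Ordinary tag: just a property of the entry (assumed default case).
            self.tags[entry].add(name)


def build(exclude):
    db = ToyDatabase(exclude_groups_from_tagging=exclude)
    db.add_group("Concept 1")
    db.add("Level design notes", "My stuff")
    db.tag("Level design notes", "Concept 1")
    return db


default_db = build(exclude=True)     # option enabled: plain tag only
group_tag_db = build(exclude=False)  # option disabled: entry is replicated
```

In the second database the entry ends up in both ‘My stuff’ and ‘Concept 1’, which is the semi-automated replication mentioned earlier in the thread.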

@Greg_Jones: I understand the mechanism and I understand why the developer says that groups are tags and tags are groups. I believe also that we are in complete agreement on what happens, as well as on the mechanism.

My point is a point about what to explain to new and inexperienced users. I think it is not very helpful to tell those users (in the manual or in the forum) that groups are tags and tags are groups or to say to them that for all practical purposes groups and tags behave in the same way, when they see before their eyes that tags and groups behave differently (as Edgley puts it: “The fact seems to be that there are differences between groups and tags, so saying they are the same is wrong”) and when it is important for the design of their databases to understand the differences or, perhaps, to minimize the differences by disabling ‘exclude groups from tagging’.

I believe it is even less helpful to dismiss the ‘groups are collections, tags are properties’ model (which is often a useful model but, as you point out, it doesn’t describe the mechanism, and other models can be useful too) as too technical and to canonize the ‘groups are tags and tags are groups’ view as the correct user experience (as Korm did).

Of course, my opinion on what is helpful and what is not, is based on my very limited experience (some friends and colleagues and some discussions in this forum) and your mileage may vary.

Sorry for being unclear, I get told that by people who speak better English than I :slight_smile:

I do wish to use the AI, that is the main reason for going for DT.
I have taken the plunge and have started a group way of thinking.

It means I have lots of folders called things like eBook and images, but we shall see. At least it’s not a major pain to have to change things around.

I shall try using smart groups in the far left window for projects so I can get some visual separation from the actual data. If that doesn’t work then something else will be tried.

Thanks all again.