Tags and AI or Machine Learning

After using Ryan’s script, I have to say that it works pretty well.
I have also learned the following:

  • You can right-click any word in a document and add it as a tag. That is a useful feature.

  • Any word in the concordance list can also be clicked to add it to the list, which is especially handy because you can click and organize the words there.

So my workflow is the following:

  1. Use Ryan’s Script to get general tagging.
  2. Remove some of the Tags
  3. Review the Tags with Concordance, or, if I am reading the document rather than just saving it, Tag as I go.

Oh and of course I have a Smart rule to clear all the Empty Tags once in a while.

Just a tiny correction: while I might be its biggest advocate, the script was put together by DT staff! Credit where credit’s due.

And indeed, that’s roughly my workflow as well.

I am thinking about some other applications, though. For instance, I’ve recently been paying more attention to my blog, which has a particular focus (and therefore only really needs a certain collection of tags). It should be trivial—though tedious—to create a list of those tags and let the auto-tagging script identify and add them whenever I author a new post. That is, run the script as normal, but filter the options by a preset list such that only those preset tags are added if they come up.
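To make the idea concrete, here is a minimal sketch of the filtering step in Python. It is not the DT script itself: `filter_to_preset`, `BLOG_TAGS`, and the sample suggestions are all hypothetical names, and it assumes the auto-tagger hands back a plain list of candidate words.

```python
def filter_to_preset(candidate_tags, preset_tags):
    """Keep only candidates that match a preset tag, ignoring case."""
    # Map a case-folded form of each preset tag back to its preferred spelling.
    canonical = {tag.casefold(): tag for tag in preset_tags}
    kept = []
    for candidate in candidate_tags:
        match = canonical.get(candidate.casefold())
        # Skip unknown words and avoid adding the same tag twice.
        if match is not None and match not in kept:
            kept.append(match)
    return kept


# Hypothetical preset list for a blog with a narrow focus.
BLOG_TAGS = ["Education", "Productivity", "Writing"]

# Hypothetical output of an auto-tagging pass over a new post.
suggested = ["education", "coffee", "Writing", "deadline"]

print(filter_to_preset(suggested, BLOG_TAGS))  # ['Education', 'Writing']
```

The auto-tagging pass runs as normal; only the final "which tags get applied" step changes.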

Then, I’ve been thinking about a fun extension of that concept. Say I have a tag “Education”, and I write a lot of things about that topic, but a lot of what I write doesn’t feature the exact term “Education”. Instead, things like “University” and “Academy” and “Learning” come up.

It shouldn’t be difficult to develop a list of those related topics. Then, the auto-tagging script can use them to “seed” the Education tag whenever they come up. This way I don’t have a blog with a couple dozen education-related tags—I have just the one tag, but the system can automatically tag posts that relate even if that tag isn’t mentioned.
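A rough Python sketch of the seeding idea, assuming the post text is available as a plain string. `SEED_TERMS` and `seeded_tags` are hypothetical names, and the seed words are just the examples from above.

```python
import re

# Hypothetical mapping: one tag, many words that should trigger it.
SEED_TERMS = {
    "Education": {"education", "university", "academy", "learning"},
}


def seeded_tags(text, seed_terms=SEED_TERMS):
    """Return every tag whose seed words appear in the text."""
    # Split the text into lowercase words made of letters only.
    words = set(re.findall(r"[^\W\d_]+", text.casefold()))
    # A tag applies as soon as one of its seed words is present.
    return sorted(tag for tag, seeds in seed_terms.items() if words & seeds)


print(seeded_tags("I spent a year at the university."))  # ['Education']
print(seeded_tags("Notes on brewing coffee."))           # []
```

This only matches whole words exactly, so "universities" would not trigger the tag unless it is added to the seed list or some stemming is applied first.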

Very interesting and thanks for (drawing attention to) these scripts!

I would like to use the scripts to tag a database of PDFs but with a concordance that excludes all German words except nouns.

So far I’ve found that it’s possible to add words to ExcludedWords in /Users/user/Library/Preferences/com.devon-technologies.think3.plist and afterwards reload the plist with defaults read "/Users/user/Library/Preferences/com.devon-technologies.think3.plist" in Terminal.

Does anyone have links to German corpora of verbs, adjectives, etc.?

So far I’ve found that it’s possible to add words to ExcludedWords in /Users/user/Library/Preferences/com.devon-technologies.think3.plist and afterwards reload the plist with defaults read "/Users/user/Library/Preferences/com.devon-technologies.think3.plist" in Terminal.

This smells like trouble. I would see if @cgrunenberg has any advice on what you’re proposing.

It’s definitely not recommended.

I imagine you could put the Excluded Words within the script, though, such as by having DT skip them? It might make the script slow…
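As a sketch of what "within the script" could look like, here it is in Python rather than AppleScript, with a hypothetical `EXCLUDED_WORDS` list and a hypothetical `drop_excluded` helper. It assumes the script already has the candidate words as a list and leaves the preferences file alone.

```python
# Hypothetical exclusion list kept in the script, not in the plist.
EXCLUDED_WORDS = {"und", "oder", "gehen", "schnell"}


def drop_excluded(candidates, excluded=EXCLUDED_WORDS):
    """Return the candidate words that are not on the exclusion list."""
    # Fold case once so the comparison ignores capitalisation.
    excluded_folded = {word.casefold() for word in excluded}
    return [word for word in candidates if word.casefold() not in excluded_folded]


print(drop_excluded(["Haus", "und", "Gehen", "Garten"]))  # ['Haus', 'Garten']
```

In this form each check is a set lookup, so a long exclusion list should not add much time by itself. Whether the same holds inside an AppleScript depends on how the list is stored and searched there.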

I have a (probably dumb) question on this. I’ve been looking into tagging my files, as I can (now) see the importance of tagging a document with keywords that don’t appear in the text as a way of creating connections to related documents. However, what’s the advantage of tagging with words that already appear in the text of the document? I assume DT3 already makes these connections in See Also and Classify.

It’s a good question, and there’s some personal preference involved in how you want to approach auto-tagging.

I’ve written on another forum about this. My post is pasted below with some modifications for See Also:

Tags—even programmatically generated ones—offer a couple of things that are above and beyond [search queries and see also].
One is browsing. It’s cognitively easier to look at a set of tags as cues as you’re looking for something and to click one than to generate a query for the same set of features represented by that tag [or to have to select files to see See Also]. Instead of a tag, though, this could be implemented through some kind of saved smart search.
The second isn’t really related, but worth mentioning (even though it’s obvious)—a file can’t really be in multiple subfolders (with exceptions, e.g., aliases in Finder and replicants in DT) but it can have multiple tags.

  1. But one can browse groups too… And when you set your groups as tags, they become dynamic (alive) and automatic. Their hierarchical structure doesn’t collapse when you copy it to the other database or export to Finder

  2. These “exceptions” (replicants) work better than original tags: you can always see that this file is not the single instance, you can quickly understand how many other reps you have and where exactly.

Tags?

So, I support @RobH’s question: having Reps, Groups as Tags, Concordance and See also, why do we need ordinary tags? If one needs a “flat” view - just use smart folders and see flat list with dynamic (automatic) tags.
Really, I’m trying to understand the value of ordinary tags…

Concordance-based tags aren’t exactly the same as a smart group that looks for text or phrases, though I doubt the difference matters much at scale.

And these are automatically created, though I suppose the same could be done with smart groups or replicants somewhere.

Personal preference, I guess. I like being able to look at a set of tags about things important to me and tap certain topics just to explore the (automatically added) files inside.

I would agree with that. Additionally, tags are compatible with DEVONthink to Go, while smart groups, see also, etc. are not.

DTTG may be a good point though
Maybe automatic concordance and see also tags could be another way to see the data. Have to try it out.

Can AppleScript distinguish between ordinary tags and group tags?

Not having “See also” in DTTG seems to be the best reason (for me). I hate to admit it, but I haven’t really tried to use the “see also” function in DTTG, as I typically know where I need to go for certain info. Or I search. But desiring to access related documents and files, it makes sense to employ a tagging system, even if it’s simply using the groups as tags feature.

Returning to this thread to ask: @bluefrog / @cgrunenberg: any chance we’d ever see a See Also & Classify implementation that features tags?

I’ve shifted gears on how I’m organizing everything and tagging has become a lot more important to me. The auto-tagging script I used previously is now too noisy for this new approach. So, I’ve been manually “sorting” into tags (as advocated by others here!) but it’s a bit tedious. It’d be amazing if the “magic hat” of olde could help identify tags that include similar documents to the selected one. Essentially the same as Move To, but Add Tag instead. Manual selection is then augmented by DT’s intelligence—it wouldn’t be up to my tired brain alone to think of every related tag when I’m organizing the thirtieth document in a row.

A chance? Most things have a chance. :stuck_out_tongue:

However, Development would have to assess the feasibility of this.

Meaning that See Also & Classify should suggest tags too or that they should use tags to improve the results?

The first one! I imagine it working no different than how See Also & Classify suggests groups to sort into, only it suggests potential Tags to add to the selection.

There might not be an easy way to do the same one-click sort functionality—I suppose that would depend on what level of similarity score is sufficient for when a tag is a viable addition. But even without a one-click “add these tags”, seeing a set of potential tags with similar documents would be very helpful.
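To illustrate the kind of ranking I mean, here is a small Python sketch. It says nothing about how See Also & Classify actually works: the similarity scores, the `min_score` cutoff, and the `suggest_tags` function are all assumptions made up for the example.

```python
from collections import defaultdict


def suggest_tags(similar_docs, existing_tags=(), min_score=0.5):
    """Rank tags found on similar documents by summed similarity.

    similar_docs: iterable of (score, tags) pairs, where score is a
    hypothetical similarity value and tags is that document's tag list.
    """
    totals = defaultdict(float)
    for score, tags in similar_docs:
        # Ignore documents that are not similar enough to count.
        if score < min_score:
            continue
        for tag in tags:
            # Do not suggest tags the selection already has.
            if tag not in existing_tags:
                totals[tag] += score
    # Highest total first; ties are broken alphabetically.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


similar = [
    (0.9, ["Education", "Policy"]),
    (0.6, ["Education"]),
    (0.2, ["Cooking"]),  # below the cutoff, so it is ignored
]
for tag, total in suggest_tags(similar):
    print(tag, round(total, 2))
```

The cutoff is the part that decides whether a one-click "add these tags" is safe. Showing the ranked list and letting me pick avoids having to settle on a single threshold.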

Ryan,

Going back to my previous message: recommend the tags that are on similar documents. I tend to use your script as a first pass, then the concordance as a second pass. But the time it takes to properly tag documents is important.

I would like to vote for this too.

Auto-sorting is one of the great features of DEVONthink, but I don’t use it much anymore as I’m switching my documents to tagging.
As I understand it, folders and tags are internally identical, so apart from the question of usability it hopefully should “not be too hard” to implement? The future is tagging.

The future is tagging.

Perhaps, yours is. This is not a universal thing. Just something to bear in mind.
