Script: Add tags to records depending on their content

Smart rules can automatically tag records depending on their content. However, that approach requires one smart rule per tag. A more flexible approach that needs far fewer smart rules is a script like the following (JavaScript!):

/* Define a mapping between tags and search terms.
   Each key is a tag and must be a string;
   each value is a search term and can be a string or a regular expression.
   If the plain text of a DEVONthink record matches the search term,
   either as a regular expression or as a string, the corresponding tag is added to the record.
*/

const tagToSearchTerm = {
  /* Set tag "Insurance" if "Allianz" is found in the text */
  "Insurance": "Allianz", 
  /* Set Tag "Phone" if "Telekom", "Telecom", "AT&T" or "Bell" is found in the text */
  "Phone": /Tele[kc]om|AT&T|Bell/,   
  /* Set tag "Bank" if "Deutsche Bank", "UBS" or "ING" is found in the text */
  "Bank": /Deutsche\s+Bank|UBS|ING/i,
};

(() => { /* SMART RULE USAGE: replace this line with 'function performsmartrule(records) {' */
  const app = Application("DEVONthink 3")
  const records = app.selectedRecords(); /* SMART RULE USAGE: remove this line */
  /* Loop over all selected records.  */
  records.forEach(r => {
    /* Get the record's plain text */
    const txt = r.plainText();
    /* Ignore records without a plainText */
    if (!txt || txt.length === 0) return;
    const newTags = [];
    /* Loop over all elements of tagToSearchTerm */
    for (const [t, search] of Object.entries(tagToSearchTerm)) {
      /* Check if the search term matches the current record's plainText. Depending
         on the type of search term (RE or String), 'match' or 'indexOf' is used */
      const match = search instanceof RegExp ? txt.match(search) : txt.indexOf(search) > -1;
      /* If the search term matches, add the tag to the list of new tags */
      if (match) {
        newTags.push(t);
      }
    }
    // newTags contains tags for all search terms found in the current record
    // add them to the current record's tags
    r.tags = [...r.tags(), ...newTags];
  })
})() /* SMART RULE USAGE: replace this line with '}' */

It relies on a mapping between tags and search terms (defined at the top of the script). The search term can be a simple string or a regular expression. If it’s found in a record, the corresponding tag is added to this record.

Note 1: This is but a simple example. Ideally, it would be possible to define several tags for each search term. For this, a more complicated mapping would be required.
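One possible shape for such a mapping, as a rough sketch only: a list of rules in which every search term carries an array of tags. The names rules and tagsForText, and the extra tags "Finance" and "Utilities", are made up for illustration and are not part of the script above.

```javascript
/* Hypothetical rule list: each search term maps to one or more tags */
const rules = [
  { search: "Allianz", tags: ["Insurance", "Finance"] },
  { search: "Generali", tags: ["Insurance"] },
  { search: /Tele[kc]om|AT&T/, tags: ["Phone", "Utilities"] },
];

/* Return the de-duplicated tags of all rules whose search term occurs in txt.
   Assumes the regular expressions do not use the "g" flag, because
   RegExp.prototype.test keeps state between calls for global expressions. */
function tagsForText(txt, ruleList) {
  const found = new Set();
  for (const { search, tags } of ruleList) {
    const hit = search instanceof RegExp ? search.test(txt) : txt.includes(search);
    if (hit) tags.forEach(t => found.add(t));
  }
  return [...found];
}
```

Inside the loop over the records, the inner for loop could then be replaced by something like r.tags = [...r.tags(), ...tagsForText(txt, rules)].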

Note 2: As it stands, the script works on the currently selected records. To make it usable in a smart rule, one has to

  • replace the line (() => { at the beginning of the script with
    function performsmartrule(records) {
  • remove the line const records = app.selectedRecords();
  • replace the line })() at the end of the script with }
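With these changes applied, the smart rule variant might look roughly like the following sketch. It is shortened to two tags and uses filter/map instead of the explicit loop, which is a stylistic choice and not required. The app constant is left out here on the assumption that nothing else in the script needs it.

```javascript
const tagToSearchTerm = {
  "Insurance": "Allianz",
  "Phone": /Tele[kc]om|AT&T|Bell/,
};

/* DEVONthink calls this function with the records the smart rule matched */
function performsmartrule(records) {
  records.forEach(r => {
    const txt = r.plainText();
    /* Ignore records without plain text */
    if (!txt) return;
    /* Collect the tags whose search term occurs in the text */
    const newTags = Object.entries(tagToSearchTerm)
      .filter(([, search]) =>
        search instanceof RegExp ? search.test(txt) : txt.includes(search))
      .map(([tag]) => tag);
    /* Only touch the record if at least one search term matched */
    if (newTags.length > 0) r.tags = [...r.tags(), ...newTags];
  });
}
```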

I do similar processing using AppleScript.
It works well and is easier to code, but there is no support for regular expressions.

edit: I admit “easier to code” is a subjective personal opinion from an English-speaking senior

I find the regular expression aspect particularly interesting and promising. Thanks for sharing :pray:

You can actually use regular expressions in AS, but you have to reach out into Objective-C land. @pete31 has posted some scripts doing that. It’s a bit more convoluted than in JavaScript, but possible.

I tend to use REs over strings for this kind of task because simple strings are often not discerning enough. In the sample script, “Allianz” would match the insurance company, but also something like “Allianz für den Fortschritt”, which shouldn’t be tagged as “Insurance”. Of course, the REs given in the sample script mostly just match simple strings, too. They’d have to be amended to be useful in real life.
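To give an idea of what such amendments could look like, here is a sketch with word boundaries and a negative lookahead. The name strictSearchTerms is made up, and dropping the i flag from the “Bank” expression is my assumption: with it, the unanchored ING also matches the “ing” in words like “Meeting”.

```javascript
/* The "Bank" search term from the sample script: matches far too much */
const looseBank = /Deutsche\s+Bank|UBS|ING/i;

/* Hypothetical stricter variants of the sample search terms */
const strictSearchTerms = {
  /* "Allianz" as a whole word, but not when followed by "für" */
  "Insurance": /\bAllianz\b(?!\s+für)/,
  /* Whole words only, so "Bellevue" no longer counts as "Bell" */
  "Phone": /\b(?:Tele[kc]om|AT&T|Bell)\b/,
  /* Whole words and case-sensitive, so "Meeting" no longer counts as "ING" */
  "Bank": /\bDeutsche\s+Bank\b|\b(?:UBS|ING)\b/,
};
```

The price of the case-sensitive “Bank” expression is that a lower-case “deutsche bank” is no longer found, so which variant is better depends on the documents at hand.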

Opinions ahead

Well… not so much for me. While JS looks a bit too terse in many situations (and my wish to type as little as possible makes it look even terser), it is a lot more powerful than what AS has to offer. “Easy” might mean that the perceived learning curve is less steep, and I agree with this for people who are sufficiently fluent in English. It probably gets a bit less “easy” for people coming from languages with a different grammar.

While that is of course a matter of taste, other things are not: AS lacks a strict grammar. There’s nowhere I can go to look up comprehensive and clear (as in grammar) rules for it. And of course, it hasn’t evolved in the last decades. So, compared to other offerings (and I’m not only talking about JS: there’s Python, Ruby etc.), it is sorely lacking in structure and comfort. There are no REs, there are no date functions to speak of, no localization/internationalization (except implicitly and behind your back), no array methods, no clear concept of objects, no introspection, not even a way to get the “keys” in a record.

Yes, it might be “easy” if you’re willing to write a lot of code. In the same sense as Basic and FORTRAN are “easy”.