Keywords on Bookmarks?

I have a database that is mostly full of bookmarks (Web internet location). The bookmarks were created by dragging and dropping a URL from the browser’s URL field into the database Inbox, from multiple browsers (DT Browser, Safari, Chrome, DuckDuckGo…). Many of those bookmarks have keywords that show up under the Properties tab. From that tab I can use the Convert Keywords to Tags action, and it works just fine. However, I’d like to do that automatically or as a batch process, and I can’t figure out how to make that work. I tried the Smart Rule that I found in the Forum, but that rule doesn’t return any hits… it’s like it can’t see the keywords. Help?

That is the search I tried, replacing Kind is PDF with Kind is Bookmark

The keywords should belong to the metadata property of the record. If that’s the case, a script can do what you want. Something like

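(() => {
  const app = Application("DEVONthink 3");
  // Only process the bookmarks among the selected records
  const records = app.selectedRecords().filter(r => r.type() === 'bookmark');
  records.forEach(r => {
    const metadata = r.metaData();
    if (metadata && 'kMDItemKeywords' in metadata) {
      // Work on a local copy: pushing onto r.tags() would not update the record
      const tags = r.tags();
      metadata.kMDItemKeywords.forEach(k => {
        if (!tags.includes(k)) tags.push(k);
      });
      // Assign the merged list back to the record
      r.tags = tags;
    }
  })
})()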

Notes:

  • The code is in JavaScript and operates on the currently selected records
  • It ignores anything that is not a bookmark
  • It’s not a good example for legibility and should be improved
  • It has not been tested at all. Use at your own risk and try it first with copies of your data.

Actual URLs would be helpful. There aren’t keywords on all sites.

https://www.youtube.com/watch?v=KhAcha34OT4 is one example.

I’m talking to development about a possible bug in automating this.

The information shown in the Properties inspector is retrieved live from a previewed bookmark. But the bookmark item on its own doesn’t have any metadata.

In that case, my script is useless. One could get at the HTML from the bookmark and extract the keywords from it, though. By scripting, I mean.
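As a rough illustration of the "extract the keywords from the HTML" idea, here is a minimal sketch in plain JavaScript. The function name extractKeywords is hypothetical and not part of DEVONthink's scripting dictionary, and the sketch assumes the page's HTML is already available as a string (how to download it is not shown). It uses a simple regular expression rather than a real HTML parser, so it does not decode entities such as &amp; and will miss unusual markup.

```javascript
// Hypothetical helper (an assumption for illustration, not a DEVONthink API).
// Returns the entries of the first <meta name="keywords"> tag as an array.
function extractKeywords(html) {
  // Collect all <meta ...> tags, then inspect their attributes one by one.
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    // Skip meta tags that are not the keywords tag
    if (!/\bname\s*=\s*["']keywords["']/i.test(tag)) continue;
    // Accept double- or single-quoted content attributes
    const content = tag.match(/\bcontent\s*=\s*(?:"([^"]*)"|'([^']*)')/i);
    if (!content) continue;
    const raw = content[1] !== undefined ? content[1] : content[2];
    // Split on commas, trim whitespace, drop empty entries
    return raw.split(',').map(k => k.trim()).filter(k => k.length > 0);
  }
  return []; // no keywords meta tag found
}
```

Pages without a keywords meta tag simply yield an empty array, so the caller can skip them.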

That website has the following keywords:

lego moc,how to build a good lego moc,lego city moc,lego star wars,lego alternate build,lego mario,lego 2023,lego stop motion,bearded brix,star wars moc lego,star wars moc build,star wars moc ideas,lego moc ideas,lego moc idea generator,m&r productions,shytimeismytime,brickmania,lego ww2 moc,lego tutorials,lego creator,lego tutorials on how to build,lego tutorials easy,lego moc tips,lego moc building,lego moc star wars,lego moc tutorial,star wars

You really want such SEO spam to be converted to tags in your database instead of reducing it to lego, moc, starwars?

In that case, here’s a script that does it (and this time it is tested, at least with this bookmark). But I strongly suggest not going down that rabbit hole: people stuff all kinds of things into their keywords for SEO purposes (out of ignorance, Google certainly doesn’t like this spam), and you end up with an unmanageable and useless number of tags.

Here goes:

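(() => {
  const app = Application("DEVONthink 3");
  const bookmarks = app.selectedRecords().filter(r => r.type() === 'bookmark');
  bookmarks.forEach(b => {
    // Download the page as a temporary record in the trash to read its metadata
    const r = app.createWebDocumentFrom(b.url(), {in: app.currentDatabase.trashGroup()});
    const metadata = r.metaData();
    if (metadata && 'kMDItemKeywords' in metadata) {
      const tags = b.tags();
      // kMDItemKeywords is a comma-separated string here, not an array
      metadata.kMDItemKeywords.split(',').forEach(k => {
        if (!tags.includes(k)) {
          tags.push(k);
        }
      });
      // Write the merged list back once, after all keywords have been checked
      b.tags = tags;
    }
  })
})()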

Since we’re approaching Xmas, I’ll provide some explanations, too:

  • The script loops over all currently selected records and weeds out anything that is not a bookmark (filter( r => r.type() === 'bookmark'))
  • It then loops over these bookmarks (bookmarks.forEach(b => {…}))
  • In every iteration, it gets the HTML document for the bookmark and “saves” it in the current database’s trash. Thus, it doesn’t clutter the database itself (and I was quite surprised that it’s possible to create a record in the trash :wink: ). This new document is r in the script.
  • To make things easier, the code then saves the bookmark’s current tags in a local variable tags.
  • The script gets the metaData property of r and checks if it contains a kMDItemKeywords entry
  • This entry is not an array (“list” for AppleScript aficionados) but a string. So, the code uses split(',') to convert that string into an array.
  • It then loops over the array elements (which are the keywords) with forEach(k => …)
  • For each element, it checks if tags already contains that value (tags.includes(k))
  • If that condition is not met (!tags.includes(k)), it adds the keyword to the tags (tags.push(k)).
  • Finally, it updates the bookmark’s tags property with the content of the local tags array.
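
The merge step described in the list above can be tried outside DEVONthink. The following is only a sketch in plain JavaScript: mergeKeywordsIntoTags is a hypothetical name, and unlike the script it also trims whitespace and skips empty entries, which is an addition for illustration rather than what the tested script does.

```javascript
// Hypothetical helper (illustration only): merges a comma-separated keyword
// string into a list of existing tags without creating duplicates.
function mergeKeywordsIntoTags(existingTags, keywordString) {
  const tags = existingTags.slice(); // copy, so the caller's array is untouched
  keywordString.split(',').forEach(k => {
    const keyword = k.trim();        // assumption: surrounding spaces are noise
    if (keyword.length > 0 && !tags.includes(keyword)) {
      tags.push(keyword);
    }
  });
  return tags;
}
```

In the script, the result would correspond to the value assigned to the bookmark’s tags property.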

Edit: The original code used a different approach to set the tags which didn’t work as expected and was overly clever.

Note that this code is slow: downloading the HTML takes time, as does creating the throwaway record from it. Perhaps @cgrunenberg can come up with a better way to solve this problem.

Well, there will be a better way in a future release :wink:

Thank you for this (both the script and the explanation)! I very much appreciate it.

“You really want such SEO spam to be converted to tags in your database instead of reducing it to lego, moc, starwars?”

My problem is that I’ve built up a huge backlog: I probably have a few hundred links. Rather than go through and individually tag them all now, I would like to convert all the keywords to tags and then go down the tag list and delete all the tags I don’t want. Still a pain, but much less of one.

So, you want to automate the generation of a heap of tags that you then have to weed out manually? That sounds weird to me.

Take the script as a starting point. Modify it so that it figures out the three or four most frequently used words for each keyword list (like Lego, moc, Starwars in this example). Use those as tags. That’s what automation is for. Not for transforming one menial task into another one. Imo.
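
A minimal sketch of that modification, in plain JavaScript so it can be tried outside DEVONthink. The name topWords and the default of three words are assumptions made for illustration. The sketch counts single words, so it would report star and wars separately rather than starwars, and it does not filter filler words such as "how" or "to".

```javascript
// Hypothetical helper (illustration only): returns the most frequently used
// words of a comma-separated keyword string, most frequent first.
function topWords(keywordString, limit = 3) {
  const counts = new Map();
  // Split on commas and whitespace so "lego moc,lego city" yields single words
  for (const word of keywordString.toLowerCase().split(/[\s,]+/)) {
    if (word.length === 0) continue;
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1]) // highest count first
    .slice(0, limit)
    .map(([word]) => word);
}
```

In the script, the returned words would take the place of the full keyword list before the tags are merged.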