Adding URL metadata to files - Meta Woes

By chance, does anyone have a script they can share that will take the URL from DEVONthink and actually embed it into the formatted note that it’s associated with? I don’t necessarily care whether the metadata is written to the kMDItemWhereFroms or kMDItemURL fields of the formatted note.

By way of background, I’m a DEVONthink newbie who prefers to capture web clippings as formatted notes. But it appears that DEVONthink does not embed the website’s URL (where it was clipped from) in the formatted note’s metadata. Instead, it’s recorded only in the associated DT3 file (in the kMDItemURL field).

Naturally, for those of us who prefer to use DEVONthink with indexed/external folders, this can cause a problem. For example, if you accidentally sever the link between the DT3 file and the formatted note, the URL may get lost. Also, if you use Alfred or other apps that can search for this metadata, you won’t be able to find the formatted note directly by its URL (you’ll find the DT3 file instead). While I can understand why DEVONthink needs to keep a lot of its metadata separate from the files it’s associated with, I don’t understand why the URL field is one of them (particularly when coming from DEVONthink’s web clipper).

In any case, if someone has a script they can share that deals with this issue (or even some experience using the xattr command, for example), I’d greatly appreciate it. Thanks a ton!

Or you could learn to script… :thinking::stuck_out_tongue:

The URL is captured with the clipping from the browser, as clearly seen here…

image

(in the kMDItemURL field).

Where are you seeing this?

Yes, it’s possible to script with xattr but it’s also unclear in what way this would be useful in practice.

Also, if this was a non-AppleScripted solution, Development would have to assess the feasibility and broader desire for such a feature.

Perhaps I didn’t do a good job of explaining myself above. I am not disputing whether DEVONthink is capturing the URL. As indicated above, I understand that it’s recorded in the kMDItemURL field of the DT3 file associated with the formatted note. The question was whether someone has a script to embed that URL in the formatted note, too (or another related field, like the kMDItemWhereFroms field)?


To illustrate the problem, I have clipped your post from this page as a formatted note. Then, I moved the formatted note to an external/indexed folder/group called “Articles” (Database: Research, Group: Articles). In the screenshot below, you will find an image of the note, including its inspector pane.

Next, you will find a screenshot of the metadata that’s accessible through Finder’s file information window of the formatted note - which does not contain the URL.

If you’d like to see a more detailed rundown of the metadata associated with the formatted note, please see the PDF below.

Metadata from Formatted Note.pdf (20.1 KB)


Based on this setup, since the URL is not embedded in any of the formatted note’s metadata, when I search for formatted notes using Spotlight metadata, I can’t find the note unless I use DEVONthink.

For example, if I use Alfred to search for the formatted note based on all available metadata fields - such as kMDItemWhereFroms, kMDItemURL, and every other field where you might possibly store this info - and I use the search terms “devontechnologies.com”, I can’t find the formatted note that I’m looking for.

I’d like to see the URL embedded directly into the formatted note. Doing so ensures I can use other search tools to find what I’m looking for outside of DEVONthink. It also helps guard against potential issues that might occur if I make a mistake and accidentally sever the tie between the formatted note and its metadata in DEVONthink. And, if I share the formatted note, others can actually see the URL in the file’s metadata, too.

Given how little I understand about DEVONthink, I’m sure there are use cases where it doesn’t make sense to embed the URL in a formatted note. In this particular scenario, however, where the formatted note is being created from a web clipping and the user has appropriately moved it to an external folder/group that’s indexed, it’s hard to imagine why users wouldn’t want to embed the URL in the formatted note.

As an aside, I threw out the xattr command in my initial post because I have used it in the past - through Terminal - to write a URL into the kMDItemWhereFroms field of a PDF (long before I started using DEVONthink). So, I assumed that it could be used in a shell script via DEVONthink to write the URL field in DEVONthink to the formatted note itself (or whatever format people use). But I’m pretty terrible with shell commands, and was having a tough time understanding how I could get the appropriate url and file path to even run it. In any event, I’m up for any type of solution. Has anyone tackled this - or a related metadata issue - using xattr or another tool in DEVONthink?

Thanks for your help!


In case it’s helpful, below you will find one of my many failed attempts to test whether I could write DEVONthink’s URL info to a selected record’s kMDItemWhereFroms field, using the xattr command:

tell application id "DNtp"
	set theSelection to the selection
	repeat with theRecord in theSelection
		set theFilePath to the path of theRecord as string
		set devonthinkURL to URL of theRecord as string
		do shell script "xattr -w com.apple.metadata:kMDItemWhereFroms " & devonthinkURL & " " & theFilePath
	end repeat
end tell

Am I even in the ballpark with this one? I suspect that I’m screwing up the file path portion?

Thanks again!

OK, so I finally figured out the stupid file path / quotes problem, and here’s a working version of the script for others that might want to do the same thing:

tell application id "DNtp"
	activate
	set theSelection to the selection
	repeat with theRecord in theSelection
		set devonthinkURL to (quoted form of (URL of theRecord as string))
		set theFilePath to (quoted form of (path of theRecord as string))
		do shell script "xattr -w com.apple.metadata:kMDItemWhereFroms " & devonthinkURL & space & theFilePath
	end repeat
end tell

The script will take the URL that you see in DEVONthink and embed it into the document’s kMDItemWhereFroms field.

The only catch - and it’s a big one - is that the script only appears to work when the document is located in an external/indexed Finder location. I have no idea why, but hopefully, someone will know how to fix it? :crossed_fingers:

When the script is run on an internal database file, it doesn’t give off any noticeable errors - at least not to a novice, like myself - but it doesn’t seem to update the actual document either. It also doesn’t appear to mistakenly touch any other metadata. I checked the associated DT3 file, for example, just to see if it was errantly adding that information to it - which, to be clear, is NOT the intended purpose - and nothing appears to have changed (i.e., the DT3 file doesn’t have the URL in a new kMDItemWhereFroms field; it’s still only located in the kMDItemURL field).

Hopefully, the script is helpful to others. And, if anyone has any suggestions for how to get it working with internal database documents, please let me know!


Spotlight does not index inside any package files so this would never show up in a Spotlight search, even if querying in Terminal.

To clarify, I am not trying to use the metadata to locate the files while they are still in the database. However, it would be helpful to apply the fix before moving the files out to externally indexed folders.

Are you saying that the xattr command can’t write to files that are located in databases? If it can, any ideas what needs to be fixed in the script above?

To clarify, I am not trying to use the metadata to locate the files while they are still in the database. However, it would be helpful to apply the fix before moving the files out to externally indexed folders.

There isn’t anything to fix regarding your script if you’re moving the files out to the indexed folders in the Finder.

Are you saying that the xattr command can’t write to files that are located in databases? If it can, any ideas what needs to be fixed in the script above?

No, that’s not what I’m saying. I’m saying Spotlight doesn’t index inside package files (which is what a DEVONthink database is), so the metadata is written to imported files but Spotlight will not be aware of it. This is easily testable by running the script and moving the file to an external folder.
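A rough way to run that test from Terminal, once the file sits in an indexed Finder folder, is to ask Spotlight directly. This is only a sketch: it assumes macOS's `mdfind`, and the helper names, search term, and folder are placeholders rather than anything provided by DEVONthink.

```shell
#!/bin/sh
# Build a Spotlight query that matches any "Where from" value containing a term.
# Assumes the term contains no double quotes or backslashes.
# The trailing "c" asks Spotlight for a case-insensitive match.
wherefroms_query() {
    printf 'kMDItemWhereFroms == "*%s*"c' "$1"
}

# macOS only: list files under a folder whose Where-from metadata matches the term.
find_by_wherefrom() {
    term=$1 folder=$2
    mdfind -onlyin "$folder" "$(wherefroms_query "$term")"
}

# Example (hypothetical folder):
# find_by_wherefrom devontechnologies.com "$HOME/Documents/Articles"
```

If the file is still inside the database package, this search should come back empty even though the attribute is on disk.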

Help w. Escaping Funky Characters in Shell Scripts?

I noticed a problem with the script that I posted above, which takes the URL from DEVONthink and embeds it into the associated file’s “Where From” field (which is visible in Finder from the “Get Info” panel). Although it works most of the time, it seems to break whenever the URL contains question marks “?” or equals signs “=”.

I suspect these characters need to be escaped somehow in the shell command. However, I haven’t been able to figure out how to do this appropriately. Can someone help guide me to a resource for this? This seems like it would be a really common problem, but I haven’t been able to wrap my head around it.

For testing purposes, I’ve included the script again below. And, you can also download it here: Download. It contains a few test URLs at the top of the script, so that you can see how it works with URLs that contain (and don’t contain) these characters.

set testURL to "https://www.thisDoes=Not?Work.com" -- test URL, which does not work due to symbols ? and =
#set testURL to "https://www.thisWorks.com" -- test URL, which works

tell application id "DNtp"
	activate
	set theSelection to the selection
	repeat with theRecord in theSelection
		#set devonthinkURL to (quoted form of (URL of theRecord as string)) -- commented out for testing
		set devonthinkURL to (quoted form of (testURL as string)) -- included for testing only
		set theFilePath to (quoted form of (path of theRecord as string))
		do shell script "xattr -w com.apple.metadata:kMDItemWhereFroms " & devonthinkURL & space & theFilePath
	end repeat
end tell

So, just to be clear, the problem in the above script’s test URL seems to relate to the presence of an equal sign and a question mark (i.e., the so-called funky characters that need to be escaped or altered in some way for the shell command to work)? :man_shrugging:

Lastly, for those who are unfamiliar with the “Where from” metadata field, here’s an example of how it looks in Finder’s “Get Info” panel. As you can tell from the picture, I used this script to embed the URL that does not contain the funky characters!

[Screenshot: the “Where from” field in Finder’s “Get Info” panel]

Now, I just need some kind soul’s help to get it working with the funky characters! Thank you in advance for any help you can lend!!
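For what it's worth, the shell itself is probably not what trips over these characters. AppleScript's `quoted form of` wraps the text in single quotes, and inside single quotes `?` and `=` are ordinary characters. The sketch below imitates that quoting with a hypothetical `quoted_form` helper and re-parses the result the way `do shell script` would, using the test URL from the script above.

```shell
#!/bin/sh
# Wrap a string in single quotes the way AppleScript's "quoted form of" does:
# each embedded ' becomes '\'' and the whole string is wrapped in '...'.
quoted_form() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

url='https://www.thisDoes=Not?Work.com'

# Re-parse the quoted text as a command argument, as "do shell script" would.
roundtrip=$(eval "printf '%s' $(quoted_form "$url")")

if [ "$roundtrip" = "$url" ]; then
    echo "unchanged: $roundtrip"
fi
```

Since the URL survives the round trip unchanged, the failure more likely lies in how the stored value is later interpreted than in shell escaping, though this check alone cannot prove that.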

This appears to be a Finder issue.

And why do you need it to be in the Info pane?

@BLUEFROG Thanks for taking a look at this.

I don’t pretend to know much about this stuff, but so I can understand things a little better, why do you think it’s a Finder problem?

While I didn’t embed the URLs using this script, I have tons of files on my computer whose kMDItemWhereFroms fields contain question marks and equal signs (and a variety of other, I’m sure, equally problematic funky characters). For example, here’s a screenshot of just a small handful from one website (via HoudahSpot).

These URLs are also accessible via Finder and through Alfred. Is it possible that it’s a Spotlight problem?

The weird thing is that when I use the xattr tool with the -p option, as you did in your screenshot, I can also see some of the URLs that I didn’t think were embedded appropriately with my script. However, when I use mdls … they’re missing (which is what Spotlight is seeing, correct)! I’m stumped :man_facepalming:

I don’t care about the URL showing up in the Info pane. That screenshot was only included for those who might not be familiar with the kMDItemWhereFroms field, and where they might find it. For me, I just want the URL embedded in the file so that I can access it more readily from outside DEVONthink (more detailed explanation above).

Thanks again for all of your help!
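One way to dig into the xattr versus mdls mismatch is to look at how the value is stored. As far as I know, browsers store this attribute as a binary property list (an array of strings), whereas `xattr -w` without `-x` stores the text exactly as typed. The sketch below, with hypothetical helper names, checks whether a hex dump from `xattr -px` starts with the `bplist00` header.

```shell
#!/bin/sh
# Succeed if a hex dump (as printed by "xattr -px") starts with the ASCII
# bytes "bplist00", the header of a binary property list.
is_bplist_hex() {
    hex=$(printf '%s' "$1" | tr -d ' \t\n' | tr 'A-F' 'a-f')
    case $hex in
        62706c6973743030*) return 0 ;;
        *) return 1 ;;
    esac
}

# macOS only: report how the Where-from attribute of a file is stored.
describe_wherefroms() {
    if is_bplist_hex "$(xattr -px com.apple.metadata:kMDItemWhereFroms "$1")"; then
        echo "binary property list"
    else
        echo "raw bytes (for example plain text)"
    fi
}

# Example (hypothetical path):
# describe_wherefroms "$HOME/Documents/Articles/note.html"
```

Comparing a file written by the script with one downloaded through a browser should show whether the two are stored differently.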

@BLUEFROG In the 3.0.4 update, I noticed that some work was done on how the kMDItemWhereFroms field works (which is welcome news). According to the Help Page:

The extended attribute kMDItemWhereFroms of indexed or exported items is set to the item’s URL. Additionally, the Download Manager (Pro edition) sets this attribute on downloaded items.

Does this now work for the Web Clipper or RSS feeds, too? :pray: :crossed_fingers:

I would love to stop using my script (which still doesn’t work with “funky” characters). Thanks again!

Only if the items are indexed or exported. And no, RSS will not, unless you export or move individual items out of the feed into an indexed location.

@BLUEFROG Thanks for getting back to me.

As for RSS feeds, I have a script that refreshes my feeds and then moves them to the Global Inbox (which you were instrumental in helping me create - Thanks again!). Once the formatted notes are in my Global Inbox, I read them, and if I want to keep them, I manually move them to their final destinations in various indexed folders (i.e., I just drag the formatted notes to the indexed group/folder in DEVONthink). However, when I view these files in Finder, I am still not seeing anything in their kMDItemWhereFroms field (DEVONthink’s URL is not being transferred). Is there some kind of special process that you must follow to get DEVONthink to add the item’s URL to the kMDItemWhereFroms field? For example, does it require the user to run a sync, use the export function, open and close the database, etc.? From what I can tell, simply dragging the file into an indexed folder does not add the item’s URL. Thanks!

Similarly, the same could be said for something captured with the web clipper and then moved to an indexed folder/group.

Thanks again!

The kMDItemWhereFroms may not show in the Finder’s Get Info pane, but they are visible in xattr -l

… and ls -l@


Interesting… as I have no knowledge of the underlying implementation used by Development, here is my conclusion (though @cgrunenberg can comment further):

The xattr is not written to the proper namespace, just as a raw kMDItemWhereFroms.
It must be part of the com.apple.metadata namespace, so…

xattr -w com.apple.metadata:kMDItemWhereFroms 'https://www.devontechnologies.com' /path/to/file

… will show the Where Froms in the Info pane.

(Incidentally, this was the key to OpenMeta metadata, which I was a part of back in my Ironic Software days. It used Apple’s namespace and Spotlight treated it as proper Spotlight data. An idea we actually took a lot of flak for :open_mouth: :slight_smile: )

Before…

After using the Apple namespace…
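To see at a glance which attribute names on a file carry Apple's prefix and which are raw, something like the following could work. It is a sketch only: it assumes `xattr FILE` prints one attribute name per line, and both function names are invented for illustration.

```shell
#!/bin/sh
# Succeed if an attribute name lives in Apple's Spotlight metadata namespace.
in_spotlight_namespace() {
    case $1 in
        com.apple.metadata:*) return 0 ;;
        *) return 1 ;;
    esac
}

# macOS only: label each extended attribute name found on a file.
label_attributes() {
    xattr "$1" | while IFS= read -r name; do
        if in_spotlight_namespace "$name"; then
            echo "$name (Spotlight namespace)"
        else
            echo "$name (other)"
        fi
    done
}

# Example (hypothetical path):
# label_attributes "$HOME/Documents/Articles/note.html"
```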

@BLUEFROG Thanks for digging into this for me. As usual, you’re correct that I can also see similar raw kMDItemWhereFroms data in Terminal.

Since much of this is over my head, I suspect there’s something I’m missing here. But to ask the obvious question: What’s the point of adding the data if it’s not Spotlight/Finder friendly?

Any chance this makes its way into 3.0.5? :crossed_fingers:

For the moment, I guess I’m still stuck using my script (which works for everything except URLs with funky characters). Thanks again for the clarification! And, sorry for interrupting your weekend.

What’s the point of adding the data if it’s not Spotlight/Finder friendly?

For one, it’s not something people make wide use of.
For two, it’s likely just something missed in the implementation.

Any chance this makes its way into 3.0.5?

You know I don’t comment on such things. It’s bad for the both of us. :slight_smile:
And I can’t comment on what Development is up to. They prioritize and schedule on their own, not by my mandates.

And, sorry for interrupting your weekend.

No worries. You benefit from my poor work/life balance, I suppose.


I only know escaping characters from using regex in AppleScripts, but I think there should be something similar for shell scripts too (how would other people do such tasks otherwise?).

This doesn’t include the equal sign, but maybe it’s a starting point

[Screenshot from the article linked below]

(Terminal Primer – Part 3 – Special Characters – Scripting OS X)

Thanks for your help @pete31! I really appreciate it.

Unfortunately, I was never able to get this approach to work. However, this afternoon I stumbled across another closely related one that does. In short, if you stuff the URL in a temp *.plist (that’s appropriately formatted), you can use it to write the URL to the file.

I’ve adapted their script for DEVONthink, and included it below for others who may wish to do the same thing. From what I can tell, it seems to work great. Hopefully, this will be unnecessary at some point in the future.

Thanks again @BLUEFROG and @pete31 for everything!

tell application id "DNtp"
	activate
	set theSelection to the selection
	repeat with theRecord in theSelection
		-- The URL goes into a plist rather than onto the command line, so it is not shell-quoted
		set devonthinkURL to (URL of theRecord as string)
		set theFilePath to (quoted form of (path of theRecord as string))
		tell application "System Events"
			-- Temporary plist whose root is a list (array) holding the URL as its only string
			set PlistFile to "~/Desktop/kMDItemWhereFroms_tmp.plist"
			set PlistFirstItem to make new property list item with properties {kind:list}
			set PlistXML to make new property list file with properties {name:PlistFile, contents:PlistFirstItem}
			make new property list item at end of every property list item of contents of PlistXML with properties {kind:string, value:devonthinkURL}
			-- Convert the plist to binary format and hex-encode it for xattr's -x option
			set PlistBinary to do shell script "plutil -convert binary1 " & PlistFile & " -o - | xxd -p"
			-- Write the hex-encoded plist to the file's Where-from attribute
			do shell script "xattr -w -x com.apple.metadata:kMDItemWhereFroms " & PlistBinary & space & theFilePath
			-- Remove the temporary plist
			do shell script "rm " & PlistFile
		end tell
	end repeat
end tell
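The same idea can be sketched in plain shell, without System Events or a temporary file. This is a minimal sketch, assuming macOS's `plutil` accepts `-` for standard input and that `xxd` and `xattr -w -x` behave as in the script above. The function names are made up for illustration, and only the two text helpers run on non-Mac systems.

```shell
#!/bin/sh
# Escape the characters that are special in XML text ("&" must be handled first).
xml_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Print an XML property list whose root is an array holding the URL as one string.
wherefroms_plist_xml() {
    printf '<?xml version="1.0" encoding="UTF-8"?>\n'
    printf '<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">\n'
    printf '<plist version="1.0"><array><string>%s</string></array></plist>\n' "$(xml_escape "$1")"
}

# macOS only: convert the plist to binary, hex-encode it, and write the attribute.
set_wherefroms() {
    url=$1 file=$2
    hex=$(wherefroms_plist_xml "$url" | plutil -convert binary1 -o - - | xxd -p | tr -d '\n')
    xattr -w -x com.apple.metadata:kMDItemWhereFroms "$hex" "$file"
}

# Example (hypothetical URL and path):
# set_wherefroms 'https://example.com/page?id=1&ref=2' "$HOME/Documents/Articles/note.html"
```

Because the URL travels inside the plist rather than as the attribute's raw text, characters such as `?` and `=` need no special treatment. Only `&`, `<`, and `>` have to be escaped for the XML step.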