The [ and ] are special characters in RegExes. You must escape them to match literally.
Try your RE in regex101.com. That tells you what matches where (or doesn’t).
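To illustrate the point about brackets, here is a minimal sketch in Python. It uses Python's `re` engine rather than DEVONthink's, and the sample sentence is the one quoted elsewhere in this thread, so treat it only as a check of the escaping rule.

```python
import re

text = "And this is the [item link](brain://abc123) followed by more text."

# Unescaped, [item link] is a character class: it matches ONE character
# from the set {i, t, e, m, space, l, n, k}, not the bracketed text itself.
unescaped = re.search(r"[item link]", text)

# Escaped, \[ and \] match the literal bracket characters.
escaped = re.search(r"\[item link\]", text)

print(repr(unescaped.group(0)))
print(repr(escaped.group(0)))
```

The unescaped pattern stops at the first single character that belongs to the set, while the escaped one finds the whole `[item link]` text.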
Thanks again for your input. However, it’s still not working i.e. there appears to be no match.
I will set the IndexRawMarkdownSource hidden preference to off, rebuild the database since otherwise it doesn’t take effect if I remember correctly, and then try again.
After setting IndexRawMarkdownSource to off and rebuilding the database, these are the findings:
The URI is still unexpectedly not captured via ([\S\s]*) even though it is in Regex101
While the Regex variations kindly provided by different users in this thread work in Regex101, they do not seem to work via the smart rule “Scan Text” action
But you do want DT to index it, so why set it to off?
Your RE says “get me between zero and any number of space and non-space characters”. That’s equivalent to (.*) with the “dot matches newline” option turned on, so it matches everything. Something you should always be wary of.
Yes. And no: the RE matches everything. Your whole text. If you use “This is some text. And this is the [item link](brain://abc123) followed by more text.” as the test string on regex101.com, you’ll see that it matches the URI. And everything else.
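The catch-all behaviour can be reproduced with a short Python sketch. It assumes Python's `re` module as a stand-in for the regex101 test and says nothing about how DEVONthink's “Scan Text” action behaves. The second, two-line string is a made-up example.

```python
import re

text = ("This is some text. And this is the [item link](brain://abc123) "
        "followed by more text.")

# ([\S\s]*) accepts any character, whitespace or not, so group 1
# swallows the entire input rather than just the URI.
catch_all = re.search(r"([\S\s]*)", text)

# (.*) behaves the same once the dot is allowed to match newlines.
dot_all = re.search(r"(.*)", text, flags=re.DOTALL)

# Hypothetical two-line input: without DOTALL, (.*) stops at the line break,
# while ([\S\s]*) keeps going across it.
two_lines = "first line\nsecond line"
plain_dot = re.search(r"(.*)", two_lines)
class_all = re.search(r"([\S\s]*)", two_lines)

print(catch_all.group(1) == text)
print(plain_dot.group(1))
```

So the URI is "matched", but only as part of a capture that contains the whole document.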
This \]\((.*?:\/\/.*?)\)
captures your sample URI in the first capturing group. It should work in DT (and is obviously less complicated than the one you posted in your screenshot). \1 should give you access to the captured URI. Try showing it with an alert or notification in your smart rule. If that works, the issue is not with the RE, but with the custom meta data field (we don’t know how that’s defined, for example). DT sometimes does not behave as one would expect in that context.
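As a quick check of the capturing logic, here is a sketch in Python. It assumes a variant of the pattern in which the closing bracket and the opening parenthesis of the Markdown link are escaped explicitly, and it runs in Python's `re` engine, not in DEVONthink, so it only shows what the pattern itself captures.

```python
import re

text = ("This is some text. And this is the [item link](brain://abc123) "
        "followed by more text.")

# \] and \( match the literal "](" of a Markdown link.
# (.*?:\/\/.*?) lazily captures "scheme://rest" up to the first ")".
link_uri = re.compile(r"\]\((.*?:\/\/.*?)\)")

match = link_uri.search(text)
uri = match.group(1) if match else None  # group 1 corresponds to \1

print(uri)
```

If group 1 holds only the URI here but the smart rule still stores nothing, that points away from the pattern and towards the text being scanned or the metadata field.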
I have not tried to use the “Scan Text” smart rule action with RegEx on markdown before. I just tried enabling IndexRawMarkdownSource and creating a new document with links after that. It also doesn’t work for me. It does work with things that are not links, like HTML <tags>; actually, I don’t seem to need the Hidden Preference for that. But for [links](https://) it only shows [links]. I didn’t rebuild the database though, because I don’t want this currently, so I don’t know if that is the reason.
What I can quickly find in the forum seems conflicting.
Here Bluefrog says this (March 2023):
But in September 2022 he says this, specifically about the “Scan Text” smart rule action:
Does that still stand? Is the “Scan Text” a case where IndexRawMarkdownSource has no effect?
My question is why are you approaching this with RegEx? And are you expecting only one URL per document since it’s not going to match and return more than one?
Here is a simple smart rule (and a bit more verbose) example…
on performSmartRule(theRecords)
	tell application id "DNtp"
		repeat with theRecord in theRecords
			-- Read the document's source
			set src to source of theRecord
			-- Collect the links found in that source
			set documentLinks to (get links of src)
			-- Store only the first link, if there is one
			if documentLinks is not {} then add custom meta data (item 1 of documentLinks) for "External Link" to theRecord
		end repeat
	end tell
end performSmartRule
But again, I don’t know your expectation in terms of how many links are in the document, etc.
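On the question of how many links a document contains, the following Python sketch shows the difference between taking the first match and collecting all of them. It is only an illustration of the regex side: the sample text and the `https://example.com/page` URL are made up, and the pattern is an assumed variant with the `]` and `(` escaped.

```python
import re

# Hypothetical document text with two Markdown links.
text = "See [one](brain://abc123) and [two](https://example.com/page)."

pattern = r"\]\((.*?:\/\/.*?)\)"

# findall returns the contents of group 1 for every match, in order.
uris = re.findall(pattern, text)

# Taking only the first entry mirrors "item 1 of documentLinks".
first = uris[0] if uris else None

print(uris)
print(first)
```

A single capture group referenced once yields only one URI, so a document with several links needs either a loop over all matches or a decision about which link counts.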
I was thinking the same. @AW2307 you don’t give many details. How do the documents in question look in real life? What is the bigger goal?
Maybe it is not possible to achieve the bigger goal in this specific way, but there might be other ways to go about it. DEVONthink can do many things. Maybe a script, or something else.