Which was written for a completely different scenario (and unfortunately never worked very well). It actually retrieved the embedded images from an HTML source, but that is not what you intend to do: a meta element is not an embedded image, it is just that, a meta element.
Your new code does not make sense. The variable HTML holds the raw HTML of the webpage. getEmbeddedImagesOfHTML gets the images from this HTML. Since a meta element is not an image, an og:image meta element will never be returned by this call. If it were (and it is not), there would be no point at all in going over it with a regular expression, because the method already returns the image URLs.
So, your regular expression has to work on HTML. And as said before, you have to use two regular expressions: one to get the meta element with the og:image property, and a second one that fishes the content URL out of the match of the first.
Now, you cannot string two regular expressions together with && in the call to RegExp; that simply makes no sense at all. && is a logical AND, and you’re using it on two strings. What should be the outcome of ANDing two strings? In JavaScript, since both strings are truthy, the result is simply the second string, so RegExp only ever sees the second pattern and the first one is silently dropped. Also, test will only tell you whether there was a match, but not what matched. In this case you need the matching string, so test is useless.
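To make both points concrete, here is a small sketch that can be pasted into any JavaScript console. The variable names and the sample markup are made up for illustration; they are not taken from your script.

```javascript
// Two pattern strings, as one might try to combine them (hypothetical names).
const pattern1 = '<meta.*?property="og:image".*?>';
const pattern2 = 'content="(.*?)"';

// && does not merge patterns: with two non-empty strings it evaluates to the second one.
const combined = pattern1 && pattern2;
console.log(combined === pattern2); // true

// Made-up sample markup, for illustration only.
const sample = '<meta property="og:image" content="https://example.com/a.png">';
const re = new RegExp(combined);

// test() only reports whether there is a match ...
console.log(re.test(sample)); // true

// ... while match() returns the matched text and the capture groups.
const m = sample.match(re);
console.log(m[1]); // https://example.com/a.png
```

Note that the first pattern never takes part in the match at all, which is why the two expressions have to be applied one after the other.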
The basic procedure is this:
- build two regular expressions, re1 = /<meta.*?property="og:image".*?>/ and re2 = /content="(.*?)"/ (the /.../ literal is the equivalent of calling RegExp);
- run the first one on HTML like so: const result1 = HTML.match(re1);
- check if result1 is not null (otherwise there was no match, so no og:image in the HTML) and, if so, continue with
- running the second regular expression like so: const result2 = result1[0].match(re2);
- if result2 is not null, result2[1] contains the URL of the og:image.
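Put together, the steps above could look roughly like the following. This is only a sketch: the function name extractOgImage and the sample HTML are assumptions for illustration, and it presumes that the meta element sits on a single line and that its content attribute uses double quotes.

```javascript
// Hypothetical helper; the name is an assumption, not part of any existing script.
function extractOgImage(html) {
  // First expression: find the meta element carrying the og:image property.
  const re1 = /<meta.*?property="og:image".*?>/;
  // Second expression: pull the content URL out of that element.
  const re2 = /content="(.*?)"/;

  const result1 = html.match(re1);
  if (result1 === null) return null; // no og:image meta element in the HTML

  const result2 = result1[0].match(re2);
  return result2 === null ? null : result2[1]; // capture group 1 is the URL
}

// Made-up sample markup standing in for the downloaded page source.
const HTML = `<html><head>
<title>Example</title>
<meta property="og:image" content="https://example.com/cover.jpg">
</head><body></body></html>`;

console.log(extractOgImage(HTML)); // https://example.com/cover.jpg
console.log(extractOgImage('<p>no meta here</p>')); // null
```

Running the second expression only on result1[0], and not on the whole page, is what keeps it from picking up the content attribute of some unrelated meta element.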
I also suggest writing the script as a standalone script (i.e. outside of a smart rule). Put it in Script Editor, turn on its tracing features, select a record in DT and then start the script.
If you feel so inclined, you can of course try to parse the HTML data with AppleScript. This, however, is not for the faint of heart (as HTML parsing in general is not), and I’d strongly advise against it. Also, the example you linked to parses an HTML file, which you do not have here. It works with opening/closing tags, which meta elements do not have. And the example will also probably break with nested elements, like <div>...<div>...</div>...</div>. It’s bad in many respects, and in my opinion you’re better off ignoring it. But as I said before, you do not even need a parser here, because two regular expressions are sufficient.