That’s not as simple as it seems. While XPath has an expression for that, there doesn’t seem to be a simple DOM approach (see the Stack Overflow question "getElementsByTagName() equivalent for textNodes").
And what about multiple appearances of the text – which one are you looking for?
How? Where? Does it precede or follow the text? Is it part of the same text node or somewhere else entirely?
Frankly: a sufficiently tolerant regular expression might be your best choice here, for example (\d\d[.:-]\d\d.*item)|(item.*\d\d[.:-]\d\d), where item stands for the text you are looking for.
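A minimal sketch of that regular-expression idea, working on plain text rather than on the DOM. The helper name findTimeNear and the sample string are made up for illustration, and the sketch assumes that the time and the item sit on the same line of text:

```javascript
// Hypothetical helper: returns an hh:mm-like time found on the same line as `item`,
// or null if there is none.
function findTimeNear(text, item) {
  // Escape regex metacharacters so the item is matched literally.
  const escaped = item.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const time = '\\d\\d[.:-]\\d\\d';
  // Either "time ... item" or "item ... time"; '.' does not cross line breaks.
  const re = new RegExp(`(${time}).*${escaped}|${escaped}.*(${time})`, 'i');
  const match = re.exec(text);
  if (!match) return null;
  return match[1] || match[2];
}

// Made-up sample text, one programme per line.
const sample = '20:15 Tagesthemen\nMaischberger 22:50';
console.log(findTimeNear(sample, 'Maischberger')); // → 22:50
```

Because .* is greedy, a line containing several times after the item yields the last of them; whether that is what you want depends on the page.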
Alternatively, in JavaScript you could do something like this (in the browser!):
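const itemlist = ['Maischberger', 'Tagesthemen'];
// Case-insensitive pattern matching any of the items
const RE = new RegExp(itemlist.join('|'), 'i');
// Convert the NodeList to an array so that filter() is available
const itemNodes = [...document.querySelectorAll('span.title')].filter(element => RE.test(element.innerText));
itemNodes.forEach(item => {
  // First span.date below the same parent as the matching title
  const timeNode = item.parentNode.querySelector('span.date');
  const time = timeNode.innerText;
});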
This code
- builds a regular expression from the strings in itemlist (Maischberger|Tagesthemen), ignoring capitalization ('i');
- gets all span elements with a class of title from the document with querySelectorAll;
- converts this NodeList into a JavaScript array with [...];
- filters the array for those nodes whose innerText matches the regular expression,
- thereby creating an array of nodes whose content matches one of your strings (itemNodes);
- then takes each of these nodes and finds the first span with class date that has the same parent;
- and finally extracts the time from it.
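The steps above could be wrapped in a small function that returns the results instead of discarding them. This is only a sketch: collectTimes is a made-up name, it reads textContent instead of innerText (an assumption that the two give the same text for these spans), and it returns null for the time when no span.date is found below the same parent:

```javascript
// Hypothetical helper wrapping the steps above.
// `root` is anything with querySelectorAll, e.g. a Document or an Element.
function collectTimes(root, itemlist) {
  const re = new RegExp(itemlist.join('|'), 'i');
  const result = [];
  for (const titleNode of root.querySelectorAll('span.title')) {
    const title = (titleNode.textContent || '').trim();
    if (!re.test(title)) continue; // not one of the wanted items
    // First span.date below the same parent, if there is one
    const parent = titleNode.parentNode;
    const timeNode = parent ? parent.querySelector('span.date') : null;
    result.push({ title, time: timeNode ? timeNode.textContent.trim() : null });
  }
  return result;
}

// In the browser console:
// collectTimes(document, ['Maischberger', 'Tagesthemen']);
```

Guarding against a missing span.date avoids the exception that timeNode.innerText would raise if one matching title has no time next to it.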
So it is feasible, and it’s not too much code, but it is in JavaScript, and it needs the browser to run (no DOM methods in JXA). Or perhaps Node.js, with the appropriate modules installed.
And that’s off-topic here, too.
You know how to search Apple’s documentation, and they have a whole bunch of XML objects out there. Initialize an NSXMLDocument with an HTML document, get its rootElement, and throw an XPath at it with rootElement.nodesForXPath('//span[contains(text(), "Maischberger")]', error) (note that contains() is case-sensitive). That’ll give you an NSArray of NSXMLNodes. For each of them, find the preceding-sibling with class date, extract its content, and you’re done.
You have to repeat that with each item, or you simply grab all spans with class title in an NSArray first and then filter that one for your items (which seems a more practical approach to me) using indexOfObjectPassingTest with a code block (possible with AS? I have no idea).
Edit: An XPath expression with the appropriate match condition (see post here) would fetch all items matching one of your strings.
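As a rough illustration of fetching all items with a single expression, the XPath string can be assembled from the item list. This sketch does not use the match condition mentioned above. It chains plain contains() tests with or, which is case-sensitive. buildXPath is a made-up name, and the code assumes that the class attribute is exactly title and that no item contains a double quote:

```javascript
// Hypothetical helper: builds one XPath expression covering all items.
// Assumes items contain no double quotes and class="title" is the full attribute value.
function buildXPath(itemlist) {
  const conditions = itemlist
    .map(item => `contains(text(), "${item}")`)
    .join(' or ');
  return `//span[@class="title"][${conditions}]`;
}

console.log(buildXPath(['Maischberger', 'Tagesthemen']));
// → //span[@class="title"][contains(text(), "Maischberger") or contains(text(), "Tagesthemen")]
```

The resulting string would then be handed to nodesForXPath in place of the single-item expression.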
Yes, it can be done. And it can be done in AS. You have to learn XPath, instead of using CSS selectors. And you have to type a lot more. Suit yourself.