Quantifying heatmaps

The answer to this may well be no, but is there a way to derive some kind of exportable numerical value from the search scores that display as graphical heatmaps in the results in the item list?

I’m getting interesting use out of the ranking of Boolean search results within a corpus of documents, and the relative scores of items in the ranking – such as when some items at the top are clearly very strong matches and then there’s a run of visibly much weaker ones. But what I can’t figure out is a way of putting numbers to the scores; I can copy and paste the list to save the ranked order, but I haven’t come up with a way to quantify the relative scores except in a very basic single-word search (where in principle I can at least look up the Concordance stats for that word in each document).

What I’d ideally like to be able to do is to search, for instance, for a group of terms (say “pet OR dog OR cat”) and translate the graphic into a number for each matching document in the ranking by score. I don’t need or particularly want to know what the number means; obviously the heatmap graphic is an elegant representation of the score from a complex proprietary algorithm that I’m happy to leave as a black box. But if anyone has any bright ideas for how to get numerical data out of it, short of counting the pixels of different colours on the screen, I know that person will be somewhere in this forum…

See the score property of the record class in the AppleScript dictionary.
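
As a rough illustration of reading that property, here is a minimal sketch rather than a tested script. It assumes DEVONthink is running and answers to the application id "DNtp", and that the dictionary's search command returns the matching records with their score set by that search; check both against the dictionary in your own version. The variable names and the tab-separated clipboard output are just illustrative choices.

```applescript
-- Sketch only: assumes DEVONthink responds to the application id "DNtp"
-- and that its search command returns a list of records (check your dictionary).
tell application id "DNtp"
	-- Same Boolean query as in the original question.
	set theResults to search "pet OR dog OR cat"
	set theLines to {}
	repeat with theRecord in theResults
		-- score is the record property mentioned above; name is the item's title.
		set end of theLines to (name of theRecord) & tab & ((score of theRecord) as string)
	end repeat
end tell

-- Join the lines and copy them to the clipboard, one "name<TAB>score" row per record.
set AppleScript's text item delimiters to linefeed
set the clipboard to (theLines as string)
set AppleScript's text item delimiters to ""
```

Pasting the clipboard into a spreadsheet should then give one row per matching document with its name and score. The list may not arrive in ranked order, so sort on the score column if needed.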

If I am reading this correctly and you don’t care what the number means, then what is the point?

Fantastic! Thanks so much – I’d never have found that on my own. This forum is the greatest!

Fair question; all I need is a numerical value I can put to relative scores in search results, without needing to unpack the underlying algorithm that’s producing that value. A mere ordinal ranking isn’t telling me anything about the relative scores of successive items, nor can I compare the results of the same search when run on different sets of documents. A numerical score is a potentially very powerful resource for simple lexical and stylometric kinds of analysis; it’s not a task DT was particularly designed for, but I’ve been finding it very useful for quick and dirty corpus analyses and would like to try pushing this a bit further.