I would like to bump this thread as this type of search engine seems quite common.
One more example, the “aggregate” full-text search engine of New York University:
https://arch.library.nyu.edu/metasearch/results?startRecord=1&group=2011-11-26-001607&resultSet=007246
(please note that this link will not work if you are not authenticated, which in turn requires university credentials)
As you may note, the search engine produces a “group” (in this case #2011-11-26-001607) and “result set” (#007246 in this case). Obviously these numbers are unique for each search query.
The situation may not be hopeless, however, as the above URL is the final URL in a chain of automatically loaded URLs following the user’s entry of the search query. And the first URL in the chain (i.e. the URL immediately produced once the user submits the query) is:
https://arch.library.nyu.edu/?base=metasearch&action=search&context=Music&context_url=%2Fdatabases%2Fsubject%2Fmusic&find_operator1=AND&query=_agentQuery_&field=WRD&query2=&field2=WRD&Submit=Search&database=NYU00370&database=NYU00876&database=NYU00663&database=NYU01302
where I have already replaced the query string with the _agentQuery_ placeholder.
This first URL is perfectly manageable by DevonAgent. However, as I mentioned, it does not lead directly to the first page of search results. Instead, it loads a second (and often a third) page, and in the URLs of those pages the query string no longer appears: the search is identified only by the “group” and “result set” numbers.
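To make this concrete, here is a rough Python sketch (plain Python, not DevonAgent plugin syntax) of the piece of information that would have to be captured at the end of the chain. It only parses the final URL quoted above; the function name extract_session_ids is my own, and how the intermediate pages are actually loaded (HTTP redirect, meta refresh, or script) is something I have not verified.

```python
from urllib.parse import urlsplit, parse_qs

def extract_session_ids(final_url):
    """Return the (group, resultSet) values found in a results URL.

    Hypothetical helper for illustration only.
    """
    params = parse_qs(urlsplit(final_url).query)
    # parse_qs maps each key to a list of values; take the first of each.
    return params["group"][0], params["resultSet"][0]

# The final URL from this post, as reached after the chain has loaded.
final_url = (
    "https://arch.library.nyu.edu/metasearch/results"
    "?startRecord=1&group=2011-11-26-001607&resultSet=007246"
)
group, result_set = extract_session_ids(final_url)
```

These two values are exactly what a script cannot know in advance, since they are generated anew for each search.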
Can DevonAgent handle such situations? If not, could it be empowered in a future release to do so?
In principle, a DA script would need to be able to:
- Follow the chain of automatically loaded URLs (at the moment it seems unable to do so).
- Allow the script developer to insert the _agentOffset_ key in a URL containing “wildcards” for the parts that change with every search, such as:
https://arch.library.nyu.edu/metasearch/results?startRecord=_agentOffset_&group=????-??-??-??????&resultSet=??????
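The second point could work roughly as in the following Python sketch (again plain Python, only to illustrate the idea, not DevonAgent syntax). Once the final URL of the chain is known, only startRecord needs to change from page to page, while the “group” and “result set” values are carried over untouched. The function name page_url and the step of 10 records per page are assumptions of mine; I have not checked the actual page size.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def page_url(final_url, offset):
    """Return final_url with its startRecord parameter set to offset.

    Hypothetical helper; all other parameters are kept as they are.
    """
    parts = urlsplit(final_url)
    params = [
        (key, str(offset) if key == "startRecord" else value)
        for key, value in parse_qsl(parts.query)
    ]
    return urlunsplit(parts._replace(query=urlencode(params)))

final_url = (
    "https://arch.library.nyu.edu/metasearch/results"
    "?startRecord=1&group=2011-11-26-001607&resultSet=007246"
)

# Assumed page size of 10 records; adjust to whatever the site really uses.
pages = [page_url(final_url, offset) for offset in (1, 11, 21)]
```

In other words, the “wildcards” would simply be filled in with whatever the site returned for the current search, and _agentOffset_ would take the place of startRecord.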
Thank you.