Search engines that hide links (plugin/scanner?)

I’m having trouble building a search plugin for an internal search engine at work. The plugin isn’t returning any URLs, even though the search URL is right and I’m pretty confident results are coming back. Both testing the plugin in the plugins window and running a test search from a new search window return 0 links.

(Note: for privacy reasons, I’m going to change the words used in the URLs.)

The search site is at https://search.company.com
The search URL looks like

https://search.company.com/search/all?q=_agentQuery_&start=_agentOffset_

The results pages work kind of like Google’s: each result link points back to the search site, with a url=... component that holds the actual URL of the result. There’s also a data-url attribute on the link that holds the same URL:

<a href="/click?...&url=https%3A%2F%2Fwiki.company.com/..." data-url="https://wiki...">
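To make the link structure concrete, here is a minimal Python sketch of how the real target could be recovered from one of these click-tracking links. It only illustrates the url=... encoding described above; it is not how DEVONagent parses links, and the sample href (including the id parameter and the /page path) is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs


def real_target(href):
    """Return the decoded url= parameter of a click-tracking href, or None."""
    query = urlsplit(href).query
    # parse_qs percent-decodes values, so %3A%2F%2F becomes ://
    values = parse_qs(query).get("url")
    return values[0] if values else None


# Hypothetical href modeled on the anonymized example above.
sample = "/click?id=123&url=https%3A%2F%2Fwiki.company.com%2Fpage"
print(real_target(sample))
```

If the plugin can only see the href, the result URL is reachable only by following or decoding the /click link, which is why keys like LinksMatching and FollowLinks seemed worth trying.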

So I built a plugin (below) to run the search, but it doesn’t return any URLs. I looked at how the Google plugin is built and decided to try the ParseLinks key, but that’s still not working. I also tried LinksMatching and/or FollowLinks, as you can see below. The plugin fails with or without these two keys.

Is there anything else I can try? Are there other ways to make DEVONagent recognize that these are the links it’s looking for? Should I write a scanner? (I can’t find any documentation on how to build custom scanners.)

Here’s the plugin as it exists today…

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>Description</key>
	<string>Internal Company Search Engine</string>
	<key>EngineUrl</key>
	<string>https://search.company.com/search/all?q=_agentQuery_&amp;start=_agentOffset_</string>
	<key>FollowLinks</key>
	<true/>
	<key>Identifier</key>
	<string>search.company.com</string>
	<key>Info</key>
	<string>Corporate Search</string>
	<key>LinksMatching</key>
	<array>
		<string>*click*</string>
	</array>
	<key>LinksStart</key>
	<string>&lt;div id="SearchResults</string>
	<key>Name</key>
	<string>Internal Corporate Search</string>
	<key>OffsetPerPage</key>
	<integer>1</integer>
	<key>Operators</key>
	<integer>50</integer>
	<key>ParseLinks</key>
	<true/>
	<key>ResultsPerPage</key>
	<integer>10</integer>
	<key>Start</key>
	<integer>0</integer>
	<key>Version</key>
	<string>1.0</string>
</dict>
</plist>

Does it work after removing the ParseLinks key? In addition, does the search work if JavaScript is disabled?

does the search work if JavaScript is disabled?

I tried disabling JavaScript via the web menu. No change in behavior. JavaScript does seem to be required by the site: I tried browsing the site manually and going to the root only loads minimal HTML with refs to some JavaScript.

Does it work after removing [the ParseLinks key]?

Nope, no change. I’ve tried with or without ParseLinks and/or FollowLinks (4 combinations) with no change in results.

Your questions got me thinking. I tried loading the search URL at the UNIX shell with curl and got a redirect to our internal authentication system. When I load the same URL in DEVONagent’s internal browser window, this doesn’t happen; I get the search results just fine. My guess is that the browser is picking up my authentication cookie somehow, so I tried setting HTTPShouldHandleCookies to True, but that didn’t appear to make any difference. I also tried forcing an authentication cookie refresh inside DEVONagent, also with no change.
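For anyone who wants to repeat the curl check in a scriptable way, here is a rough Python sketch of the same test: request the search URL without following redirects and see whether the response points at a different host. The host login.company.com in the usage note is a made-up stand-in for the authentication system, and probe() and is_offsite_redirect() are names I invented for this sketch.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so the 3xx response can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def is_offsite_redirect(status, location, original_url):
    """True if the response redirects to a host other than the original one."""
    if status not in (301, 302, 303, 307, 308) or not location:
        return False
    target_host = urlsplit(location).netloc
    # A relative Location has no host and stays on the same site.
    return bool(target_host) and target_host != urlsplit(original_url).netloc


def probe(url, headers=None):
    """Fetch url once; return (status, Location header or None)."""
    opener = urllib.request.build_opener(NoRedirect)
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with opener.open(request, timeout=10) as response:
            return response.status, None
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")
```

Calling probe() on the search URL with no cookies and passing the result to is_offsite_redirect() should report True if the request is bounced to something like https://login.company.com/, which would match what curl showed. Passing a Cookie header copied from a logged-in browser session through the headers argument would show whether the cookie alone is enough to get results.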

Does the engine require a certain referrer or block unknown user-agents?