Can't constrain search results when writing custom plugin

I am trying to create a custom search plugin for BASE, the Bielefeld Academic Search Engine.

To do this, I followed the example in the “Plugin Development” chapter of the DEVONagent Pro manual. Step 3 of the tutorial asks you to click the “Test” button and review the returned URLs. I got plenty, and as described in step 4, the goal was then to exclude everything except the 10 relevant results expected on the first results page. After using the “Exclude Domain” and “Exclude URL” menus a few times and tweaking the resulting file by hand, this is the array of excluded domains and URLs:

<key>LinksNotMatching</key>
<array>
	<string>*base-search.net*</string>
	<string>*lucene.apache.org*</string>
	<string>*maps.google.com*</string>
	<string>*openstreetmap.org*</string>
	<string>*twitter.com*</string>
	<string>*base-search.net/MyResearch/*</string>
	<string>*base-search.net/Record/*</string>
</array>

However, I still see URLs like https://www.base-search.net/MyResearch/Home?delete=77059653afd28f4f31bbceaf60669d3656df05e0716f92b40b9673b4fb29c411&back=recordList in the list of URLs returned during a test. The exclusion does not seem to have any effect. I even added (manually) the following tags to denote “nothing of interest follows here”, as described in the manual.

<key>LinksEnd</key>
<string>name="selectAllRecords"</string>

That should exclude most URLs of the structure given above, but they still show up in the list:

Any ideas what might be going wrong here?
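
One way to check the patterns themselves, independent of DEVONagent, is a minimal Python sketch like the one below. It assumes the `*` wildcards behave like shell-style globs; that is an assumption, since the actual matching rules are not shown in this thread. The helper name `matching_patterns` is hypothetical. Under that assumption, the URL quoted above is matched by two of the seven entries, so the patterns do cover it.

```python
from fnmatch import fnmatchcase

# Patterns copied from the LinksNotMatching array in the post.
LINKS_NOT_MATCHING = [
    "*base-search.net*",
    "*lucene.apache.org*",
    "*maps.google.com*",
    "*openstreetmap.org*",
    "*twitter.com*",
    "*base-search.net/MyResearch/*",
    "*base-search.net/Record/*",
]


def matching_patterns(url, patterns=LINKS_NOT_MATCHING):
    """Return every pattern that matches url (glob semantics are an assumption)."""
    return [p for p in patterns if fnmatchcase(url, p)]


# The URL reported in the post as still showing up in the test list.
url = (
    "https://www.base-search.net/MyResearch/Home"
    "?delete=77059653afd28f4f31bbceaf60669d3656df05e0716f92b40b9673b4fb29c411"
    "&back=recordList"
)

for pattern in matching_patterns(url):
    print(pattern)
# prints:
# *base-search.net*
# *base-search.net/MyResearch/*
```

With glob semantics, the first entry already covers the two path-specific entries, so those two would be redundant rather than harmful.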

Could you please post the complete plugin so that we could try to reproduce this here? Thank you.

Sure, here’s the plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>Description</key>
	<string>Bielefeld Academic Search Engine</string>
	<key>EngineUrl</key>
	<string>https://www.base-search.net/Search/Results?lookfor=_agentQuery_&amp;type=all&amp;l=de&amp;oaboost=1</string>
	<key>Identifier</key>
	<string>www.base-search.net</string>
	<key>Info</key>
	<string>JS_BASE Plugin</string>
	<key>LinksEnd</key>
	<string>name="selectAllRecords"</string>
	<key>LinksNotMatching</key>
	<array>
		<string>*base-search.net*</string>
		<string>*lucene.apache.org*</string>
		<string>*maps.google.com*</string>
		<string>*openstreetmap.org*</string>
		<string>*twitter.com*</string>
		<string>*base-search.net/MyResearch/*</string>
		<string>*base-search.net/Record/*</string>
	</array>
	<key>Name</key>
	<string>JS_BASE</string>
	<key>OffsetPerPage</key>
	<integer>1</integer>
	<key>Operators</key>
	<integer>59</integer>
	<key>ParseLinks</key>
	<true/>
	<key>ResultsPerPage</key>
	<integer>10</integer>
	<key>Start</key>
	<integer>0</integer>
	<key>Version</key>
	<string>1.0</string>
</dict>
</plist>

And, to clarify (perhaps I’m misunderstanding something already there): What I would expect is that, with each entry in the LinksNotMatching array, I should see the matching URLs disappear from the test output shown in my screenshot above.

Thanks for the plugin! I just tried it and get the same results. But it seems to work as expected: the URLs marked with “+” are correct and don’t include base-search.net.

Ah, then I think I understand what I did wrong. I expected the filtered URLs to disappear from the list shown above, but your post makes it sound like I just have to watch the status in the “error” column change between “-” (won’t be included) and “+” (will be included). Is that correct? If so, I did not see that in the manual; it might be worth stating it more explicitly on page 87.

Thanks a lot for the info! Then I’ll start fine-tuning now 🙂

This is correct.