Problems with Search Set for Reddit

I’ve created a query to search a subdomain of Reddit. The subdomain is what Reddit calls a “multireddit” which is a combination of Reddit “subs” (channels of content). The multireddit has its own web address, which I’ve added to the “sites” list.

The search set I’ve created in DEVONagent for the word “period” isn’t returning any results. The “searching…” text appears, runs for <10 seconds, then returns a blank results page.

If I search the same address using Reddit’s search function for the same term, I get lots of results, so there must be something wrong with my search set.

This is what I’ve set up…

GENERAL

  • default query = “period”
  • secondary query = none
  • follow links = all, 2 levels (also tried ‘off’)
  • language = english (also tried ‘international’)
  • ignore diacritics = yes
  • filter = none
  • scanner = no scanner

ADVANCED

  • title, text, url, keywords, description = yes
  • html & xhtml pages = yes
  • results = all pages

SITES

PLUGINS

  • none (have tried google, but then get offsite results)

ACTIONS

  • default settings

SCHEDULE

  • default settings

Any advice appreciated!

The subdomain doesn’t seem to be indexed by Google/Bing, so you have to use crawling and enable the following of links (subdirectories, probably more than 1 level).

Thank you! I thought I had tried that, but it’s working now.

I’ve had DEVONagent for a long time, but am just recently getting the most out of using it. Thanks for making this tool.

Hello again. Results are no longer being returned from this search set. The archive is not being filtered, and results should be returning “all pages”, not just new pages.

The search lasts <1 second and the results are empty.

The search set is looking for “period”, following all links, 5 levels deep at this location:
reddit.com/user/MrSpaceman/m/twox_multi/

Many results are returned when using the search field on the site itself. If I manually search the text on the pages using command-F in the DA browser, then “period” is mentioned on the second page, but not the first.

My only guess is that DA is not getting to the second or following pages.

Is there anything I can do differently to make sure it’s crawling forward into the feed, using the “next” link at the bottom of each page?

An example of the url of the “next” link on the first page is:

https://www.reddit.com/user/MrSpaceman/m/twox_multi/?count=25&after=t3_2t2f7t

On the second page, the link is:


https://www.reddit.com/user/MrSpaceman/m/twox_multi/?count=50&after=t3_2t4cuj

Is there a way to tell DA to follow links with wildcards in them? Such as…

https://www.reddit.com/user/MrSpaceman/m/twox_multi/?count=25&after=t3_*
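To make the wildcard idea concrete, here is a minimal Python sketch that checks the two “next” URLs above against such a pattern. It says nothing about how DEVONagent itself interprets wildcards; the helper `wildcard_match` and the broader `count=*` pattern are assumptions made for illustration only.

```python
import re

def wildcard_match(pattern: str, url: str) -> bool:
    """Return True if url matches pattern, where '*' means any run of characters."""
    # Escape every literal piece (including the '?' in the query string),
    # then join the pieces with '.*' so only '*' acts as a wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, url) is not None

BASE = "https://www.reddit.com/user/MrSpaceman/m/twox_multi/"
page1_next = BASE + "?count=25&after=t3_2t2f7t"  # "next" link on page 1
page2_next = BASE + "?count=50&after=t3_2t4cuj"  # "next" link on page 2

narrow = BASE + "?count=25&after=t3_*"  # pattern as written in the post
broad = BASE + "?count=*&after=t3_*"    # hypothetical variant, count also wildcarded

print(wildcard_match(narrow, page1_next))
print(wildcard_match(narrow, page2_next))
print(wildcard_match(broad, page1_next))
print(wildcard_match(broad, page2_next))
```

Under this interpretation the pattern as written only matches the first “next” link, because `count` changes from 25 to 50 on the second page; the `count` value would need a wildcard too.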

Without a “Follow Links” term specified (see the search set editor), DEVONagent follows only “promising” links (e.g. links matching the search term or links next to an occurrence of a search word) to reduce traffic. I’d therefore suggest using * as the term to follow all links.
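As a rough illustration of why the “next” link gets skipped, the following Python sketch filters links the way the description above suggests. It is not DEVONagent’s actual code: the `Link` fields, the `is_promising` rule, the `follow_all` flag, and the second example link are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Link:
    url: str
    anchor_text: str
    nearby_text: str  # text surrounding the link on the page (assumed field)

def is_promising(link: Link, term: str) -> bool:
    """Assumed heuristic: the term appears in the URL, link text, or nearby text."""
    term = term.lower()
    return any(term in text.lower()
               for text in (link.url, link.anchor_text, link.nearby_text))

def links_to_follow(links, term, follow_all=False):
    """follow_all=True stands in for a '*' Follow Links term."""
    return [link for link in links if follow_all or is_promising(link, term)]

links = [
    # The real "next" link from page 1; its anchor and nearby text are guesses.
    Link("https://www.reddit.com/user/MrSpaceman/m/twox_multi/?count=25&after=t3_2t2f7t",
         "next", "view more"),
    # A made-up post link that does not mention the search term.
    Link("https://www.reddit.com/r/example/comments/abc123/", "An unrelated post", ""),
]

print(len(links_to_follow(links, "period")))                   # heuristic only
print(len(links_to_follow(links, "period", follow_all=True)))  # follow everything
```

With the heuristic alone, neither link qualifies, so a crawl of the first page would stop there; following all links lets the crawl reach the second page, where the term does occur.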

That’s very helpful, thank you.