Best use for research

I am new to DEVONagent and want to use it for academic research, but I cannot quite work out how best to approach this. Ideally, I’d like to search Google Scholar, my university library, and a number of scholarly databases. I typically use Google Scholar, which crawls most databases anyway.

I’m struggling to see how DEVONagent differs from that, and how to use its full power so that I don’t have to move between multiple systems when researching for school.


What field do you study or teach?

I am a doctoral student researching online learning. I want to use DEVONagent to look at the effectiveness of online learning, factors of student success, and comparisons between face-to-face and online courses.

My dissertation is on a different topic, outsourcing of online learning.

I’m still a relative neophyte, but I use DEVONagent’s Search Sets. Get to it via Window>Search Sets. You could create a plugin for each of the scholarly databases you want to access, then create a Search Set that uses each of those along with your university library and Google Scholar. Then, make the Search Set run every night (or otherwise consistently), and tell DA to save the results to DT or into an archive.

The most important thing I’ve figured out is that you need to know where to look and what you want to find before you use a Search Set. So, form your research question outside of DA, and then create the Search Set so that you find exactly what you’re looking for.

For one of mine, let’s call it Valcour Overview, I built a Sites list of about 20 sites to crawl or search, and I specified one plugin: Google PDF. If I had access to more of the sites that have plugins (like JSTOR or other paywall sites for which I could create plugins), I’d add those too.

I compiled my list of sites manually based on subject matter & places where I’ve found articles before, and I played with a few search strings to see what worked best (in terms of more than 0 hits and less than 500). The search runs every night, and it saves the results in a Resource format to a DT3 folder.

Boolean searching is where DA really shines, IMO.

This is my search string for this particular search set.
(Valcour NEAR/3 Island) AND ~revolution OPT (Ticonderoga OR Buttonmould OR Arnold OR Champlain OR Trumbull OR boat OR "Noah Hall")
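To make the proximity part of that string concrete, here is a minimal Python sketch of what a NEAR/3 test does, assuming NEAR/3 means "the two words occur within three words of each other". The function name `near` and the sample sentences are made up for illustration; this is not DEVONagent's actual implementation, and it ignores the other operators in the string.

```python
import re

def near(text, first, second, distance):
    """Return True if `first` and `second` occur within `distance` words of each other.

    Illustrative only: assumes NEAR/n is a simple word-distance check.
    """
    # Split the text into lowercase words so the match is case-insensitive.
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    # A hit is any pair of positions no more than `distance` words apart.
    return any(
        abs(i - j) <= distance
        for i in first_positions
        for j in second_positions
    )

# Hypothetical sample sentences, not taken from any real search result.
close_hit = near("The battle of Valcour Island in 1776", "Valcour", "Island", 3)
far_miss = near(
    "Valcour was a bay far away from any other large island",
    "Valcour", "Island", 3,
)
print(close_hit, far_miss)
```

A tighter distance cuts down on pages that merely mention both words somewhere, which helps keep the hit count in a manageable range.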

I have a different search set for a more specific search on the same Valcour subject.

I’m sure there’s an even more efficient way, but this is something that can get you started thinking.

Cheers.


Is there a way to crawl a website without a particular search term? How about if I wanted to download all new pages with a PDF on it?

Welcome @ssheth

Is there a way to crawl a website without a particular search term?

An example, please.

And you can specify PDF documents in a search or in a search set’s Advanced > Files.

Here is an example -

I like to read the memos from Howard Marks. I would like to save the memo to DEVONthink every time it gets published on the website.

There is also an option to download the PDF version directly but that is behind a JS link and isn’t directly linked to the PDF.

I have one more example that I’m struggling with. I’d like to capture every new PDF that is put out by McKinsey on their insights blog, and save it to DEVONthink.

How would I set up DEVONagent to do so? Here is what I have so far.

I don’t really understand when to use the crawl function or the search function when I set up the agent either. Could you help explain under what circumstance to use each, please?

The url that this is searching is set to mckinsey.com. As you can see, it is set to search for the Article PDF string as a title of the page. I’m sure there is a more elegant way of doing this search.

Does the website have an RSS feed for those memos? In DT’s Preferences menu>RSS, there’s an option to pull in an RSS feed as a web archive (and other options). If there’s only one memo per feed,…?

Sadly there is no RSS option native to the website.

What about this? https://www.mckinsey.com/insights/rss.aspx (It came up in the log when I searched the site.)

This works, though it doesn’t provide the full-text version of the articles.
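For anyone who wants to check what a feed like this actually carries, here is a minimal Python sketch that lists each item's title, link, and the length of its description, which makes truncated items easy to spot. It assumes a plain RSS 2.0 layout with `item`, `title`, `link`, and `description` elements; the sample feed and its example.com URLs are invented for illustration and are not the real McKinsey feed.

```python
import xml.etree.ElementTree as ET

def list_feed_items(rss_xml):
    """Return title, link, and description length for each <item> in an RSS 2.0 string."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        # findtext returns None when an element is missing, so fall back to "".
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        description = (item.findtext("description") or "").strip()
        items.append(
            {"title": title, "link": link, "summary_chars": len(description)}
        )
    return items

# Hypothetical sample feed (assumed structure, made-up content).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Sample insights feed</title>
    <item>
      <title>First article</title>
      <link>https://example.com/insights/first</link>
      <description>Short teaser</description>
    </item>
    <item>
      <title>Second article</title>
      <link>https://example.com/insights/second</link>
    </item>
  </channel>
</rss>"""

for entry in list_feed_items(SAMPLE_FEED):
    print(entry["title"], entry["link"], entry["summary_chars"])
```

A very small `summary_chars` value suggests the feed only ships a teaser, so the full text would have to come from the linked page instead.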

You might want to try the scripts in the following thread to automatically get the full content of those truncated feed items:


An even simpler solution, which works well with many RSS feeds (including those for most major news sources), is to go to Preferences/RSS/Feed Format and choose “Markdown.” The Markdown format for many RSS feeds accomplishes what you are aiming for by pre-downloading most of the article. The formatting is a bit off, so you might choose to open the original URL in selected cases, but for the most part I find that the Markdown format as displayed by DT3 works quite well for RSS feeds.

This is great! I’m going to try this solution out and circle back.


This setting is also available per feed; see the Info panel (“Format”). It works nicely, with one caveat: I don’t know how to force DT to actually switch from, say, “Automatic” to “Markdown” once it has already downloaded the feed, which it does immediately after a feed is added. Workaround: enter a wrong feed URL, change the Format for that particular feed in the Info panel, then set the URL to the feed’s correct address.
