DEVONthink and RSS - How to save full article

Hi, I am new to DEVONthink and am just playing around, trying to figure out whether I can use it for my project.

What I need it to do is:
monitor RSS Feed → Load full article → Save it to PDF → Import into Devonthink Database.

I figured out how to have it monitor the RSS feed, but I am stuck there because it does not load the full article (the page behind the link in the short feed).

Do you know if there is any way for me to use DT to do this?

Best regards
Bjorkollur

Welcome @Bjorkollur :slightly_smiling_face:

“Full feed” vs. “short feed” is confusing terminology; there is just one feed.

DEVONthink does download and parse the (full) feed. But many feeds don’t include the full text of articles, because the owner wants you to open their website, often for the sake of audience metrics or revenue from ad impressions.

Some dedicated RSS readers have options to get around this. For example NetNewsWire, which can use Safari’s reader mode to download and display the linked page in full. Maybe this is what you have in mind.

I don’t keep feeds that only serve an excerpt in DEVONthink. Your question made me dig a little.

According to the thread “Question: RSS feed Articles (read more) scraper”, changing the Feed Format to something other than Automatic downloads the original web page, much like NNW’s reader view. That should get you what you want, as long as the URL of the feed item links to the original page. Just set the format to PDF (One Page) or PDF (Paginated).

I’ve never played around with changing the Feed Format, because I thought this was a global setting only. I somehow missed that you can change the format for individual feeds in the Info > Generic inspector :smiley:


While it is correct you can change the default feed format for RSS articles in Settings > RSS, two things to consider…

  1. The article isn’t magically coming in as a PDF. Instead, the linked page is downloaded and converted, just as when you convert a bookmark to PDF or clip web content. This takes more time and consistent bandwidth.

  2. Every article for the feed (or for all feeds if set globally) is going to be converted, whether you want it or not. Unless you have a very specific requirement and feed, it’s generally a good idea to let feeds come in under Automatic and convert the ones you want.
    If you had some specific criteria, it would be possible to convert matching articles via a smart rule.


Thank you for clarifying. That’s pretty much what I got from the thread I linked. And I see how if you’ve got many feeds, or a high-volume feed, it’s likely a bad idea to change the feed format to something other than Automatic.

I haven’t found a pattern I’d want to automate yet in any feed. I just convert manually. But the standard Convert command isn’t that useful if the feed item is only an excerpt, as it doesn’t download the linked page.

I just tried using the built-in script Download > As PDF Documents (Paginated) on a feed item. Something like that is what you want to use for excerpts. It’s a relatively short script, so it should be easy enough to adjust it to work as a smart rule script.

This adapted version works fine for me:

-- Download URLs as PDF documents (Paginated) [Smart Rule ver.]
-- Created by Christian Grunenberg on Mon Mar 23 2009.
-- Copyright (c) 2009-2014. All rights reserved.
-- Adapted for use in Smart Rules by troejgaard on Wed Feb 5 2025

on performSmartRule(theRecords)
	tell application id "DNtp"
		try
			show progress indicator "Downloading..." steps (count of theRecords)
			repeat with theRecord in theRecords
				set theName to name of theRecord
				set theURL to URL of theRecord
				step progress indicator theName
				if theURL begins with "http:" or theURL begins with "https:" then
					set theParents to parents of theRecord
					set theCopy to create PDF document from theURL name theName in (item 1 of theParents) with pagination
					repeat with i from 2 to (count of theParents)
						replicate record theCopy to (item i of theParents)
					end repeat
					set creation date of theCopy to creation date of theRecord
					set comment of theCopy to comment of theRecord
					set rating of theCopy to rating of theRecord
					set label of theCopy to label of theRecord
					set state of theCopy to state of theRecord
				end if
			end repeat
			hide progress indicator
		on error error_message number error_number
			hide progress indicator
			if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
		end try
	end tell
end performSmartRule

For a different output format, change this line:

set theCopy to create PDF document from theURL name theName in (item 1 of theParents) with pagination

(look at the built-in Download scripts for examples)

Some pages can be quite big, as they include a lot more than just the article. In that case you might want to change the end of that line to “with pagination and readability”, which declutters the page before it is converted.
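To make the two adjustments above concrete, here is a stripped-down sketch of the same smart rule script with the readability option switched on. It is not the built-in script. It leaves out the progress indicator, the replication to additional parent groups, and the copying of metadata. The commented-out alternatives are my reading of DEVONthink 3's scripting dictionary, so check them against the built-in Download scripts before relying on them.

```applescript
-- Sketch only: decluttered, paginated PDF for each matching feed item.
-- Assumes DEVONthink 3 and a smart rule that passes feed items with a web URL.
on performSmartRule(theRecords)
	tell application id "DNtp"
		repeat with theRecord in theRecords
			set theName to name of theRecord
			set theURL to URL of theRecord
			-- Skip items without a web URL
			if theURL begins with "http:" or theURL begins with "https:" then
				set theParents to parents of theRecord
				-- "readability" asks DEVONthink to declutter the page before conversion
				set theCopy to create PDF document from theURL name theName in (item 1 of theParents) with pagination and readability
				-- Assumed alternatives for other output formats (swap in for the line above):
				-- set theCopy to create Markdown from theURL name theName in (item 1 of theParents) with readability
				-- set theCopy to create web document from theURL name theName in (item 1 of theParents)
				-- Keep the original date so the copy sorts next to the feed item
				set creation date of theCopy to creation date of theRecord
			end if
		end repeat
	end tell
end performSmartRule
```

Without the try block, an error on one URL will stop the rule, so keep the error handling from the full version if you run this unattended.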


If you’re new to DEVONthink you might wonder how you’re supposed to use this. Here is an example:

See the sections Automation > Smart Rules and Batch Processing and Smart Rule Scripts in the manual. For details on different smart rule components, see Appendix > Smart Rule Events and Actions.

You can save the script in ~/Library/Application Scripts/com.devon-technologies.think3/Smart Rules and add it to a smart rule as an external script—useful when used in multiple rules—or you can add it to a rule as an embedded script.