Is it possible to set up an RSS feed so that the link URL in each item is automatically fetched and stored in DEVONthink as either Rich Text or a webarchive?
The use case is, for example, automatically saving a copy of each article posted to a news site.
If you set DEVONthink > Preferences > RSS > Remove Articles to “manually”, then the feed will remain intact until you delete the articles. So, you could come along at any time, select a number of articles, and use Data > Convert > to Rich Text on the whole selected block of articles.
korm: Unfortunately, for feeds that only provide headlines or summaries and not full articles, that would just convert the headlines or summaries into a Rich Text document. I was hoping there’d be a way DT could follow the link to the full article and capture it automatically. It appears not without some scripting work.
That’s true, it’s not pretty. I mainly just wanted to collect a searchable text archive over time, though, so it’s not essential that it be an exact replica of the web page.
The script available from the Support Assistant (in the Help menu), Download > Convert URLs to web Documents, might be useful. It grabs the URL from the RSS article and downloads a web archive of that page.
YMMV: every site does RSS differently, and the URL in the feed might be a redirect rather than the URL of the actual page you want to save.
You can select multiple articles and run this conversion script.
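For anyone who wants to see the idea outside DEVONthink, here is a rough Python sketch of what "grab the URL from each article and check for redirects" amounts to. This is not the Support Assistant script, just an illustration under some assumptions: the feed is plain RSS 2.0 (Atom feeds use namespaced elements and would need different handling), and the names extract_item_links, resolve_final_url and SAMPLE_FEED are made up for this example.

```python
import urllib.request
import xml.etree.ElementTree as ET


def extract_item_links(rss_xml: str) -> list[str]:
    """Return the <link> URL of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        # Skip items that have no usable link element.
        if link and link.strip():
            links.append(link.strip())
    return links


def resolve_final_url(url: str, timeout: float = 10.0) -> str:
    """Follow HTTP redirects and return the URL that finally answered.

    Needs network access; feeds that route links through a tracking or
    redirect service will return a different URL here than the one in
    the feed.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.geturl()


# Hypothetical feed used only for illustration.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First story</title>
      <link>https://example.com/news/first-story</link>
    </item>
    <item>
      <title>Headline with no link</title>
    </item>
    <item>
      <title>Second story</title>
      <link> https://example.com/news/second-story </link>
    </item>
  </channel>
</rss>"""


if __name__ == "__main__":
    for url in extract_item_links(SAMPLE_FEED):
        print(url)
```

Calling resolve_final_url on each extracted link before saving the page is one way to deal with feeds whose links are redirects, though it will not help if the site only redirects in a browser via JavaScript.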
I’ve tried all of the options, from Automatic to WebArchive, and none of them work, although many new articles have downloaded.
The script to Download as WebArchive etc. works, but changing the preferred format has no effect no matter what I do, whether I change it in Preferences or in the individual feeds.
That script isn’t available for a Smart Rule – not sure why not.
That did the trick. It may have been the computer reboot, but I’m not certain. I also turned off Remove Clutter, which I suspect is what actually fixed it. The WebArchive setting is working beautifully. Thank you.