I archive many pages from stackoverflow.com into DTP. Usually I drag a link to DTP, select the text from the headline down to the last answer, and choose “capture memo”. Then I delete the banners from the resulting RTFD and end up with an RTF. This way, I get small files (a few kB instead of MB for webarchives) that can easily be read on the Mac and iPhone without zooming, and can even be edited.
However, for lots of pages this gets tedious. Any hints on how this workflow could be automated?
Instead of saving a whole page as a webarchive in DEVONthink, select the text you want on the original Stack Overflow page and use the Take Rich Note service (command-C) directly in Safari.
For the archives you already have, you can use Take Rich Note in DEVONthink.
Otherwise, I’d write a script that inspects the HTML code of the clipping, looking for the element(s) that contain the text you want. It appears that the text you want is inside the element. (You’ll want to inspect your files to determine what element(s) you actually want.) Then you could use, for example, the concepts in this article, and run the shell script from your AppleScript to extract the target element(s) from the clipping and save them as HTML (or add a step to convert to RTF). You’ll get a rich-text approximation of the original.
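To make the extraction step a bit more concrete, here is a rough sketch in Python using only the standard library. It pulls the first element with a given id out of an HTML string. The id "mainbar" and the names ElementExtractor and extract_element are placeholders I made up for illustration, not something taken from your files, so inspect your clippings first and substitute the element you actually want. It also assumes the markup has properly balanced tags.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ElementExtractor(HTMLParser):
    """Collects the markup of the first element whose id equals target_id."""

    def __init__(self, target_id):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.target_id = target_id
        self.depth = 0       # nesting depth inside the target; 0 means outside
        self.done = False    # set once the target element has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0 and dict(attrs).get("id") != self.target_id:
            return
        self.parts.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.depth += 1
        elif self.depth == 0:
            self.done = True  # the target itself was a void element

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> inside the target.
        if self.depth > 0:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth == 0 or tag in VOID_TAGS:
            return
        self.parts.append(f"</{tag}>")
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth > 0:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth > 0:
            self.parts.append(f"&#{name};")


def extract_element(html, target_id):
    """Return the markup of the first element with this id, or '' if absent."""
    parser = ElementExtractor(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    # "mainbar" is a hypothetical id used only for this demo.
    sample = ('<body><div id="nav">menu</div>'
              '<div id="mainbar"><h1>Title</h1><p>A &amp; B<br>done</p></div>'
              '<div id="footer">x</div></body>')
    print(extract_element(sample, "mainbar"))
```

You could save the returned string as an .html file and, if you want RTF, convert it on the Mac with something like textutil -convert rtf yourfile.html. Because the sketch simply counts opening and closing tags, pages with unclosed tags inside the target element would need a more forgiving parser.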
I was hoping for a scripted solution and your idea sounds promising. Thanks for pointing me in the right direction!