Add RTFD to DT script?

I have been saving web pages to DT by using the Safari script “Add Page to DevonThink” and then converting the pages to RTFD in order to add the URL and current date for exporting to my files. It occurs to me it would be a heck of a lot easier if there were a Safari script to add Rich Text to DT. All I see is an “Add Text to DT” script.

Might I prevail upon you again for a modification of this script so that I can add RTFD directly?

thanks in advance

Transferring rich text from one application to another isn’t possible due to limitations of Mac OS X’s scripting. You could, of course, use the “convert record theRecord to rich” command, but this might crash sometimes due to bugs in WebKit. But why not just use the Take Rich Note service?

I thought about doing that but the command only seems to be available when selecting part of the web page instead of just the whole page. Of course, I can get around that by Edit|Select All and then invoking the service but it is an extra step and I was hoping to do it all with one command which I could then trigger from Quicksilver.

As it stands, the fastest way I can think of to do this is Cmd-A to Select All and then Cmd-) to invoke the service. Not so bad, but I would prefer to have it all run by just typing “add”, which is what I was doing before to add the HTML. Oh well…thanks.

You might use this script (a slightly modified “Add page to DEVONthink” script):


tell application "Safari"
	try
		-- Stop early if Safari has no open page
		if not (exists document 1) then error "No document is open."
		
		-- Collect the URL, HTML source, and title of the frontmost page
		set this_url to the URL of document 1
		set this_source to the source of document 1
		set this_title to the name of window 1
		
		tell application "DEVONthink Pro"
			-- Create a temporary HTML record, convert it to rich text, then delete the HTML original
			set theRecord to create record with {name:this_title, type:html, URL:this_url, source:this_source}
			set theRTF to convert record theRecord to rich
			delete record theRecord
		end tell
	on error error_message number error_number
		-- Error -128 means the user cancelled, so stay silent in that case
		if the error_number is not -128 then
			try
				display alert "Safari" message error_message as warning
			on error number error_number
				-- Fall back to a plain dialog if "display alert" is not understood (-1708)
				if error_number is -1708 then display dialog error_message buttons {"OK"} default button 1
			end try
		end if
	end try
end tell

But the conversion might sometimes crash due to WebKit issues and the display of the resulting rich texts might be slow. Therefore it’s definitely recommended to select the interesting part of the page and take a rich note instead.

Thanks…I will definitely give it a try. Because of the kind of work I do, I need to archive the entire page because I am not always sure what is “interesting” until later. Therefore, I can’t really clip only part, but let’s see how this script works.

Have you tried selecting an entire web page, then using the Take Rich Note service to capture it as RTF(D)? Not sure I understand the difference between doing that vs. using Add page to DEVONthink and converting the captured page to Rich Text. The resulting documents won’t necessarily be identical but certain pages don’t have visually appealing results using either method anyway (RTFD-converted threads on this forum, for instance). Or, maybe those differences are significant for the content you’re capturing/converting?

I am pretty much resigned to the fact that there is no perfect solution. What would have been perfect would have been the adoption of a universal “single page” archive format for web pages such as .mht.

Anyway, lately I have taken to using the “printer friendly” pages where they exist and using the add page as RTF. I will not be able to save the other interesting things that may be on the original page, but the formatting seems to work and, as I said, there is no perfect solution.

Same here. With content diversity some amount of conscious intervention will always remain in choosing methods and formats for capturing it regardless of how automated the process becomes.

Apple still has a habit of arbitrarily using proprietary closed formats for no apparent reason that inevitably cause interoperability frustrations.

Fortunately for me, the benefits of saving web archives in DT still usually outweigh the disadvantages. It’s convenient being able to reload and update them (and HTML pages) for certain captured dynamic content. I’ll often use PDF for the graphic-heavy static content that I don’t intend to recapture.

Lately I’ve tended to capture full printer-friendly pages as HTML, although when annoying noise that PithHelmet filters out in Safari shows up in the DT version I may convert it to RTF(D) for cleanup. And I’ve always made frequent use of the Take Rich Note service on printer-friendly content.

A bit off-topic… I still have hope that this and other phpBB forums will eventually support printer-friendly thread/post views that can be cleanly archived as RTF(D) in DT. I struggled with some icky formatting and redisplay problems while editing a few captured posts this morning. It doesn’t take much to remind me of things that really bother me about phpBB from a user’s perspective. :angry:

Sounds like you’ve got a good understanding of tradeoffs with the choice of capture methods and formats for your purposes.