Ways to get data into DT from web pages

Recently, I’ve been putting a lot of web data into DevonThink. There are a lot of different ways to do this, and I find myself switching between methods without really knowing why. A little bit of reflection led me to this summary of the various ways and their advantages and disadvantages. Have I missed any? And which way do people prefer or recommend?

  1. “Print” to DT:
    Advantages: fairly quick; DT prompts for place to save the page
    Disadvantages: doesn’t always allow you to name the page; URL is inserted in Comments but not in URL; takes a lot of disk space

  2. Use Print dialogue to Save as PDF, choosing as location a folder with a Save to DevonThink script attached
    Advantages: fairly quick; DT prompts for place to save the page; allows you to keep track of things in the Finder; redundancy makes for extra safety if something happens to DevonThink
    Disadvantages: very wasteful of space (basically 2 PDFs); doesn’t allow you to choose the location in DevonThink.

  3. Select and Copy text, then use Dock menu…New with Clipboard to make a new RTF.
    Advantages: fairly quick; DT prompts for place to save the page; takes little space
    Disadvantages: doesn’t always work with multi-column web pages.

  4. Drag URL into desired place in DT database.
    Advantages: fairly quick if database is visible; easy to choose where to save; takes less space than PDF
    Disadvantages: if you need to archive the text, the process is very slow (need to choose archive, then delete the URL); entails fiddling with Exposé or moving windows around if many things are open; takes more space than RTF/text methods

  5. Select text and choose Make New Rich Text from the Services menu.
    Advantages: probably quickest way; takes little space
    Disadvantages: doesn’t allow you to name the page; can’t choose location.

You left out the scripts, such as “add page from Safari” that saves as a web archive (I think) and other variations on saving items from the webpage using the scripts. There are also the Automator workflows and droplets.

ChemBob

Good job of breaking down the various categories. I mostly need the text, not images, from web pages, so I frequently use your last choice, the Services menu. I even added a keyboard command to make it faster. You’re right that the drawback is that you can’t name the file or designate its destination on the spot, which is why some of us have asked Devon to make that ability part of a future version, even if it’s only an option: for example, holding down the Option or Control key when you invoke the Services menu command would bring up a dialog box where you could specify where to save and under which filename. But that’s another thread…

What drawback are we talking about? Automator to the rescue:

[XXX] -> [New Text Record] -> [Set Current Group] -> [Move Records to Current Group]
where:

  • XXX: an AppleScript action that gets the selection from the program you want to copy from (you will have to write this yourself)
  • New Text Record: you can set the title here
  • Set Current Group: you can select the group here

By dropping it in the application’s Scripts menu (see the manual) with a special name to designate the keyboard shortcut, all you’d have to do is select the text in your app, go to DT Pro and run the command.
I may even add an action for the next release of DEVONagent that will get the current selection as Text. :wink:

What I have asked Apple for, and you could do that too, is the ability to run Automator workflows from a contextual or Services menu on the current selection in any application. Then you could customise input/output between applications to your heart’s desire…

To follow up on myself: replace both [XXX] and [New Text Record] with an action with the following contents for Safari (it would also work for DA if you replace “Safari” with “DEVONagent”, and “document” with “browser”):


on run {input, parameters}
	
	tell application "Safari"
		if not (exists document 1) then error "No document is open."
		
		set this_url to the URL of document 1
		-- Bug of Safari 1.3, should be "getSelection()"
		set this_selection to do JavaScript "unescape(getSelection())" in document 1
		set this_title to the name of document 1
		
		tell application "DEVONthink Pro"
			set theRecord to create record with {name:this_title, type:txt, URL:this_url, plain text:(this_selection & return & return & this_url)}
		end tell
	end tell
	return {theRecord}
end run

Also select “Show Action When Run” for [Set Current Group].

This code is based on an AppleScript from the distribution disk image (“Add selection from Safari”) that you could put in the global Script Menu.

I would really encourage everybody to look at our script/workflow examples, because they allow you to customise the way you enter data in DT Pro in so many ways. You could surely find a way that is ideal to the way you work.

Wow, thanks for the script, Annard. I’ve never used AppleScript or Automator, and I like the idea of Devon including a bunch of scripts for some of the functions that have been suggested on the forums; that way, users who don’t want them don’t have to download them. But for cyberwimps like me, it’s going to have to be something that’s simple to just drop into the scripts menu.

Would your script work with DevonNOTE (if the app name is changed in the script accordingly, of course)? Can it be modified to do the same thing with text in Mail messages? A lot of what I clip comes from emails; at the end of every day, I wind up with a bunch of text clippings from various web pages and emails that then need to be filed in the appropriate folder. Would be great to use such scripts so that I can designate which folder a clipped item goes in at the same time I clip it.

Hi Brett,

No, DEVONnote isn’t scriptable, so you can’t use the script with that application. And Automator is only supported in DEVONthink Pro 1.0.

You can always hack the script so it will work with Mail or other applications, but not in its current form: it is written for browser apps that support WebKit. We have a bunch of scripts that deal with Mail; I would look at those.

And: for what you want to do, AppleScript is the solution, so I would really encourage you to try to learn it a bit. In the online help, we have included some good references to help you learn AppleScript (check the Scripting chapter).
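
As a rough illustration of the kind of change involved, here is a minimal sketch of a Mail variant of the Safari action above. It is an untested adaptation, not one of the shipped scripts. It assumes Mail exposes the selected messages through "selection" and that each message has a "subject" and a "content". It takes the whole body of the first selected message rather than a text selection inside it, and it reuses the "create record with" call from the Safari script without the URL. The variable names are placeholders.

```applescript
on run {input, parameters}
	
	-- Assumption: Mail's dictionary provides "selection" (the selected
	-- messages) and "subject" / "content" on each message.
	tell application "Mail"
		set theMessages to selection
		if theMessages is {} then error "No message is selected."
		
		-- Only the first selected message is handled in this sketch.
		set theMessage to item 1 of theMessages
		set this_title to the subject of theMessage
		set this_text to the content of theMessage
	end tell
	
	-- Same record-creation call as in the Safari script, minus the URL.
	tell application "DEVONthink Pro"
		set theRecord to create record with {name:this_title, type:txt, plain text:this_text}
	end tell
	
	return {theRecord}
end run
```

Used in place of the Safari action, this would still feed [Set Current Group] and [Move Records to Current Group], so the destination group could be chosen at the moment of clipping.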

Thanks Annard! I tried it out and it works great. I’d like to make this my standard method, but trying to edit the script for OmniWeb I’ve run into problems. This is the error message I get:

(I get the same message when I use “window” instead of “browser”. And using “document” yields the “No document is open” error message.)
Would anyone with a little more familiarity than me with AppleScript care to suggest where the problem might lie? Thanks!
Rick