What is the best way to clip web pages?

There are a lot of ways to clip web pages. I’d like to stick with one that works all of the time. What is your experience with them?

So far I’ve noticed a couple of things:

– The web archive option usually gives a nice page, but sometimes it’s blocked by a paywall even if I’m already logged in.
– The “remove clutter” option often does a fabulous job of decluttering, but once in a while it cuts too much, and this is not apparent at the time of clipping.
– Some web pages are really just a PDF file, and I wonder if there is a quick way to clip the underlying PDF without having to Save As PDF first.

Thank you for your advice.

I usually don’t clip a whole page, only a selected area, by dragging it. Dragging into a DEVONthink window results in RTF, which I don’t like, so I drag to DEVONthink’s icon in the Dock instead, which results in a webarchive.

Dragging has several advantages:

  • You have to choose what to clip
  • You don’t get clutter (most of the time)
  • You clip more but smaller chunks from one site (at least that’s what I do) which later makes retrieving information very easy.

FWIW, I’m currently clipping a lot of structured material (i.e. everything has to go into a predefined group structure) and got tired of the group selector (in Preferences > Import I had Destination set to Select group). So I changed Destination to Global Inbox and created a Smart Rule that moves each newly imported webarchive to the current group. Now I can simply open the desired destination group and clip without the group selector. Additionally, the Smart Rule script selects the record, so I can check that everything is OK without switching between Safari and DEVONthink all the time. This makes for a very smooth clipping process.

The Smart Rule:

  • Search in: Global Inbox
  • kind:webarchive
  • On Import

Note: Obviously you have to remove the On Import trigger once you’re done with a clipping session.

-- Smart Rule - Move record to current group and select it

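on performSmartRule(theRecords)
	tell application id "DNtp"
		try
			repeat with theRecord in theRecords
				-- Move the newly imported record into the group that is currently open
				set theMovedRecord to move record theRecord to current group
				-- Select the moved record in the frontmost window so it can be checked right away
				set selection of think window 1 to {theMovedRecord}
			end repeat
			
		on error error_message number error_number
			-- Error -128 means the user cancelled; report anything else
			if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
			return
		end try
	end tell
end performSmartRule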

Amazing! I will need to look at it more carefully. One quick thing caught my attention. What did you mean

?

See this thread: Apple deprecates WebArchives - what does this mean for DEVONthink?

Interesting, I did not know you could drag a portion of a page. I certainly learned a lot here, thank you.

I think I will continue to use webarchive for now (whether by dragging or the whole page).

Any thoughts on what to do when the whole page is already a PDF and you want to capture the whole thing? Currently, I Save As PDF to the Downloads folder and then drag the file into DEVONthink. Is there a quicker way?

I use Clip to DEVONthink with paginated PDF.

You could either use Clip to DEVONthink, as mentioned by Pete, or use the Print menu instead of Save As. There you will find Save PDF to DEVONthink 3, which lets you skip the intermediate step with the Downloads folder.

Which method you choose depends on your needs and on the respective webpage’s behaviour/structure. Note that the Print method respects the on/off status of Safari’s Reader View.
