How do I just grab a webpage into DT?

Okay, so say I’m in Safari and have hit upon a webpage with relevant info that I want to keep. How do I just grab it into DT?

Keeping in mind that I don’t want a LINK, but an actual local archive/copy of the info.

There’s got to be a simple way to do it in just a keystroke/click or two.
thanks!
~jason

Jason: While viewing a page in Safari you can:

[1] Select all or a portion of the page and press Command-) to save a rich text note to your database;

[2] In the global Scripts menu, under the heading “Safari scripts” displayed in light font, select Save page to DEVONthink; or

[3] In the same global Scripts menu, select Save Web Archive to DEVONthink.

Oh wow, options 2 & 3 sound totally boss. I could even use MenuMaster (from Unsanity) to assign a keyboard shortcut for those scripts. Or use Quicksilver!

But I’m just not seeing a Safari Scripts section in the global scripts menu. I’ve got a DevonThink Pro section but it only has 4 scripts (e.g., “Copy selection to current group”).

Man, maybe I have to install (reinstall?) the Safari scripts? I don’t seem to have any that are specific to Safari!

Just run the command “Help > Install Add-ons…” to ensure that the scripts for Safari are installed. By the way, there are two additional options:

  1. Drag & drop the bookmarklets from the downloaded disk image to Safari’s bookmark manager, e.g. to the bookmarks bar. Then select a bookmarklet to either store a link or a web archive.

  2. Use File > Print > Save To DEVONthink Pro to add a printed PDF to DEVONthink Pro.

Okay wow, VERY strange, but I figured out my problem. For some crazy reason the folder ~/Library/Scripts/Applications had its permissions set to be read-only for all but the system. I changed it to read & write for all, and was able to get the DevonThink scripts installed.
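For anyone who hits the same install failure, here is a minimal Python sketch of that permissions check. It only looks at the owner's permission bits on the folder named above and, if needed, adds them. The function names are made up for illustration, and it assumes your account owns the folder; if the folder belongs to the system, the change will fail without administrator rights. Unlike the fix described above, it grants write access to the owner only, not to everyone.

```python
import stat
from pathlib import Path


def is_user_writable(folder: Path) -> bool:
    """Return True if the owner's write and traverse (execute) bits are both set."""
    mode = folder.stat().st_mode
    return bool(mode & stat.S_IWUSR) and bool(mode & stat.S_IXUSR)


def make_user_writable(folder: Path) -> None:
    """Add owner read/write/execute bits; group and other bits are left unchanged."""
    mode = folder.stat().st_mode
    folder.chmod(mode | stat.S_IRWXU)


if __name__ == "__main__":
    # The folder mentioned in the post above.
    scripts_dir = Path.home() / "Library" / "Scripts" / "Applications"
    if not scripts_dir.is_dir():
        print(f"{scripts_dir} does not exist")
    elif is_user_writable(scripts_dir):
        print(f"{scripts_dir} is writable by its owner")
    else:
        print(f"{scripts_dir} is not writable by its owner; adding owner permissions")
        make_user_writable(scripts_dir)
```

After running it, try “Help > Install Add-ons…” again.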

and thanks for tips on the other additional methods too!

Alright, so I’ve got the scripts going now, but both “Add page to DEVONthink” and “Add Web Archive to DEVONthink” appear to do the same thing: save a LINK to the page! I tried using both scripts on the same page, then unplugged my internet connection. In DTP, both entries appear identical and both say “no internet connection” when I try to reload them.

Is something not working correctly, or am I not understanding? I want to suck the whole page in Safari onto my computer so that I’ve got it stored locally in case my connection goes down or the site disappears. It could be saved as a PDF or whatever, I just need the info from a page and I need to know that I’ll always have it. thanks

Hi, Jason. The script to add the WebArchive to my database works on my computer.

Even more convenient is Christian’s suggestion that you drag the 2 bookmarklets (found in the DT Pro download disk image) onto your Safari Bookmarks area. Then just click on Archive while viewing the page and it will be saved as a WebArchive in your database (at the top level).

Hi Bill,
Thanks for the help.
Ok so I’ve tried both the scripts and the bookmarklet, and whether I choose “webarchive” or “page” or “bookmark” it always appears to have the same result: a link to the page out on the internet.

Maybe there’s some setting I have to change in DTP? Or maybe this problem is specific to the type of webpage I’m trying to “grab.” Say, for example, I’m at a page that required me to log in and surf around or enter some queries to get there (e.g., a listing of search results from some database). If I go to the File menu in Safari and do “Save as… Web Archive” I’ll get a nice static local copy of that page saved onto my hard drive. But if, at the same page, I try to add a web archive to DTP, it just results in DTP having a new item that’s only a link to the page. I’m certain of this, because the page in DTP displays as the login screen to the website, asking me to log in again. And if I do log in, it just takes me into the website, where I’d have to do my search/query all over again. So nothing’s actually being saved onto my machine.
I hope I explained that coherently! thanks again,
~jason

Beats me! I’ve successfully saved WebArchives from sites to which I have to log in.

Sounds as though your best bet would be to capture PDFs, which is easy (but you won’t have working hyperlinks, unfortunately). Here’s how:

While viewing a page (or any other printable document) press Command-P. When the Print panel appears, click on the PDF button and select “Save to DEVONthink Pro.scpt”, whereupon you will be asked to choose the group in which the PDF will be stored in your database.

I finally found a site that resulted in the behavior described by Jason, because of security provisions on the site.

That leaves two options for capturing text and images permanently:

[1] Rich note capture of the entire page or a selected portion. Actually, that’s the one I almost always use, as it lets me ignore extraneous material on the page. Selected images will be viewable off-line and hyperlinks work.

[2] Save as PDF. Press Command-P while viewing the page. When the Print panel appears, click on the PDF button and select “Save to DEVONthink Pro.scpt”. This results in a permanent capture of the page with text and images, but without working hyperlinks.

I have exactly the same problem. Whenever I try to archive a website, I can only access the individual pages when online. My understanding was that if I asked for an archive, the whole site and NOT just the current page was archived.

I am getting the current page, i.e. the page I was viewing when I archived, to display offline from DTP, but not any of the pages it links to on the same site.

It is the latter that I want to try and archive in DTP.

Please help. Can I achieve this or not?

Thanks

In addition to my previous comment, Bill mentioned:

[1] Rich note capture of the entire page or a selected portion. Actually, that’s the one I almost always use, as it lets me ignore extraneous material on the page. Selected images will be viewable off-line and hyperlinks work.

I cannot figure out how to capture an entire page as a rich note. Is that in the scripts?

cheers

looks like there are 2 places relevant DTP scripts could be, when in Safari. one place is in the scripts menu (which isn’t enabled globally by default, i don’t think, so you’d have to enable it using AppleScript Utility, AND you have to have DTP install its scripts). the second place is in the bizarre SERVICES submenu, which can be found in the application menu (in this case, the Safari menu). there’s a DTP sub-submenu in there.

saving/archiving an entire website is a much bigger deal than just saving one page. i think DTP might somehow make use of SiteSucker, which is designed to do that, but I’m not sure how. i advise caution in trying to save a whole site; just be conservative in the constraints you set or it’s sometimes easy to end up downloading like a billion files.

Instead of selecting a portion of the page, click on the page background and press Command-A to select all.

In browsers that are compatible with OS X Services (but not Firefox), press Command-) to save the rich text to your database.

Or, in DEVONagent or DT Pro’s browser, Command-click and choose the relevant contextual option to save the rich text to the database.

Don’t confuse the WebArchive Apple file format, which saves both images and HTML code of a single page for offline viewing, with “archiving” a multi-page Web site to the hard drive for offline viewing.
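To make the single-page nature of that format concrete, here is a small Python sketch that summarizes what a .webarchive file contains. It assumes the file is a property list with a “WebMainResource” entry and an optional “WebSubresources” list, each resource carrying “WebResourceURL”, “WebResourceMIMEType” and “WebResourceData” keys. That is my understanding of the files Safari writes, not something confirmed in this thread, so treat the key names as assumptions. The function name is made up for illustration.

```python
import plistlib
import sys


def summarize_webarchive(data: bytes) -> dict:
    """Summarize the main page and the embedded subresources of a web archive."""
    archive = plistlib.loads(data)  # detects binary or XML property lists
    main = archive.get("WebMainResource", {})  # assumed key: the page itself
    subs = archive.get("WebSubresources", [])  # assumed key: images, CSS, etc.
    return {
        "url": main.get("WebResourceURL"),
        "mime_type": main.get("WebResourceMIMEType"),
        "main_bytes": len(main.get("WebResourceData", b"")),
        "subresource_urls": [s.get("WebResourceURL") for s in subs],
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize.py SomePage.webarchive
    with open(sys.argv[1], "rb") as f:
        print(summarize_webarchive(f.read()))
```

If those assumptions hold, the subresources are the pieces of that one page. Pages it merely links to are not stored, which is why links in an archived page still need a connection.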

DT Pro has the Download Manager for capturing multiple pages from a Web site. I would advise that you read the documentation about the Download Manager and carefully examine all of the options that can be set.

It can be tricky to set the options for any given site. Depending on how the site is designed it may be relatively easy to capture the “whole” site, or it may be very difficult to do so without at the same time capturing pages from many other sites as well. That problem exists for all of the utilities such as PageSucker, etc. that are designed to download multiple pages.
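A rough Python sketch of why scoping matters: given one page’s HTML, it keeps only the links that stay on the same host and under a chosen path prefix, and drops everything else. This illustrates the general problem only. It is not how Download Manager or any of the other utilities actually work, and the class name, function name and `path_prefix` parameter are invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_in_scope(html: str, page_url: str, path_prefix: str = "/") -> list:
    """Return absolute links on the same host as page_url and under path_prefix."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    kept = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        if parts.netloc != host:
            continue  # skip links that lead to other sites
        if not (parts.path or "/").startswith(path_prefix):
            continue  # skip other sections of the same site
        clean = absolute.split("#", 1)[0]  # ignore in-page anchors
        if clean not in kept:
            kept.append(clean)
    return kept
```

A downloader that follows every link without a filter like this can wander onto other sites; one that filters too tightly misses pages hosted elsewhere. A depth limit on top of this is the other usual safeguard.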

Do you wish to download images, Word documents, PDFs and multimedia files to your hard drive, or leave them externally linked? That’s one of a number of options to be chosen.

Personally, I’m glad that my information needs rarely if ever require me to download a complete Web site. It can be easy to do that, but more likely it will be a somewhat frustrating experience, as most sites are designed with other purposes in view than making it easy to download the site. :)

Generally my information interests are satisfied with only a tiny fraction of the content of any given Web site, but span many different sites. So I use DEVONagent searches to pull down just the pages that I need.

Once in a while I find it useful to collect together a set of pages into a single linked group. For example, an article in Science Magazine may have links to other articles, some of them in other online publications. Download Manager can’t do that kind of job for me, nor can any other similar utilities. For this I use Acrobat’s web capture feature, which lets me choose which links to follow and include in the collection.

Bill

Thanks, I am now able to capture whole-page rich text to my DP inbox. :)

Some pages do not seem to work. They come in vertically. I am assuming this is due to page design.

I am now off and running with DP setting up folders etc. I wish I had it years ago.

cheers

Is there an easy way to save a page to a web archive from Opera, my browser of choice?

/Anders

I’m not sure about Opera, but I was able to copy the Safari scripts for saving links and archives to Camino, where they work just fine.