How do I archive a web page?

If the Sorter is active, then the tab should be displayed (see below). If the Sorter is inactive (set to hidden in DTP’s Preferences>Sorter), then neither the Sorter’s tab nor its drop boxes should appear when you mouse over the edge of the screen. If you have the Sorter active and your tab is hidden, then you have uncovered something that none of the beta testers encountered.

I see nowhere in preferences/sorter where I can set the sorter to active or inactive. The only setting I have there is to set the hot key and to select or unselect “show at login”.

Interestingly, when I select ‘show at login’, that selection does not persist. If I click on another preference window and come back to sorter, it is again unchecked.

I figured out why I couldn’t see the sorter tab, though. It got moved to Space 2, even though in Spaces preferences I have DT Pro set to be exclusively in Space 1. Any tips on how to get the Sorter tab back displaying in Space 1?

Here is what the preference pane for the Sorter should look like. Clicking on the “Hide” button toggles the state of the Sorter, and the button will change to “Show”.

Selecting the “Show at login” option adds the Sorter to the login items for your user account. Could this be a permissions issue on your Mac?

I don’t use Spaces myself, but you could try going to System Preferences and setting the option to force the Sorter to open in Space 1, or turning off Spaces temporarily, which should force the Sorter back to Space 1.

The Sorter is set up to show on all Spaces. If that doesn’t happen, there is something wrong with your system or you may be running some tool that does funny things to windows. Run all the usual cleanup routines that we recommend and see if that makes a difference.

Ditto for the Login Item setting: it goes through the Apple-endorsed OS X functionality, so it should eventually be saved into your system preferences. If it doesn’t stick, I would check the permissions on ~/Library/Preferences and its files. And can you add or remove items in the System Preferences > Accounts’ “Login Items” tab?
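For anyone who wants to check those permissions without clicking through Finder, here is a minimal sketch in Python. It only assumes that the preference files are the `.plist` files sitting directly in ~/Library/Preferences; the function name `find_unwritable` is made up for illustration and is not part of DEVONthink or OS X.

```python
import os
from pathlib import Path


def find_unwritable(prefs_dir):
    """Return the paths (folder first, then its .plist files) that the
    current user cannot both read and write.

    Hypothetical helper for illustration only.
    """
    prefs_dir = Path(prefs_dir)
    problems = []
    # The folder itself must be readable, writable and searchable,
    # otherwise no application can save a preference file into it.
    if not os.access(prefs_dir, os.R_OK | os.W_OK | os.X_OK):
        problems.append(prefs_dir)
    # Each preference file must be readable and writable.
    for entry in sorted(prefs_dir.glob("*.plist")):
        if not os.access(entry, os.R_OK | os.W_OK):
            problems.append(entry)
    return problems


if __name__ == "__main__":
    for path in find_unwritable(Path.home() / "Library" / "Preferences"):
        print("not readable/writable:", path)
```

If this prints nothing, the permissions are probably not the cause and the usual cleanup routines are the next thing to try.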

Is there a way to save a web page (or portion of it) including the layout? It’s easy to archive text, but things like background, colors et al. go missing.

If you are using a Cocoa-based and Services-aware Web browser, such as Safari, DEVONagent or the browser built into the DEVONthink applications, you have a number of options for capturing content to your database.

  1. Services: Click on the application name in the Menu bar and select Services > DEVONthink Pro. You will see 6 options there, 5 of which have keyboard shortcuts. Basically, you can capture all or a portion of the page as plain text, rich text, or WebArchive. NOTE 1: The Services option to capture as WebArchive is only in DEVONthink 2.x applications. NOTE 2: Yes, if only a portion of the page has been selected, the resulting capture as WebArchive will include only the selected portion.

Rich text roughly approximates the layout of most pages, including background color, images and working hyperlinks.

A WebArchive capture faithfully represents the layout of the Web page.

But these Services are NOT available in Firefox, which is unaware of your Mac’s Services.

  2. Bookmarklets: Included in the Extras folder of your application download disk image are bookmarklets that enable capture to your database of either a bookmark URL or a WebArchive of the viewed Web page. There are instructions for installation of the bookmarklets in various browsers.

  3. Scripts (DT Pro and DT Pro Office): Make certain that the Global Scripts menu is activated on your computer (highly recommended). Launch Applications > AppleScript > AppleScript Utility. In AppleScript Utility, check the options “Show Scripts in menu bar” and “Show Computer scripts”.

Now, when a Web browser is frontmost, the global Scripts menu (stylized scroll symbol in the menu bar) will display the scripts available for that browser. Even in Firefox a web document (WebArchive) can be captured, but the process is slower, as the page has to be re-downloaded into WebKit and then captured as a WebArchive. The option to capture only a portion of the page isn’t available for Firefox.

  4. Contextual menu options (DT Pro/Office browser and DEVONagent): Contextual menu options enable captures as rich text, HTML page, WebArchive or (in the built-in browser) as PDF.

  5. Print as PDF: A Web page can be “printed” from any browser as a PDF, approximating (for the page width) the layout of the page.

Another FAQ candidate response. :slight_smile:

Thanks a lot, Bill. I’m using Safari, but I’m also still on version 1.5.4 of DEVONthink Pro. Tried the bookmarklet thing, but I don’t really see the difference from a bookmark. If I click it in the database, it still loads the page from the net every time.
I’ll be updating to Pro Office 2.0 soon, so I guess I’ll check the web archive function then.

Bill — Are the scripts for Firefox working with DTP 2 and FF3? I’m having a continuing error when I try to use it: “No browser is open/in front”.

I had this very same question: If I’ve downloaded a webarchive into DT, then when I load it into the entry window it looks like it’s downloading the data from the internet. Is that correct? What if I just want it to use the local copy?

You can erase the URL in the database and still use the content you archived, but it looks like, well, :confused:

So the difference between a bookmark and the web archive seems to be that the archive functionality indexes the text, while with a bookmark just the URL is captured and the text isn’t searchable in the database.
Still not really what I’m looking for, though. Maybe in 2.0, we’ll see.

@ valente: Yes, I tested the script (global Scripts menu > Add web document to DEVONthink) to capture a WebArchive in DT Pro Office 2.0 pb1 with Firefox 3.0.5, under OS 10.5.6. The result was a WebArchive captured into the Inbox of the frontmost open DTPO2 database.

But this is painfully slow, compared to a similar WebArchive capture under Safari, DEVONagent or the built-in DEVONthink browser. Because Firefox lacks an AppleScript dictionary (so it isn’t scriptable) and is totally ignorant of OS X Services, the script must capture the page URL, transfer it to the WebKit browser, re-download the Web page, then convert the page to a WebArchive in the database.

Which is why I wouldn’t use Firefox if I planned to do a significant number of captures into a DEVONthink database.

@ flex: A bookmark document simply holds the URL of a Web page, and has no content — it doesn’t contain any text or images. When you select a bookmark document (and are online), the result is that DEVONthink’s browser goes out to the Internet and downloads the page. But if you are offline, you can’t view that page, of course.

A Web document (WebArchive) stores the downloaded content of a Web page in your database: text, images, links and layout. The content is fully searchable. When you select such a document, whether you are online or offline, you will see the complete content, images and all. Even if the original page on the Web is subsequently removed, you still have it in your database.
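If you want to convince yourself that the images really are inside the archive, here is a rough sketch in Python. It rests on an assumption that is not stated anywhere in this thread: that a `.webarchive` file is a property list with a `WebMainResource` entry and a `WebSubresources` list, which is how Safari-style archives are commonly laid out. The function name `summarize_webarchive` and the file name `page.webarchive` are made up for illustration.

```python
import plistlib


def summarize_webarchive(data: bytes):
    """Summarize the resources stored inside a WebArchive.

    Assumes the Safari-style layout: a property list with a
    'WebMainResource' dict and a 'WebSubresources' list (an assumption,
    not something confirmed in this thread).
    """
    archive = plistlib.loads(data)  # auto-detects binary or XML plists
    main = archive.get("WebMainResource", {})
    subs = archive.get("WebSubresources", [])
    return {
        "url": main.get("WebResourceURL"),
        "mime_type": main.get("WebResourceMIMEType"),
        "main_bytes": len(main.get("WebResourceData", b"")),
        # One (URL, MIME type) pair per embedded image, stylesheet, etc.
        "subresources": [
            (r.get("WebResourceURL"), r.get("WebResourceMIMEType"))
            for r in subs
        ],
    }


# Usage sketch with a hypothetical exported file:
#   with open("page.webarchive", "rb") as f:
#       print(summarize_webarchive(f.read()))
```

A bookmark has nothing comparable to inspect, since it holds only the URL.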

It’s also possible to capture a page as HTML (from a script under Safari, or from contextual menu options in the built-in browser or DEVONagent). An HTML file in your database captures the text content and is searchable, but images can only be viewed when online, as they are not captured within the file.

And it’s possible to capture selected text/images as rich text, saved to the database, in a Services-aware browser (but not Firefox). This is my usual capture mode.

Thanks for the reply, Bill. Alas, the script is not working for me with DTP2 and FF3. (Wondering why, since I’m using 10.5.6 and FF 3.0.5 as well.)

Anyway, I’m now mostly using the bookmarklets (although “save text” is not working properly and “save selection” wraps all the paragraphs together; webarchive and bookmark work flawlessly) or drag and drop a bookmark into DTP2 and then capture the content from within DTP2 at my will. For permanent and bigger texts I use the PDF service.

I could consider going back to Safari again, but I really like FF3 and some of its add-ons.

I’m using Safari and DTP2. In the services menu under DTP, all options are grayed out unless I select some text on the web page. I want the whole web page though, so shouldn’t I be able to select “capture web archive” without selecting any text?

To capture the entire page as a Web archive, you can run the “Add Web Document to DEVONthink” script, which is installed if you installed the add-on scripts for DTP. Alternatively, you could install the bookmarklets that are in the “Extras” folder on the DTP disk image. You can also select the entire page with Command-A and then use the Services menu to capture a Rich note, but that is not as clean a capture as the first two methods.

AFAIK services require a selection. At least all the ones I’ve used have.

Related thread:

Helter Skelter Web Page Imports in DT2

This question, and “where are my Sorter captures disappearing?”, are begging for answers in one-stop FAQs!

No. Most do (filters), but some just need a cursor position (for instance, to insert the current date).

For those who like the idea of the Sorter, but only sometimes, observe that if you Ctrl-click (right click) the DT icon in the Dock, there’s an option to show and hide the Sorter. Very handy for popping it out for a particular filing job, and then hiding it away.

Personally, I wouldn’t mind an option for changing the Sorter tab font. I could live with the tab at 9-point, but the default font is too intrusive.

I’ll believe you even without finding any non-selection-based services installed on my system to confirm it. :slight_smile: