Web archive vs HTML

kafene · June 3, 2011, 5:26pm

When I’ve tried both, they seem to view the same in DT.

Is Web archive for downloading the page as an archive I can view offline later?

What confuses me is that HTML and Web archive both seem to show a progress bar running at the top, as if the page were loading each time.

And if HTML is basically bookmarking the page, what’s the point of the Bookmark feature?

Confused here. Can someone shed some light on this for me?

Thank you.

Bill_DeVille · June 3, 2011, 7:54pm

A bookmark only holds the page’s URL and name. It has no other content.

An HTML capture of a page contains the text and hyperlinks of the page, but doesn’t download the images, which can only be viewed if online.

A WebArchive capture of a page contains the text, links and images of the page, and the images can be viewed when the computer is not online. But if the computer is online when a WebArchive is displayed, it will still go to the source page, especially to download dynamic graphics such as many ads.
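To make the difference concrete, here is a rough conceptual sketch in Python of what each capture type keeps. It is only an illustration of the description above, not how DEVONthink actually stores documents; the `Capture` class and the `capture` and `viewable_offline` functions are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Capture:
    """Hypothetical model of a captured page (not DEVONthink's real format)."""
    kind: str                  # "bookmark", "html", or "webarchive"
    url: str
    name: str
    text: str = ""             # page text and hyperlinks, if stored
    images: dict = field(default_factory=dict)  # image URL -> bytes, if stored


def capture(kind, url, name, page_text, page_images):
    """Keep only the parts of the page that the given capture type stores."""
    if kind == "bookmark":
        # URL and name only, no other content.
        return Capture(kind, url, name)
    if kind == "html":
        # Text and hyperlinks, but no image data.
        return Capture(kind, url, name, text=page_text)
    if kind == "webarchive":
        # Text, links, and a copy of the images.
        return Capture(kind, url, name, text=page_text, images=dict(page_images))
    raise ValueError(f"unknown capture kind: {kind}")


def viewable_offline(c):
    """Report which parts of the capture can be shown without a connection."""
    return {"text": bool(c.text), "images": bool(c.images)}
```

Calling `viewable_offline` on each of the three kinds shows the pattern: a bookmark has nothing to display offline, an HTML capture has text only, and a WebArchive has both text and images.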

If you use the Safari or DEVONagent browser, or the browser built into DEVONthink, and your primary interest is the information content of Web pages rather than their layout, you might consider using one of two Services that capture selected content from a Web page. To do this, select the portion of the HTML page that contains the important material and press the keyboard shortcut ‘Command-)’ to capture it as rich text, or ‘Command-%’ to capture it as a WebArchive.

The advantages of capturing only a selected area of a Web page are twofold. Because extraneous page content such as ads is not captured, the focus of searching and See Also on the document in your database is improved. And the file size for storing the captured content is significantly smaller, especially for rich text captures.

Revearti · July 28, 2014, 11:43am

Thank you for explaining this. I was wondering why there was a progress bar for web archives in DT.

Revearti · February 15, 2015, 6:02am

If you visit a web archive, reload it, and the page no longer exists, will it overwrite the web archive you had with an empty page?