DT and Freedom

I use Freedom to work offline without distractions, and assumed DT would work seamlessly. But it doesn’t. I save docs in a variety of formats, usually web archives, PDFs, and rich text. Before, I’d save HTML. I’d expect stuff I’ve saved as HTML not to work offline, but how come some web archives load and some don’t?

The content of a web archive depends on the site you clipped. If the web archive contains dynamic content that is refreshed from a remote server, then it might not load correctly or completely. The quality of a web archive depends on that site’s developer and the architecture and software they used – and whether they understand what they’re doing, frankly. It is difficult (maybe impossible) to predict how a web archive will perform when your machine is disconnected from the internet, and none of this has anything to do with DEVONthink.

If you truly want an offline image of a site, then clip as PDFs. Though, even there you could run into problems – but usually fewer.

OK, that’s frustrating, but thanks.

I don’t get korm’s answer.

Isn’t the archive supposed to be a “snapshot?”

While it is true that many sites are dynamic these days (I run several such sites) - my idea of a web archive is that you take a snapshot of the served content at a particular point in time.

The term “archive” implies that you’re not trying to capture dynamic data, but just archive one (or more) points in time for future reference.

Regardless of whether content is static or dynamic, at the time that the user connects, they receive a fixed set of HTML, JavaScript, images, etc. that the browser translates into the final display.

Why wouldn’t an archive feature be designed to capture that set so it can be re-displayed?

Capturing live/dynamic links or content is not in keeping with the term “archive.”

It is a limitation of the format itself. (PS: This is Apple’s format, not ours.)

Sure, it’s a “snapshot,” but the snapshot contains code that could be activated by clicking on it, or in some cases just by loading the archive to view it. If that code expects to grab something from a remote server (say, advertising content) and the remote server is unreachable because your machine is offline, then sometimes the web archive stops displaying correctly.
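If you want a rough idea of whether a saved page is likely to reach out to a remote server, here is a minimal Python sketch that lists the absolute http(s) URLs a page's HTML would load on display. It is only an illustration, not anything DEVONthink does: the names `RemoteRefFinder` and `find_remote_refs` are made up for this example, it looks only at `script`, `img`, `iframe`, and `link` tags, and it cannot see requests that JavaScript makes at run time.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Tags that make the browser fetch something when the page is displayed,
# mapped to the attribute that holds the URL. (Illustrative, not exhaustive.)
LOADING_TAGS = {"script": "src", "img": "src", "iframe": "src", "link": "href"}


class RemoteRefFinder(HTMLParser):
    """Collect absolute http(s) URLs that the page would load on display."""

    def __init__(self):
        super().__init__()
        self.remote_refs = []

    def handle_starttag(self, tag, attrs):
        wanted = LOADING_TAGS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            # Relative paths have no scheme, so they are skipped here.
            if name == wanted and value and urlparse(value).scheme in ("http", "https"):
                self.remote_refs.append((tag, value))


def find_remote_refs(html_text):
    """Return a list of (tag, url) pairs for remotely loaded resources."""
    finder = RemoteRefFinder()
    finder.feed(html_text)
    return finder.remote_refs
```

Feed it the HTML text of a saved page. Anything it reports is a resource the page asks for by absolute URL; if the archive did not store a copy, that request will fail while Freedom is blocking traffic.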

The OP asked why sometimes web archives do not work as expected when the OP’s machine is offline (i.e., when Freedom is blocking traffic). My answer addresses some possible situations – all of which are outside the control of the archive, the OP’s machine, and DEVONthink. I don’t mean to imply that all web archives fail, but some will fail.

Sure, if components are reaching outside to load content that’s true.

But even in that case, there are different kinds of failures - failures to display an image or UI element properly, or a total failure.

I’d argue that the first kind of failure is acceptable, the second is not.

Since the OP’s post wasn’t clear about what exactly “failure to load” meant, it is hard to tell which they were encountering.

I just know that in my case, sometimes when I’ve “archived” a relatively static site and later come back to it, it was trying to re-load images, etc. remotely rather than behaving like a true archive. However, I haven’t tested this recently, so YMMV.
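For anyone who wants to check what a given archive actually stored, here is a rough Python sketch. It assumes the .webarchive file is a property list whose top-level dictionary has a `WebMainResource` entry and a `WebSubresources` list, each resource carrying a `WebResourceURL` key. That layout is my understanding of Apple's format, not something confirmed in this thread, and the function name `archived_urls` is made up for the example.

```python
import plistlib


def archived_urls(webarchive_bytes):
    """Return the URLs of resources stored inside a .webarchive.

    Assumes the property-list layout described above; missing keys are
    tolerated and simply yield fewer results.
    """
    archive = plistlib.loads(webarchive_bytes)
    urls = []

    main = archive.get("WebMainResource", {})
    if main.get("WebResourceURL"):
        urls.append(main["WebResourceURL"])

    # Subresources are the images, scripts, stylesheets, etc. saved with the page.
    for sub in archive.get("WebSubresources", []):
        url = sub.get("WebResourceURL")
        if url:
            urls.append(url)

    return urls
```

Read the file in binary mode (for example `open(path, "rb").read()`) and pass the bytes in. An image or script URL that the page references but that is missing from the returned list would have to be fetched remotely when the archive is displayed.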

QED

Did you “prove” something?