Web Archive Offline Default

Hi all,

I’m evaluating DT Pro and using the importer to save blog posts and articles as Web Archives, which is working great. The issue I’m having is that clicking a web archive in DT loads it from the Internet again, which can be slow. It appears that DT by default visits the original URL and simply downloads the page again. Can I make DT load the ‘offline’ archived version by default instead? If I disconnect my network connection, it uses the ‘offline’ version and it’s quite fast.

Thanks!

This is pretty heavily asked and answered, so make sure to avail yourself of the search function of the Forum.

But many, many, many web sites use dynamically loaded content nowadays. It’s not hardcoded into the underlying HTML. So you can either let the archive function as its designers intended and pull in whatever content it wants, or you can turn off the network connection and possibly be missing images (or even body content!). This is a web design problem (and a result of sites constantly trying to market to people), not a DEVONthink one.

PDFs don’t exhibit this issue.

Hi BLUEFROG,

What I’m interested in is not having the archive make requests out to the Internet. I save them while online to DT and then after that, I would prefer if they simply never connected again and acted as a static offline copy forever. Is that possible?

Thanks!

That’s up to the Web page designer, not you or DEVONthink. You will often encounter issues of active loading when capturing as WebArchive.

Is it possible to filter certain types of requests when loading a web archive? I often encounter requests to download 4 or 5 MB font files when accessing previously-saved Web Archives from within DEVONthink Pro Office…

I realize this may come down to decisions made by the site’s authors/publishers, but from a security point of view, I can easily imagine wanting to disable, say, all JavaScript or any resources being pulled in from third-party domains (e.g. ad networks).

What is DEVONthink doing differently than a service like Instapaper, which seems to have no problem saving lengthy, well-formatted web articles with no need to reconnect to the Internet for dynamic content after the initial download?

The difference is that DEVONthink uses the web archive format (and all its flaws) introduced by Apple/Safari.
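
If you want to see what a saved archive actually contains (and therefore what would still have to come from the network), the file can be inspected outside DEVONthink. As far as I know, a .webarchive file is a binary property list. The Python sketch below is only an illustration under that assumption, not anything DEVONthink itself does: the key names (WebMainResource, WebSubresources, WebResourceURL, WebResourceMIMEType, WebResourceData) are my understanding of Safari's format, and summarize_webarchive is a hypothetical helper name.

```python
import plistlib
from urllib.parse import urlparse


def summarize_webarchive(data: bytes) -> dict:
    """Summarize which resources a .webarchive stores locally.

    Assumes the file is a property list with the Safari-style keys
    named below; any key that is missing is treated as empty.
    """
    archive = plistlib.loads(data)

    main = archive.get("WebMainResource", {})
    main_url = main.get("WebResourceURL", "")
    main_host = urlparse(main_url).hostname or ""

    stored = []
    for res in archive.get("WebSubresources", []):
        url = res.get("WebResourceURL", "")
        host = urlparse(url).hostname or ""
        stored.append({
            "url": url,
            "mime": res.get("WebResourceMIMEType", ""),
            "bytes": len(res.get("WebResourceData", b"")),
            # Naive check: any host other than the page's own host
            # (including its own subdomains) counts as third-party.
            "third_party": host != main_host,
        })

    return {"main_url": main_url, "subresources": stored}
```

To try it, read a saved archive in binary mode (for example open("article.webarchive", "rb").read(), where the file name is a placeholder) and pass the bytes to summarize_webarchive. Anything the page needs that does not appear in the resulting list was never stored in the archive, so it can only be loaded from the Internet when the archive is opened.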