Web archive option is powerful!

Nor fonts, apparently. Well, the web is simply not meant for archiving.

Actually, for archival purposes, PDFs are more permanent and likely more often used.

I did get your point, already from your first reply.

But it seems I was not able to get my point across in an understandable way.

Let me try again.

The purpose of a webarchive is to be an archive of a website!

That means, whatever I get when I download such an archive is a perfectly valid archive of the website!

Even if this is only one line that would then load and create for example a whole forum, or whatever.

I’m fine with this single line - as this is the web archive!

What I am not fine with is when I download such an archive, open it to check what I got, and THINK that it contains everything I need, while in fact, as you explained, it only downloaded some code that simply will not work without an internet connection and the original site still in place and working.

That means I currently NEED to disable the internet connection to check what I got within the web archive and how well it works for me without the original site.

I don’t want to be forced to turn my internet on and off 100 or 1000 times a day!

When I download lots and lots of web archives (which I often do), I would like to see directly how well the web archive actually works without internet and without the original site, but of course without being forced to constantly toggle my internet connection.

So what you explained twice is actually a valid example of why the feature I was asking for is sorely needed!
BECAUSE otherwise you have no idea what your download got and didn’t get.
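
One way to get a rough answer without toggling the connection is to inspect the archive file itself. A .webarchive file is a binary property list: the page's HTML is stored under `WebMainResource` and the saved files under `WebSubresources`. The following is only a minimal Python sketch, not something DT or DTTG does. The function names are made up for illustration, and it only looks at absolute `src` and `<link href>` URLs, so anything loaded by scripts at runtime, from CSS, or via relative URLs goes undetected.

```python
import plistlib
import re

# Absolute URLs in src="..." attributes and in <link ... href="..."> tags.
SRC_ATTR = re.compile(r'\bsrc\s*=\s*["\'](https?://[^"\']+)["\']', re.IGNORECASE)
LINK_HREF = re.compile(
    r'<link\b[^>]*?\bhref\s*=\s*["\'](https?://[^"\']+)["\']', re.IGNORECASE
)


def archived_urls(archive):
    """Collect the URL of every resource stored in the archive, including subframes."""
    urls = set()
    main = archive.get("WebMainResource", {})
    if "WebResourceURL" in main:
        urls.add(main["WebResourceURL"])
    for res in archive.get("WebSubresources", []):
        if "WebResourceURL" in res:
            urls.add(res["WebResourceURL"])
    for sub in archive.get("WebSubframeArchives", []):
        urls |= archived_urls(sub)
    return urls


def missing_urls(archive):
    """Return URLs the main HTML references that are NOT stored in the archive."""
    main = archive.get("WebMainResource", {})
    data = main.get("WebResourceData", b"")
    encoding = main.get("WebResourceTextEncodingName") or "utf-8"
    try:
        html = data.decode(encoding, errors="replace")
    except LookupError:  # unknown encoding name: fall back to UTF-8
        html = data.decode("utf-8", errors="replace")
    referenced = set(SRC_ATTR.findall(html)) | set(LINK_HREF.findall(html))
    return referenced - archived_urls(archive)


def check_file(path):
    """Load a .webarchive file from disk and report what it still needs from the net."""
    with open(path, "rb") as fh:
        return missing_urls(plistlib.load(fh))
```

An empty result does not prove the archive works offline. A non-empty one is a strong hint that it will not.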

And now to MY expectations:

Most of the time, my web archives work fine. The dynamic parts you mention seem to be very rare in the type of web pages I want to archive. Even Wikipedia pages get archived quite well, incl. images, when shared to DTTG!

And I want web archives for exactly that: having an archive of a web page, as it is! But including all HTML and other files, in their original state! This cannot be done with a PDF or any other format; only web archives allow for this.

But again, currently users can only check the quality of their web archive download by toggling the internet connection. This is very inconvenient and has several bad implications and problems.

BTW, what do YOU use web archives for, when you mostly have such dynamic web pages that simply don’t work as web archives as soon as the original web page has changed or is gone? For such web pages, web archives are pretty useless and you would be better off using PDFs! But for my usage, most of the web pages I want to archive work perfectly fine without the original web site, which is the reason why I download web archives from them.

So, again: A toggle to disable any internet connection for opening web archives in DT or DTTG would be much appreciated and would increase the usability of both apps immensely!

And I think it is easily possible to add this - and nobody forces people to use this toggle.

Is it clearer now?

No.

I create and download PDFs from most of the web pages I want to archive.

But for many, I want a web archive too!

And as I tried to explain above, I want to archive the web pages! With whatever files get downloaded in the archive … including all comments in the scripts and HTML code and whatever else.

That cannot be done with a PDF, so a PDF cannot replace a web archive.
But of course, this does not make the PDF worthless.

The FireShot extension for Firefox does a pretty good job, actually. If it’s just doing a series of screenshots, maybe it’s not the most elegant solution from a technical standpoint. I’m sure it has limitations, but it has worked fine for me. I then store FireShot’s PDF output in DTP, right next to the DTP-produced web archive, which I see now will be dynamic rather than static. So I have both.

For such usages, I normally use PDFs.
And I have no web pages that would require such screenshot techniques.

I just want such stuff:

A real archive of a web page! :slight_smile:

Recommend you use the command “wget”. Lots of examples found by internet search.
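
For anyone who wants a starting point, here is a minimal sketch of a single-page download with wget, wrapped in Python so it can be scripted. The function names and the example URL are made up for illustration. The flags are standard GNU wget options, but whether this particular combination suits a given site is an assumption you should check for yourself.

```python
import shutil
import subprocess


def build_wget_command(url, dest_dir):
    """Build a wget call that saves one page plus the files it needs."""
    return [
        "wget",
        "--page-requisites",   # also fetch images, CSS and scripts the page uses
        "--convert-links",     # rewrite links so the local copy points at local files
        "--adjust-extension",  # add .html/.css extensions where they are missing
        "--span-hosts",        # allow requisites hosted on other domains (CDNs)
        "--no-parent",         # never climb above the starting directory
        f"--directory-prefix={dest_dir}",
        url,
    ]


def save_page(url, dest_dir):
    """Run wget and return its exit code (non-zero if some files failed)."""
    if shutil.which("wget") is None:
        raise RuntimeError("wget is not installed or not on PATH")
    return subprocess.run(build_wget_command(url, dest_dir)).returncode


if __name__ == "__main__":
    # Hypothetical example URL and output folder.
    print(save_page("https://example.com/", "saved_pages"))
```

The result is a folder of ordinary files, so like a web archive it still cannot capture content that scripts fetch at runtime.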

How boring – that only gives you all the files which any browser on any platform can display. Whereas web archive is a nice, deprecated, binary container that only Safari can display.

Yes. The toggle would have to turn the IP connection off before opening the archive and then on again afterward. I wouldn’t hold my breath for that to happen.

Besides, quoting from

In the past it [web archive] has been well-supported by the macOS API, but currently all those existing calls to work with Webarchive files are marked as being deprecated by Apple, so are likely to be removed whenever it wishes, making it impossible to use those deprecated calls in future versions of macOS.

It seems that WKWebView now has a method to create web archives, but not to open them.

I’m a Unix guy, so I do lots of stuff with wget.
But I also run into lots of problems with this …
The main reason not to use it is that I mostly work on an iPad and like to send stuff to DTTG while in the web browser.

wget gets used for other cases, and when I’m at the Mac.

I don’t care, web archives give me everything I need :wink:

Also, I extract them for my usage.

See here: Web archive option is powerful! - #26 by tja
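
For readers who have not tried extracting one: this is not necessarily the method from the linked post, just a minimal Python sketch. It assumes the usual .webarchive layout, a binary property list with `WebMainResource`, `WebSubresources` and `WebSubframeArchives`. The function names and the numbered file naming scheme are made up for illustration.

```python
import plistlib
from pathlib import Path
from urllib.parse import urlparse


def iter_resources(archive):
    """Yield every stored resource dict: main page, subresources, then subframes."""
    if "WebMainResource" in archive:
        yield archive["WebMainResource"]
    yield from archive.get("WebSubresources", [])
    for sub in archive.get("WebSubframeArchives", []):
        yield from iter_resources(sub)


def safe_name(index, url):
    """Derive a flat file name from the URL; the index prefix avoids collisions."""
    name = Path(urlparse(url).path).name or "index"
    return f"{index:03d}_{name}"


def extract(archive, out_dir):
    """Write each resource's raw bytes into out_dir and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for index, res in enumerate(iter_resources(archive)):
        target = out / safe_name(index, res.get("WebResourceURL", ""))
        target.write_bytes(res.get("WebResourceData", b""))
        written.append(target)
    return written


def extract_file(path, out_dir):
    """Load a .webarchive file from disk and extract its contents."""
    with open(path, "rb") as fh:
        return extract(plistlib.load(fh), out_dir)
```

The files come out byte for byte as they were saved, including comments in the scripts and HTML. The links inside the HTML are not rewritten, so the extracted page will not necessarily render from the folder.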

I can open and extract them fine, thanks.

I did not talk about a DT toggle to turn off and on the internet connections for DT!

Only for opening web archives!

Like opening an HTML file with a text editor: that cannot access the internet either.

I suppose that DT relies on Apple’s framework (aka WebKit) to open web archives. The alternative would be to extract all the components and then “somehow” display them while completely avoiding WebKit. So, they’d have to write their own HTML/CSS/JS engines that work only with local files.
Or how would you go about that?

Well, yes. But then you only see the HTML, not a rendered web page. Apples, oranges. Seeing the source of a web archive is probably not a very pleasant experience.

Yes of course, if such a toggle is technically not possible because of the components used, this way to enhance the creation and usage of web archives would sadly be out …

Maybe @BLUEFROG could comment :hugs:

And in conclusion… a simple answer to a simple question is that the concept of a web archive is an oxymoron. The term “archive” in this context applies to data preservation for some possible future need (in this case, in the event that a web page disappears). So the term “web archive” should be retired, as it only has value for static web pages, which are in the minority these days.

The only reasonably useful web archive is, as pointed out, a PDF version of the page. Easily generated, at least in Safari, using “File… Export as PDF”.

I’ve given up on web archives. I only want an archive for something that’s important enough to need guaranteed access in the future, and ironically a web archive is about as useful as an ashtray on a motorbike for that purpose.

The term (and the format) isn’t ours, it’s Apple’s.

That’s understood, I wasn’t aiming at DT, but there’s a lot of confusion, and probably a lot of false security around the idea of saving a web archive. (The pitfalls are pointed out in the DT documentation. Unfortunately many (most?) people don’t read documentation.) My point is that web archives are all but useless these days, and the name is misleading.

No problem.

I’d say: it depends. There is a huge ocean of content out there, much of which still works fine with web archives, so it depends on what sites you’re dealing with. I think it should be handled on a site-by-site basis unless someone just wants to opt for a single format for everything (and no one format handles everything). In that case, I’d opt for PDF.

Agreed. The problem is, if you’re not technical, but just using the technology as a “civilian” for want of a better term, you can’t tell which is which. At least with a PDF you get what you’re expecting, and if you’re lucky the web archive might be fine. But there’s no guarantee of that. And even if you are technically aware, it takes extra effort to figure out if your web archive is self-contained.

You see the world very much from your own personal perspective.

For me, WebArchives are the single most important point in DT and the best way to archive web pages!

And most of the time they work perfectly well for ME and the web pages I want to archive.

So please don’t push your personal ideas.