Viewing webarchives in DT3

I think this is a bug with DT3 (although it is possible I have not noticed the correct box to tick).

I save all my bookmarks as webarchives to DT. Many of these webarchives open with a “Please accept cookies” banner.

On DTPO this was easily dealt with: after clicking on “Accept” or “Refuse” (it doesn’t matter which, since this is only a webarchive!) the banner disappears, and I can view the page normally. DTTG behaves the same way.

However, in DT3, nothing happens when I try to click on either “Accept” or “Refuse”. This means that I cannot get the cookie banner to disappear, and on sites where it occupies the entire page (e.g. The Atlantic) I cannot view the bookmark at all in DT3.

This happens on all webarchives: clicking the cookie banner does nothing.

We have no control over what webarchives do, but I would suggest adding a Bookmark in DEVONthink and logging in or merely browsing the site in our application. Apps haven’t shared cookies in macOS for a long time.

Consider this URL:

https://psmag.com/ideas/most-controversial-tree-in-the-world-gmo-genetic-engineering

If I capture this page as a webarchive in DTTG, it generates a 3.1 MB webarchive that works perfectly. If I capture it in DT3, however, it generates a 2.6 MB webarchive with a big cookie banner that obscures the entire page and cannot be clicked away.

(I can’t attach screenshots due to being a new user)

So my question is: can I make DT3 behave like DTTG?

I share your pain.

I can’t give you a one-size-fits-all solution, as it depends on how the site implements its HTML and JavaScript. Sometimes the ‘Clutter-free’ checkbox when you capture works very well: it takes the article content only and not the JavaScript. Other times you’ll get either a blank page or no images. With this method you do always get the URL of the original page, though, so when you review the page you can right-click it and choose ‘Launch URL’ to try a different capture method.

I also find that if I capture a webarchive and the text is stuck behind a cookie overlay that I can’t easily delete, right-clicking on it and converting it to a Formatted Note will, most of the time, rip out the JavaScript and reveal the text. Rich text conversion does this too, but it can make the text very big for some reason, and you can lose images. I’m actually beginning to favour Formatted Notes over webarchives as the long-term storage format.

I’d be interested in hearing other people’s opinions on which is best. I guess it depends on what you want from the page. I usually want to save information on particular subjects, so I don’t care about the integrity of the page structure.
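To illustrate why stripping the JavaScript can reveal the text, here is a minimal Python sketch that removes every script element from a page's HTML. It is not how DEVONthink performs its conversion (I don't know what it does internally), and it assumes you already have the page's HTML as a string. The names `ScriptStripper` and `strip_scripts` are made up for this example. It only helps when the overlay is injected by a script; a banner that is hard-coded in the HTML will survive.

```python
from html.parser import HTMLParser


class ScriptStripper(HTMLParser):
    """Re-emits an HTML document while dropping <script> elements (hypothetical helper)."""

    def __init__(self):
        # Keep entities such as &amp; untouched so they can be written back as-is.
        super().__init__(convert_charrefs=False)
        self.parts = []
        self.skip_depth = 0  # > 0 while we are inside a <script> element

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/>; a self-closing script is simply dropped.
        if tag != "script" and self.skip_depth == 0:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "script":
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Script source arrives here too; it is skipped while skip_depth > 0.
        if self.skip_depth == 0:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.parts.append(f"&#{name};")

    def handle_comment(self, data):
        if self.skip_depth == 0:
            self.parts.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.parts.append(f"<!{decl}>")


def strip_scripts(html: str) -> str:
    """Return the HTML with all <script> elements removed."""
    parser = ScriptStripper()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)
```

For example, `strip_scripts('<p>Hi</p><script>alert(1)</script>')` keeps the paragraph and drops the script. Inline event handlers and CSS-only overlays are left alone, which is one reason results vary from site to site.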


Thanks, these are good tips.

In this particular example, DTTG and DT3 behave differently. I presume this means that DT3 uses a different clipping mechanism than DTTG.

Unfortunately, at least in my testing, DT3 is much worse than DTTG at getting rid of cookie banners.

@BLUEFROG is there any way to force DT3 to use the same clipping method as DTTG?


No, there’s no way to force anything and yes, there are some different mechanisms at play here.
We will look into this as time allows. Thanks for your patience and understanding.
