Capturing Medium articles to WebArchive

I want to capture some Medium articles to DT so that I can reference them later. I am a Medium member, and some of these articles are for members only.

When I initially capture them to WebArchive, all is good and they can be opened. However, when I open them again some time later, they appear to load just fine at first, and then the page switches to a 500 error message: “Apologies, but something went wrong on our end.” My Medium account info is still shown at the top right, but the content isn’t there anymore.

I am not enabling “clutter free” when I save them (the format isn’t as nice).

I would have expected the complete page to be rendered from the WebArchive itself, but it appears it is still going out to Medium to retrieve info. My guess is that I’m now logged into Medium with a different session than the one that was active when I captured the article.

Ideas / solutions?

I don’t use Medium but I can imagine that they embed JavaScript in their pages to check credentials. That will probably execute when you open the web archive.
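One way to check this guess is to look inside the archive. A `.webarchive` file is a property list whose `WebMainResource` and `WebSubresources` entries hold the saved HTML and its assets, so Python's `plistlib` can read it. The sketch below is only an illustration: the helper names (`load_webarchive`, `script_resources`, `inline_script_count`) are made up for this example, and it assumes the main HTML is stored in an ASCII-compatible encoding such as UTF-8.

```python
import plistlib
from pathlib import Path


def load_webarchive(path):
    """Parse a .webarchive file (a property list) into a dict."""
    with Path(path).open("rb") as fh:
        return plistlib.load(fh)


def script_resources(archive):
    """Return the URLs of saved subresources whose MIME type looks like JavaScript."""
    urls = []
    for res in archive.get("WebSubresources", []):
        mime = res.get("WebResourceMIMEType", "").lower()
        if "javascript" in mime or "ecmascript" in mime:
            urls.append(res.get("WebResourceURL", ""))
    return urls


def inline_script_count(archive):
    """Count <script> opening tags in the main HTML (assumes an ASCII-compatible encoding)."""
    html = archive.get("WebMainResource", {}).get("WebResourceData", b"")
    return html.lower().count(b"<script")
```

If either function reports scripts, the archive still contains code that can run, and possibly contact Medium, when it is opened. Usage would be along the lines of `script_resources(load_webarchive("article.webarchive"))`, where the file name is a placeholder.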

Unfortunately, Medium is one of a growing number of websites that make it difficult to save content for posterity. Your best bet is to treat the WebArchive as an intermediate step, and use DEVONthink’s conversion tools to make a copy in a different format. You may want to clean up the WebArchive first to remove things like header navigation, sidebars and footers. It’s surprisingly easy: literally select and delete.
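For anyone who prefers to script the cleanup outside DEVONthink, here is a rough sketch of the same idea: write a copy of the archive with its scripts removed, so that nothing can run when the copy is opened. This is not a DEVONthink feature. The function names are hypothetical, the regular expression is a crude stand-in for a real HTML parser, and removing scripts may also remove content the page builds on the client side (embedded code snippets, for example), so preview the result before relying on it.

```python
import plistlib
import re

# Crude pattern for <script>...</script> elements; good enough for a sketch only.
SCRIPT_RE = re.compile(rb"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def strip_scripts(archive):
    """Return a copy of a parsed web archive without script elements or JS subresources."""
    cleaned = dict(archive)
    main = dict(cleaned.get("WebMainResource", {}))
    main["WebResourceData"] = SCRIPT_RE.sub(b"", main.get("WebResourceData", b""))
    cleaned["WebMainResource"] = main
    cleaned["WebSubresources"] = [
        res
        for res in archive.get("WebSubresources", [])
        if "javascript" not in res.get("WebResourceMIMEType", "").lower()
    ]
    return cleaned


def write_static_copy(src, dst):
    """Read the archive at src and write a script-free copy to dst."""
    with open(src, "rb") as fh:
        archive = plistlib.load(fh)
    with open(dst, "wb") as fh:
        plistlib.dump(strip_scripts(archive), fh, fmt=plistlib.FMT_BINARY)
```

The original file is left untouched, so the copy can be compared against it and discarded if too much of the article disappears.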

In addition: Perhaps the Medium people provide a useful CSS version for printing, so that using “print to PDF” directly from their site might be an option.

That is another option, though I’d highly recommend previewing the output first to see if anything gets missed or mangled in the process.

I’m with you on this - and Medium has been a particular sore spot.

My approach: capture as a web archive, do a bit of cleanup in DEVONthink if needed, then use Data > Convert > To PDF (One Page).

I’ve been having some major problems capturing Medium articles with embedded GitHub snippets. I wanted to try this approach and it seemed to work, but then it didn’t.

So first I captured this article as a web archive. When I opened it in DT, it took a noticeable amount of time to load (mostly the code snippets), but it finally did. Then I deleted some unneeded stuff (mostly the footer with suggested articles) and converted it to PDF. When I opened the PDF in DT, there were two problems:

  1. All the edits were gone from the PDF: whatever I had deleted in the web archive was there again.
  2. Most of the code snippets were missing from the PDF, although they were all present in the web archive version of the article.

Do you have any pointers on how to properly save such articles?
Preferably with my edits, but most importantly with code snippets.

A screenshot of the settings you used would be useful. In the worst case, exporting a PDF (single page) or printing a PDF (paginated) from Safari to DEVONthink should almost always work. So should saving a webarchive to DEVONthink’s inbox folder.