Clipping/importing web pages

mjannesh · September 12, 2023, 4:30pm

I want to import web pages into DT, typically pages from a web magazine. I normally use the DT clipper with Chrome as the browser.
The import seldom gives a good result. For example, the ‘Accept Cookies’ window is imported too, although I dismissed it before the import. At other times, one or two images on the page are imported but no text. I have tried all available formats.

In another post, I saw this problem mentioned, and the suggested solution was to install an add-on for printing to DEVONthink. The add-on should be located under Install Add-Ons, but I cannot find it. Should I look for it elsewhere?

TIA!

BLUEFROG · September 12, 2023, 4:55pm

Welcome @mjannesh

As has been discussed many times on the forums, web clipping is not a 100% bulletproof process, and not only for our clipper.

If you have installed DEVONthink 3 > Install Add-Ons > PDF Services, you can select Save PDF to DEVONthink 3 in a system print dialog.

And in Chrome you’ll have to choose the option to use the system print dialog. Safari is a much better browser for this kind of interaction.

mjannesh · September 12, 2023, 5:22pm

Thank you - it definitely improved the clippings quite a bit!

BLUEFROG · September 12, 2023, 5:44pm

Glad to hear it!

FrankT · September 12, 2023, 7:20pm

I have tried many methods to convert web pages to PDF documents.

What always gives me perfect results is importing the web page into DT manually with drag and drop and then converting it to a PDF in DT.

The PDF looks identical to the web page.

I don’t understand why the clipper doesn’t deliver the same quality, maybe something is different “in the background”. Or is there a specific explanation?

For comparison, import from Safari with “Save PDF to DT”:

With drag and drop from Chrome:

BLUEFROG · September 12, 2023, 10:05pm

Drag and drop from Chrome isn’t comparable to what you produced in Safari. They are not using the same mechanism at all. Safari is printing to PDF via the PDF services. Converting a bookmark to PDF in DEVONthink is not doing the same thing.

PS: It has nothing to do with Chrome. You’d get the same results if you dragged a bookmark from Safari.

FrankT · September 13, 2023, 6:20am

My fault. I did not ask a question.

How do I create, in one step, a PDF that looks as much as possible like the web page?

I only found out how to do it in two steps.

BLUEFROG · September 13, 2023, 1:02pm

There is no 100% accurate solution to this. Clipping to PDF without the clutter-free option is one approach. However, it can also yield very long PDFs that may be difficult to view in DEVONthink To Go.

FrankT · September 13, 2023, 2:18pm

Sorry, I don’t quite understand.

When I import a web page with drag and drop and then convert it to a PDF in DT, DT applies a certain “procedure”. So DT is able to do that.

Why is it not possible to apply this “procedure” with the clipper in one step?

jsloop · October 4, 2023, 9:33am

For the most part I use webarchive to save the web page. This does a fresh capture of the page instead of capturing the page as it currently appears in the browser. That means if I delete the ads, other promos, etc. in the HTML inspector and then capture, they still appear in the webarchive. What I do instead is select and remove them from the captured archive. This removes all the unnecessary elements from the page and keeps the archive clean. Some pages don’t look good with web capture: they end up with a dark background and a dark font, for example Healthline sites. For these I use the clutter-free option and make minor edits. PDFs of sites don’t give the exact look and feel most of the time, so I use them only sometimes, when other options don’t work. When nothing works, I convert the capture to plain text.

darwin · October 17, 2023, 10:56am

Did you try “Export as PDF” in Safari?

mjannesh · October 17, 2023, 1:35pm

‘Export as PDF’ in Safari actually works fine. Give it a try.