Any other software or tips for clipping web content?

Hello everyone,

Is there any other software or additional tips for clipping web content?

It feels like I’m spending more and more time on failed attempts to clip content from, for example, Safari into DEVONthink.

When I say failed, I mean that the pages don’t turn out as they should at all, and my feeling is that this happens more often now than a few years ago.

Thanks in advance,
Pontus.

One reason might be that web sites are built differently now than they were some years ago. E.g., crap like mein was unthinkable before.

Yes, I understand that. We’ll see if anyone has any remedy for this.

I suggest you search the forum for posts on “clipping”. The issue arises regularly here and various solutions have been suggested already.

Personally, I think it’s a wild goose chase if the goal is not clearly defined: do you want to save the status quo? Just a reference to the site? Must the layout be preserved? How do you want to handle (potentially) dynamic content? All these are questions one should clarify first.

I will echo chrillek: try and search the forum. And if you want help, please provide a bit more information. There is no single approach that works for everything. It depends on the web page/content and what you want to use it for.

I clip most articles and text as markdown at this point. (Small file size, works for different screen sizes, I can adjust the layout to my liking, easy to convert to other formats.) Sometimes with DT’s clipper, sometimes with pandoc, sometimes with the MarkDownload browser extension. I’ve heard good things about Obsidian’s clipper extension, so I’m gonna try that one of these days.
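For the pandoc route, here is a minimal sketch in Python of what such a conversion could look like. It assumes pandoc is installed and on your PATH. The function names `build_pandoc_command` and `clip_to_markdown` are made up for illustration, and the choice of GitHub-flavoured Markdown output is just one reasonable option. Pandoc only sees the raw HTML the server returns, so pages that build their content with JavaScript will likely come out incomplete.

```python
import shutil
import subprocess


def build_pandoc_command(url, out_path):
    """Return the pandoc argument list for converting a web page to Markdown.

    Hypothetical helper: the flags are standard pandoc options, but the
    particular choices (gfm output, no line wrapping) are assumptions.
    """
    return [
        "pandoc",
        "--from", "html",   # treat the fetched page as HTML
        "--to", "gfm",      # write GitHub-flavoured Markdown
        "--wrap=none",      # keep paragraphs on a single line
        "-o", out_path,     # output file
        url,                # pandoc accepts a URL as its input
    ]


def clip_to_markdown(url, out_path):
    """Run pandoc to save the page at `url` as a Markdown file."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(build_pandoc_command(url, out_path), check=True)
```

Calling `clip_to_markdown("https://example.com/article", "article.md")` would then write the Markdown file, which can be dropped into DEVONthink like any other document.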

If I want to preserve layout and visual elements, I generally use Safari. Load the page, adjust the window, then File > Export as PDF. (It’s quite similar to DEVONthink’s “PDF: One Page” option, but I find it easier to control in Safari. For some reason DT doesn’t use the exact window size if I capture from the internal browser.) I often crop and compress image-heavy files with PDF Expert.

Occasionally I clip as Web Archive. Either saving a page directly from Safari, using DT’s clipper, or with DT’s “Capture Web Archive” system service after selecting part of a page. I also sometimes use the SingleFile browser extension.

Now and then I save a page as HTML. Firefox has a built-in option to save a complete page including all loaded resources, somewhat similar to the CLI tool wget.

I understand what you mean and have seen this brought up before.

DEVONthink is many things and different for different people. For me, it’s a way to store my files but also a tool for gathering information from the internet (preferably as PDF; when this doesn’t work I can make a clutter-free PDF or use other solutions).

I’ve seen that the note-taking app Bear does this well. I’ll take note that Obsidian is potentially an alternative :blush: (an app I’ve used but no longer do).

I have no knowledge whatsoever of the subject and no idea how to make the function better. I can only think that a program like DEVONthink should perhaps do these things at the same level as, or better than, the above-mentioned alternatives. Alternatives that in Bear’s case are also presumably yet another expense among all other subscriptions.

Maybe I’ve misunderstood what DEVONthink is, but I imagined that clipping content from the internet would be one of its strengths.

As I said, I’m just a layperson, I have no clue. But I’m a very frequent user of the app and just conveying my experience when using it.

Thanks a lot for your input, I’ll take that with me @troejgaard :slight_smile:

Due to the ever-changing technologies employed by web designers, not all pages can be successfully clipped. You could browse the page in DEVONthink and clip it from there. This may have a better chance of capturing those problematic pages.

PS: There is no “clipping standard”. These things are developed independently and with their own solutions. We are a small development house, completely self-contained, and funded through our sales alone.

The clipping extension will be updated as time allows. Thanks for your patience and understanding.

Yes, I have seen the best results from DEVONthink’s own browser!

Thanks for your explanation @BLUEFROG

You’re welcome :slight_smile:

PS: Be aware, the version of WebKit in DEVONthink is older and more powerful than the version in, e.g., Safari. This also means some pages may not be viewable in DEVONthink if there are incompatibilities (and in worst-case scenarios, it could crash). However, there are literally millions of webpages out there that still function as expected. (And switching frameworks is not a simple thing, especially as it could lead to a loss of functionality in our app. Development assesses this from time to time.)

Some background:

Thanks @BLUEFROG & @chrillek :slight_smile:

Have you got examples of where you’ve had problems? I’m asking because I do all my Internet clipping to PDF and I rarely have issues, so I’m wondering where the problem is.

I use a combination of Safari and Firefox, on Mac and iPad, and I use both the DEVONthink sharesheet and Apple’s native print function.

Following up on @MsLogica’s comment/query, I also do (probably too much) clipping of stuff on the internet and for all but the most awful and complicated web sites, I’d say 99% of the time one or more of the following techniques work just fine for me.

  • DEVONthink’s “Sorter” saving as Markdown (to save much disk space). If I want the graphics included with the stored file, I’ll “convert” the markdown to PDF then use PDFSqueezer to reduce the size.
  • macOS “print to pdf” feature. Usually use Reading view, then print to a PDF file and use PDFSqueezer to reduce that normally too-big file.
  • On this forum someone mentioned the app “Mark Download” (in the Apple App Store), which I tend to use most of the time to make markdown files. I like that the graphics are links, not downloads, and if I want to keep a copy with graphics included, I again just convert to PDF and handle as described above.

If the above doesn’t work, I simply give up and let the web site “win” in their battle to stop people taking and keeping copies.

When DT is not able to capture the sanitized version I want (by saving into Inoreader and then feeding my own Inoreader saved-items RSS into DT), I use Just Read in Edge. In all three cases (even in Inoreader) I use custom CSS.

You can find some of the CSS I’ve used over time here: Rfog's Pastebin - Pastebin.com

This is thanks to the kids being horribly dependent on JavaScript and client-side delirium.

I prefer to capture through DT and then review - this way, at least metadata is collected.

There is a ‘capture as markdown’ extension for Safari, which I use to clip and paste over the text if the DT capture is thwarted, but it does not handle images cleanly. If you’re a bit nerdier, Brett Terpstra has a terminal command to capture markdown from web sources, and tools to integrate it with a Safari bookmarklet, Shortcuts and your clipboard, which also makes it possible to clean up a DT capture.

There are many examples, but I’ll take most major newspapers and Substacks as two of them. However, I’ve gotten good results with some of the tips from this thread. For example, Safari’s own Export as PDF gives good results in many cases, so it’s a good alternative.

So with all the possibilities that have been discussed here, I think a solution exists for most things. A bit cumbersome and more time-consuming than optimal, but good that solutions exist.

I’m taking all the information with me and thank you all! :slight_smile:

I have a similar experience to yours: very often clipping does not work. Pages are just really broken. I guess web sites such as media and news sites may even be deliberately designed so that they are not easily copied.
Implementing robust clipping seems like a huge amount of work, so I do not complain much :slight_smile: I understand the challenge :slight_smile:
I’m using Safari’s own export too. A good thing.
I will try DT’s own browser too.

Exactly that thought has struck me too, that it’s not directly negative that it’s difficult, intentional or not. DT’s own web browser is what generally works best for me :blush:

That makes me think of another thing: can someone in the community say whether there’s any shortcut to more quickly paste the current URL inside DEVONthink? In other words, to avoid x number of clicks.

Thanks in advance!

Paste where exactly and to achieve what?

Let’s say I have a URL on my clipboard. Instead of going through the menu in the attached screenshot → tabs → open location with the mouse pointer, is there any way to reach the field where you subsequently paste the URL without these clicks? It would also be nice if you could reach that menu with just a key press even outside of DEVONthink (for someone who works with multiple spaces). Or am I missing something and making things more complicated than I have to?