A dream of a better clutter-free

I am not happy with the current clipper that DT offers.
And worse, I can’t find a better solution. :slight_smile:

Failings

  • Clutter-free strips pictures out of the page.
  • I have no way to configure the clutter-freeing behaviour to make it match the source site.
  • I have no way to intervene and perform retouches in the process.
  • I believe the current clutter-free does not run purely locally, but relies on some server-side processing.

My ideal clutter-free tool and workflow
When I clip a webpage with the sorter…

  1. I am offered a choice of clutter-free masks (pre-selected, depending on domain).
  2. Then the “masking” (clutter-removal) happens
  3. and I am given the possibility to retouch the resulting intermediate (webarchive) document (like cutting an uninteresting part that the mask did not remove, or resize a picture)
  4. before the document is finally converted to the desired format (for me PDF).

Masks
Central to this ideal is a locally performed clutter-removal, which the user can configure through masks (a mask being a description of what the clutter-free engine has to do). Masks that I can manage, configure, tweak, or share with the community.
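
To make the idea of a mask concrete, here is a minimal Python sketch of what I have in mind. Everything in it is my own assumption for illustration: the `MASKS` table, the `select_mask` and `apply_mask` names, and the rule format (tag names and class names to strip) are hypothetical and are not anything DT offers. It also assumes reasonably well-formed HTML, and it drops comments and the doctype.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical masks: per-domain lists of what to strip. "*" is the fallback.
MASKS = {
    "example.com": {"tags": {"nav", "aside", "footer"}, "classes": {"ad", "share-bar"}},
    "*": {"tags": {"script", "nav"}, "classes": set()},
}

# Elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def select_mask(url):
    """Pre-select a mask from the page's domain (step 1 of the workflow)."""
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return MASKS.get(host, MASKS["*"])


class MaskFilter(HTMLParser):
    """Copies HTML through, skipping every element the mask matches."""

    def __init__(self, mask):
        super().__init__(convert_charrefs=False)
        self.mask = mask
        self.out = []
        self.skip_depth = 0  # > 0 while inside a removed element

    def _matches(self, tag, attrs):
        classes = set((dict(attrs).get("class") or "").split())
        return tag in self.mask["tags"] or bool(classes & self.mask["classes"])

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            if self.skip_depth == 0 and not self._matches(tag, attrs):
                self.out.append(self.get_starttag_text())
            return
        if self.skip_depth or self._matches(tag, attrs):
            self.skip_depth += 1  # assumes every non-void tag is closed
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if self.skip_depth == 0 and not self._matches(tag, attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&#{name};")


def apply_mask(url, html):
    """Run the masking locally (step 2) and return the intermediate HTML."""
    mask_filter = MaskFilter(select_mask(url))
    mask_filter.feed(html)
    mask_filter.close()
    return "".join(mask_filter.out)


page = ('<article><h2>Title</h2><nav>menu</nav>'
        '<p class="ad big">buy</p><p>Body &amp; more<br></p></article>')
cleaned = apply_mask("https://www.example.com/post", page)
```

The string returned by `apply_mask` would be the intermediate document that I could still retouch by hand before it gets converted to PDF. Because a mask is plain data, it would be easy to tweak per site or to share with others.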

Just a wish. :wink:

Regards,
b.

The nearest thing to your dream is to use DEVONsave from DTTG, then modify the captured HTML in DT and convert it into PDF.

DEVONSave: DEVONsave v3: A shortcut to help you clip articles to clean PDFs in DEVONthink To Go 3 (and DEVONthink 3) - Axle

I have a custom, self-made modified version that changes some styling and does not convert into PDF in DT automagically; I do that step manually.

Thx @rfog, I am aware of it.
What clutter-free algorithm does it use? The standard DT/DTTG one?

Not sure, but judging from the script “Article to HTML…” it uses Safari’s sanitization, which AFAIK is the best of all I’ve tested.

There are other options, like Just Read, Print Friendly & PDF, and other extensions that let you manually remove elements before generating the PDF.

And not long ago this was discussed here and someone recommended even more extensions.

Thx! I am aware of all of this; it is just not good enough imo.
And this is where DT could make the difference: serious decluttering possibilities.

You could use something like Greasemonkey. Cf

That has all kinds of possibilities, is completely user configurable etc. You could even use it to inject your own CSS, depending on the web site.
Re-inventing that wheel in DT would be a waste of effort, in my opinion. Especially since a completely user-configurable clutter-free option would require a huge and complicated UX.

Hello, all.
I am also noticing issues when using the clutter-free option.
I am even observing a case where I am saving a web article that has multiple headings throughout the text, and all the headings are removed in the clutter-free PDF.
Evernote’s webclipper browser plugin has options for selecting what part of a webpage to save to Evernote. Perhaps DEVONthink’s clipper would benefit from having something similar.

You can already highlight part of a webpage and make it into an RTF or markdown (or plain text) note. It’s in the Services menu, or use the shortcuts cmd-shift-) or cmd-shift-(.

I’ve found that the easiest way to get just the important stuff on a website is to use Safari’s Reader mode, select all the text, then use one of those two shortcuts.

Cmd-shift-r, cmd-a, cmd-shift-), job done…:slight_smile:

This seems to work more consistently and more accurately than any of the other methods, and it has the advantage that you see what you’re getting, but I’ve not run any detailed comparative tests.

You could of course just highlight the text and run the clipper from the share menu if you want to go direct to PDF / tag / label at the same time.

Thank you for the advice. I didn’t know about this method.
However, this isn’t ideal for me for a couple of reasons: A) it imports it to a DEVONthink database, whereas I organize my content in file-indexing mode and B) it does not allow me to specify where to save it.
I also found something interesting: the problem I was observing, where multiple headings of sections of a text in a page were being removed in reader mode, is also happening when I view that page using Safari’s Reader.
Has anyone seen this happening with other pages?

Welcome @luisneto

A URL to test could prove helpful.

I agree and I tried to add the URL in my previous post, but I wasn’t allowed as I saw a message saying that I can’t post URLs.
Any chance this restriction can be disabled?

I increased your trust level. Try it now. Thanks!

Thank you! This is the page I am trying to save: https://www.huffpost.com/entry/the-game-of-life_b_6929620.
Can you notice the absence of the section headings in clutter-free view?

Yes, I see the same thing. As to why, I can’t say at the moment.

  1. Safari’s Reader mode shows there’s something beyond our browser extension involved in the situation; something about the way the page is designed.
  2. The browser extension isn’t going to prompt you to save outside of DEVONthink. If you have an indexed group in DEVONthink you could select that in DEVONthink’s group selector.

Thank you for your response.

Is this a response to my comment above where I wrote “A) it imports it to a DEVONthink database, whereas I organize my content in file-indexing mode”?
I am interested in saving it to DEVONthink, but not in storing it there. I want to index it. And the extension allows me to do that.
But what I wrote was in response to @brookter’s suggestion, which was to use a service to save a rich note.
Do you agree my understanding is correct?

If you save into an indexed folder, the file is saved in the external indexed folder and automagically indexed into DT. At least that is how DT3 works.

And if by any chance it remains inside DT, you can easily send it to the indexed folder with “Move to external folder”.

BTW, this is the result of using Edge Chromium under macOS, selecting “Immersive reader” and then printing into DT:

12 Rules for the Game of Life.pdf (113.5 KB)

Using the method recommended by @brookter of saving to a Rich Note, is there any way of specifying where to save that I might be missing?

I don’t think so, but I’m not sure. However, I always send all my stuff to the global inbox and then move it into the wanted folder (indexed or not) via the Control + CMD + M shortcut, then start typing the folder name.

Wow, thank you for this. This looks exactly how it should!
I just downloaded Microsoft Edge and I was able to produce an identical one.

So, it looks like Microsoft Edge’s Immersive Reader is better than Safari’s and DEVONthink’s clutter-free algorithm, as it didn’t strip out content that is supposed to be kept!