Clip to DEVONthink with clutter-free enabled seems broken now

We have just fixed the issue where the decluttering service returned empty documents. Our web hoster had updated Python to version 3 and the HTML-to-Markdown converter wasn’t compatible with it. We apologize for the inconvenience.

Well, that was pretty quick :slight_smile: I still think DevonTech is pretty damn good!

I appreciate this great fix, but there is still one problem: The title is gone.

The top article was clipped months ago, and the one below it was clipped just now.

You’re right, that was another issue. It should be fixed now.

I’m not sure if this is connected, but for the past few days the web clipper has not generated paginated PDFs any more. It downloads web pages instead. One-page PDFs do work as before. See the screenshot for clippings of two different web sources with these two options.

DT 3.0.4; macOS High Sierra 10.13.6 (I know it’s old – is Catalina required?); Chrome 80.0.3987.116

I tried with the Chrome extension and the bookmarklet – the result is the same. Using Safari (bookmarklet): same.

It shouldn’t be related to the OS.
If DEVONthink encounters an error, it will capture a bookmark instead, so that at least something is captured.
If this becomes persistent, reboot the machine and see if it continues or clears up.
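
For readers who want the fallback behaviour spelled out, here is a minimal Python sketch of the general pattern (try the requested format, keep a bookmark if that fails). It is only an illustration, not DEVONthink's actual code: `Clipping`, `clip_with_fallback`, and the two renderer functions are hypothetical names made up for this example, and which conditions count as an error in the real clipper is not covered here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Clipping:
    """Hypothetical result of a clip attempt."""
    kind: str          # "pdf" or "bookmark"
    url: str
    data: bytes = b""  # rendered content; empty for a bookmark


def clip_with_fallback(url: str, render_pdf: Callable[[str], bytes]) -> Clipping:
    """Try to capture a PDF; on any error, keep a bookmark so something is saved."""
    try:
        return Clipping(kind="pdf", url=url, data=render_pdf(url))
    except Exception:
        # Assumed fallback: store only the URL as a bookmark.
        return Clipping(kind="bookmark", url=url)


def failing_renderer(url: str) -> bytes:
    # Stand-in for a renderer that fails while producing the PDF.
    raise RuntimeError("PDF rendering failed")


def working_renderer(url: str) -> bytes:
    # Stand-in for a renderer that succeeds; returns placeholder bytes.
    return b"%PDF-1.4 placeholder"


if __name__ == "__main__":
    print(clip_with_fallback("https://example.com", failing_renderer).kind)
    print(clip_with_fallback("https://example.com", working_renderer).kind)
```

With a pattern like this, a failure anywhere in the rendering step shows up to the user as a bookmark rather than as an error message, which matches the symptom described above.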

Ok, I see – I tried rebooting before, but it didn’t change anything. Now I rebooted once more, and DT’s behaviour seems somewhat inconsistent. Most pages are not saved as PDF, some others are. I tried URLs from sites from which I had captured before, so it should not be linked to the source.

Is it possible that DT captures a bookmark when the response from the source (site, URL) is too slow? Does low speed count as an error? Or is there something that could slow down capturing in DT itself? I get the banner “downloading web pages” (lower left corner of DT) for a long time, a minute or so. It was always somewhat slow, but now it really seems to get stuck. In all cases with this slowed-down downloading, DT seems to capture only a bookmark, whereas in the browser these sites load as usual, that is, quickly.

Which URL did you try to capture in which format?

Please see the screenshot in my post from 24 Feb. I tried Quanta Magazine (which I capture regularly and usually without problems), Wikipedia, and now, as a test, some German news sites like ZEIT. There are always long delays and PDF capture is almost never possible (sometimes unpaginated PDF does work; paginated always used to work but no longer does). Clutter suppression does not seem to have any influence on the outcome.

I still cannot get PDF clippings the way I got them without problems ever since DT3 came out. Does anyone have the same problem – or, better still, ideas on how to resolve it?

We are working on some clipping stuff. Thanks for your patience and understanding.

Great, thank you! Looking forward to results / possible updates!

This appears to be an issue with PDFKit in High Sierra/Sierra. We would suggest upgrading to Mojave (Catalina, if you’re feeling adventurous) :flushed:, as the problem doesn’t occur in Mojave or Catalina.

Excellent. Thank you @BLUEFROG. I installed Mojave and PDF capturing is back! Normal workflow again, finally. Still curious to see what developments there will be concerning clipping, but for now: problem solved.

You’re welcome :slight_smile:

I know you’re still working on clipping stuff, but wanted to add that the declutter option seems to break when there are code blocks. It either shows the line numbers and oddly formatted code, or simply omits the code blocks altogether. I’d be happy to provide you with links to either case so you can test.

A few URLs would be helpful :slight_smile:

I must have lost the other link that was showing line numbers for code blocks (or maybe something changed with the clipper in the meantime), but here are a couple of links I just tested with Clutter free (both for Markdown and Webarchive). It seems that the biggest issue is almost always with Medium pages.

Working OK

Broken

Most of the Medium pages I tested are broken. Code blocks are stripped out completely, and so are images that are part of the actual, useful content. A few examples are:

Let me know if you need anything further on my end. Will be happy to help troubleshoot this.

Ok, so I did a little more digging as I was finding it odd that the same page would render different results from the earlier to the later clipping. Then I recalled that my method may have changed so I tested the theory.

When using the Chrome extension to clip the page, code blocks are missing, but when using the menubar app to trigger the clipping, the code blocks are picked up. I recorded this to make it easier to demonstrate.

I still haven’t tested the theory on the other “broken” links, but it could possibly be a similar case.

What is the menu bar app?

I’m having MAJOR problems clipping from Medium into anything besides Web Archive, and clipping is exactly the reason why I subscribed to Medium and partly why I’m using DT. I have not been able to consistently clip Medium articles with code blocks, no matter what I tried:

  1. Every option in the Chrome extension
  2. Every option in DA Pro
  3. Clipping from Safari

It just would not clip code blocks. Here is the most recent example:

While this example clips correctly as Web Archive, when I convert it from Web Archive into PDF inside DT, the code blocks are stripped again.