Clip to DEVONthink with clutter-free enabled seems broken now

eboehnisch · February 10, 2020, 5:42pm

We have just fixed the issue that caused the decluttering service to return empty documents. Our web hoster had updated Python to version 3 and the HTML-to-Markdown converter wasn’t compatible with it. We apologize for the inconvenience.

Blanc · February 10, 2020, 7:25pm

Well, that was pretty quick. I still think DevonTech is pretty damn good!

kukushi · February 11, 2020, 2:52am

I appreciate this great fix, but there is still one problem: The title is gone.

The top article was clipped months ago and the bottom one was clipped just now.

eboehnisch · February 11, 2020, 11:18am

You’re right, that was another issue. It should be fixed by now.

megob · February 23, 2020, 3:52pm

I’m not sure if this is connected, but for some days now the web clipper has not generated paginated PDFs any more. It downloads web pages instead. One-page PDFs work as before. See the screenshot for clippings of two different web sources with these two options.

DT 3.0.4; macOS High Sierra 10.13.6 (I know it’s old; is Catalina required?); Chrome 80.0.3987.116

I tried with the Chrome extension and the bookmarklet, and the result is the same. Using Safari (bookmarklet): same.

BLUEFROG · February 23, 2020, 3:54pm

It shouldn’t be related to the OS.
If DEVONthink encounters an error it will capture a Bookmark, so as to capture something.
If this becomes persistent, reboot the machine and see if it continues or clears up.
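For readers who want to picture the fallback described above, here is a minimal Python sketch. It is not DEVONthink's code: the `Clip` type, `clip_with_fallback`, and the two renderer functions are hypothetical names invented for illustration, and the only behavior taken from this thread is "on error, capture a bookmark instead".

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Clip:
    kind: str          # "pdf" or "bookmark" (hypothetical labels)
    url: str
    data: bytes = b""  # rendered content; empty for a bookmark


def clip_with_fallback(url: str, render_pdf: Callable[[str], bytes]) -> Clip:
    """Try to render a PDF; if rendering raises, keep a bookmark instead."""
    try:
        return Clip(kind="pdf", url=url, data=render_pdf(url))
    except Exception:
        # Rendering failed, so record the URL rather than lose the clip.
        return Clip(kind="bookmark", url=url)


def broken_renderer(url: str) -> bytes:
    # Stand-in for a renderer that fails (assumption for the demo).
    raise RuntimeError("PDF rendering failed")


def working_renderer(url: str) -> bytes:
    # Stand-in for a renderer that succeeds (placeholder bytes only).
    return b"%PDF-1.4 placeholder"
```

With these stand-ins, a failing renderer yields a `Clip` of kind `"bookmark"` and a working one yields kind `"pdf"`, which would match seeing bookmarks appear whenever the capture step hits an error.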

megob · February 26, 2020, 2:51pm

Ok, I see – I tried rebooting before, but it didn’t change anything. Now I rebooted once more, and DT’s behaviour seems somewhat inconsistent. Most pages are not saved as PDF, some others are. I tried URLs from sites from which I had captured before, so it should not be linked to the source.

Is it possible that DT captures a bookmark when the response from the source (site, URL) is too slow? Does low speed count as an error? Or is there something that could slow down capturing in DT itself? I get the banner “downloading web pages” (lower left corner of DT) for a long time, a minute or so. It was always somewhat slow, but now it really seems to get stuck. In all cases with this slowed-down downloading, DT seems to capture a bookmark only, whereas in the browser these sites load as usual, that is, quickly.

cgrunenberg · February 26, 2020, 2:56pm

Which URL did you try to capture in which format?

megob · February 26, 2020, 3:04pm

Please see the screenshot in my posting from 24 Feb. I tried Quanta Magazine (which I capture regularly and usually without problems), Wikipedia, and now, as a test, some German news sites like ZEIT. There are always long delays and almost no PDF capture is possible (unpaginated PDF sometimes works; paginated always worked before but does not any more). Clutter suppression does not seem to have any influence on the outcome.

megob · February 27, 2020, 9:06pm

I still cannot get PDF clippings the way I did without problems ever since DT3 came out. Does anyone have the same problem or, better still, ideas on how to resolve it?

BLUEFROG · February 28, 2020, 5:10am

We are working on some clipping stuff. Thanks for your patience and understanding.

megob · February 28, 2020, 8:08am

Great, thank you! Looking forward to results / possible updates!

BLUEFROG · February 28, 2020, 3:40pm

This appears to be an issue with PDFKit in High Sierra/Sierra. We would suggest upgrading to Mojave (Catalina, if you’re feeling adventurous), as the problem doesn’t occur in Mojave or Catalina.

megob · March 1, 2020, 6:45am

Excellent. Thank you @BLUEFROG. I installed Mojave and PDF capturing is back! Normal workflow again, finally. I am still curious to see what developments there will be concerning clipping, but for now: problem solved.

BLUEFROG · March 1, 2020, 2:02pm

You’re welcome

pslobo · May 20, 2020, 9:04am

I know you’re still working on clipping stuff, but wanted to add that the declutter option seems to break when there are code blocks. It either shows the line numbers and oddly formatted code, or simply omits the code blocks altogether. I’d be happy to provide you with links to either case so you can test.
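To make the symptom concrete, here is a toy Python sketch of an HTML-to-Markdown pass that keeps `<pre>` blocks intact instead of dropping or reflowing them. It is purely illustrative and assumes nothing about how the actual decluttering service works; `PreservingConverter` and `html_to_markdown` are hypothetical names, and only the standard-library `html.parser` module is used.

```python
from html.parser import HTMLParser

FENCE = "`" * 3  # Markdown code fence, built this way to avoid a literal fence here


class PreservingConverter(HTMLParser):
    """Toy converter: paragraphs become text, <pre> becomes fenced code."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.in_pre = False

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.in_pre = True
            self.parts.append(FENCE + "\n")

    def handle_endtag(self, tag):
        if tag == "pre":
            self.in_pre = False
            self.parts.append("\n" + FENCE + "\n")
        elif tag == "p":
            self.parts.append("\n\n")

    def handle_data(self, data):
        # Inside <pre>, whitespace is significant; elsewhere collapse it.
        self.parts.append(data if self.in_pre else " ".join(data.split()))


def html_to_markdown(html: str) -> str:
    parser = PreservingConverter()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

A converter that collapsed whitespace everywhere, or skipped `<pre>` entirely, would produce the kind of oddly formatted or missing code described above, which is why the sketch treats `<pre>` as a special case.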

BLUEFROG · May 20, 2020, 2:46pm

A few URLs would be helpful

pslobo · May 21, 2020, 10:15am

I must have lost the other link that was showing line numbers for code blocks (or maybe something changed in the meantime with the clipper), but here are a couple of links I just tested with Clutter free (both for Markdown and Webarchive). It seems that the biggest issue is almost always with Medium pages.

Working OK

Broken

Most of the Medium pages I tested are broken. Code blocks are stripped out completely, and so are images that are part of the actual, useful content. A few examples are:

https://medium.com/otto-group-data-works/slim-hydrating-cloud-native-ci-cd-pipelines-to-securely-access-gcp-projects-36cf183d1b54 (Images and code missing)
https://medium.com/@m.json/the-kubernetes-cloud-controller-manager-d440af0d2be5 (Images missing, code seems ok)
https://medium.com/google-cloud/kubernetes-engine-gke-multi-cluster-life-cycle-management-series-ee0f583d9b10 (Super broken this one. Picks up some code blocks but only clips about half of the page)
https://www.magalix.com/blog/best-practices-managing-kubernetes-using-terraform (doesn’t pick any of the code blocks).
Actually, this last one is weird. I have 2 versions. 1 clipped today at 11am which picks up code blocks, and another clipped today at 9am which missed all the code blocks, so if you did something between these two times, it may help pinpoint some issue.

Let me know if you need anything further on my end. Will be happy to help troubleshoot this.

pslobo · May 21, 2020, 10:27am

Ok, so I did a little more digging as I was finding it odd that the same page would render different results from the earlier to the later clipping. Then I recalled that my method may have changed so I tested the theory.

When using the Chrome extension to clip the page, code blocks are missing, but when using the menubar app to trigger the clipping, the code blocks are picked up. I recorded this to make it easier to demonstrate.

I still haven’t tested the theory on the other “broken” links, but it could be a similar case.

RuslanI · January 14, 2022, 7:54am

What is the menu bar app?

I’m having MAJOR problems clipping from Medium into anything besides Web Archive, and that is exactly the reason why I subscribed to Medium and partly why I’m using DT. I have not been able to consistently clip Medium articles with code blocks, no matter what I tried:

Every option in Chrome extension
Every option in DA Pro
Clip from Safari

It just would not clip code blocks. Here is the most recent example:

While this example clips as Web Archive, when I convert from Web Archive into PDF inside DT, it strips code blocks again.