Problem: capturing content from Safari via the Sorter produces an empty file in DT3

These are two different, site-specific issues in the WebKit framework. Unfortunately, it hasn’t been possible so far to work around them, so the best option is to print the page to DEVONthink.

So do I understand correctly that I am just unlucky in my choice of sites? Both FAZ and Guardian worked well back with DTPO 2 – am I therefore right to assume that something must have changed on the software side in DT3?

Perhaps FAZ and/or Guardian changed something? Web site designers sometimes go over the top with complexity to make a site “look good”, and they probably change things often.

Clipping is basically still the same but it’s of course possible that the sites changed and/or that an update of macOS and its WebKit framework introduced this issue.

I’ve had the same problem since I updated the last time…except the export doesn’t show up as blank in my database, but as a bookmark (web location) regardless of which setting I use, in particular a paginated PDF. Always a bookmark. Driving me bonkers. Any website. I’m still on Mojave using DT3.5.2.

I rebooted just last week. The behavior is the same. I saw something in the thread above that hinted perhaps it is a paywall thing, but I can’t even save https://leananki.com/zettelkasten-method-smart-notes/ as a PDF using the Sorter (goes in as a bookmark).

If you are consistently getting bookmarks captured, reboot again and see if the issue persists.

I’m having this problem, have had it for a long time.
I clip loads of stuff from Safari as rich text, via the Sorter.
Usually a rich text version of the page appears in the Inbox. But sometimes the log window pops up and says “bookmark”, and in that case a bookmark to the page is indeed stored instead. This is pretty random: repeating the clip from the same web page usually results in an actual rich text document.
Sometimes an empty file is produced as well. Its length is 0 bytes, although it does have a correct title, URL and the tag I gave it. This is also random: using the “launch URL” function I can go back to the page, repeat the clipping, and it will (usually) work correctly the next time.
This is frequent enough that I now have a smart group “Empty” to spot such files and do the chore of clipping them again…
It’s just not working reliably and thus a major irritant.
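
For files that live in an ordinary folder (for example an indexed or exported folder, which is an assumption here and not something described above), the same “spot the empty ones” chore can be sketched outside DEVONthink. This is only an illustration of the idea behind the smart group, not a DEVONthink feature; the function name `find_empty_files` is made up for this example.

```python
from pathlib import Path
from typing import List


def find_empty_files(folder: Path) -> List[Path]:
    """Return all zero-byte files below `folder`, sorted by path.

    Hypothetical helper: it only looks at plain files on disk and
    knows nothing about DEVONthink metadata such as URLs or tags.
    """
    return sorted(
        path
        for path in folder.rglob("*")          # walk the folder recursively
        if path.is_file() and path.stat().st_size == 0
    )
```

Calling `find_empty_files(Path("/path/to/folder"))` gives a list of candidates to clip again; the path is a placeholder.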

This might be a network issue or an issue related to dynamic websites. Does a certain URL always cause this? One alternative is the Take Rich Note service, although you first have to select the part of the webpage you are interested in. Not all browsers support this service (Safari does). The URL is clipped in this case too.

It seems random, and the very same site will on some subsequent try store correctly. My log now shows dozens of failed (“bookmark”) attempts to save pages from the NYT and Wikipedia, but usually these are no problem.
Because of this, I also have a shortcut-activated rule to convert any bookmarks in my Inbox to rich text, which will usually succeed, proving there’s nothing wrong with those pages. But this doesn’t work on the empty files; those just give an error message. That is strange in itself, because in both cases the URL is available. So for those I have to do a ‘launch URL’ and try the whole thing anew, usually succeeding on the second try, but again not always. You can see why this will mess up a workflow…
If it’s a network issue, I’d be inclined to call it a time-out issue… DT should maybe not be so quick to give up?
And finally, when hopefully testing it, the document created by the service “Take Rich Note” did not have a URL attached to it, which is crucial for me.
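
To make the “don’t give up so quickly” idea concrete, here is a minimal sketch of a retry loop with increasing waits. It is not how DEVONthink or the Sorter works internally (that is not known from this thread); `fetch_with_retry`, its parameters, and the `fetch` callable are all hypothetical names used only for illustration.

```python
import time
from typing import Callable, Optional


def fetch_with_retry(
    fetch: Callable[[], bytes],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Call `fetch` until it returns non-empty content or attempts run out.

    An empty result or a network error (OSError, which includes
    TimeoutError) counts as a failed attempt. The wait doubles after
    each failure: base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    for attempt in range(attempts):
        try:
            content = fetch()
        except OSError:
            content = b""                      # treat errors like an empty capture
        if content:
            return content
        if attempt < attempts - 1:             # no wait after the final attempt
            sleep(base_delay * 2 ** attempt)
    return None
```

The point of the design is that a 0-byte result is treated as “try again later” rather than as a finished capture, which is what the manual ‘launch URL’ and re-clip routine does by hand.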

Which browser do you use? Is DEVONthink allowed to automate the browser? In addition, which version of macOS and DEVONthink do you use?

The New York Times has a lot of programming behind the scenes to deliver HTML to you. When I was a subscriber, very little of what they published was easily captured in DEVONthink. As with other mainstream newspapers, often the best way to capture is to “print” to PDF (or hit the Print button they provide, but that is rare nowadays) and “save to DEVONthink”.

I don’t do a lot of saves from Wikipedia, but a small test shows that Clutter Free PDF and Markdown do not preview, while PDF does, as does a “print” to PDF saved to DEVONthink.

Sometimes, if I really want a web site page, I use DEVONtechnologies’ “DEVONagent Pro”, and for whatever reason it sometimes gets the page when the Clipper doesn’t.

I think the “randomness” you see is due to the diversity of technology running the different web sites. The internet is a complex place.

I appreciate the input. I was editing my reply before realizing it had been replied to, so I’m repeating a few points here:
NYT and Wikipedia are usually no problem. It’s not the sites. It’s DT.
I also have a shortcut-activated rule to convert any bookmarks this problem makes appear in my Inbox to rich text, which will usually succeed, again proving there’s nothing wrong with those pages.
But this doesn’t work on the empty files, that will just give an error message. Which is strange in itself, because in both cases the URL is available. So for those I have to do a ‘launch URL’ and try the whole thing anew, usually succeeding on the second try, but again not always. You can see why this will mess up a work flow…
If it’s a network issue, I’d be inclined to call it a time-out issue… DT should maybe not be so quick to give up?
And finally, when hopefully testing it, the document created by the service “Take Rich Note” did not have a URL attached to it, which is crucial for me.

Safari: 16.5.1
DT Pro: 3.9.2
Allowed to automate Safari: I don’t know what that means. I get no requests for that, and the Sorter often works fine with it. Anything I need to do?

See System Settings > Security & Privacy > Automation > DEVONthink 3

Safari automation is and was activated.

In that case the URL should actually be stored when using services. Does a reboot fix this? If not, then a screenshot of System Settings > Security & Privacy > Automation > DEVONthink 3 would be great, thanks.

I wouldn’t be too surprised at this. When browsing the web, you are still connecting to a remote server, and that connection isn’t going to be static. As an example, notice how YouTube or Netflix will stall and buffer. Add to this the dynamic content coming from even more remote servers, sending data into the page you’re viewing. This daisy chain of servers and networks isn’t a simple straight pipe from the NYT to your device.

Wouldn’t the same reasoning explain why it’s impossible for a browser to reliably display web pages…?
Anyway, seeing that web browsers can do this, I could work with selecting text and clipping, if a URL came with it (in case I need more).
But that isn’t the case, even after a reboot. Screenshot attached.