BlueSky posts (HTML) not rendering or displaying in DTP

I added this page as a link document with the sorter and included some notes. However in DTP it shows as either blank or as HTML source. I can’t get it to display the actual post.

(Links aren’t allowed in posts, so I removed it. It was just a link to a Bluesky post on the web.)

Thoughts?

Welcome @thor :slight_smile:

What is a “link document”? Please be more specific. What is its Kind?

You have probably not enabled JavaScript in Preferences > Web.


Thank you!

Sorry, not super familiar with the terminology in DTP. It says “HTML Text” as Kind in the Inspector.

JavaScript is enabled in Preferences → Web.

Most documents I put in DTP, regardless of Kind, show a preview or display in the viewer. Web, PDF, text, images, spreadsheets, PowerPoint decks, email, etc. This was the first time I’ve seen a document not display (I can see the raw HTML text if I choose, but that doesn’t help).

It’s not a huge issue, but since I’m still learning DTP I figured it might be something on my end I could change so that it shows up correctly.

It could be that Bluesky does not want you to make copies like that, or that it has malformed HTML/CSS/JavaScript.

I don’t know. I just tried capturing a post from Twitter (X) via a browser (HTML), and it displays correctly in DTP.

So it may be something with BlueSky that isn’t working correctly. They say that they are JavaScript heavy, although it is enabled on my system, and it displays correctly in a browser, just not in DTP.

If you’ve saved the HTML to DT, you should be able to open it in Safari. Does that show you the correct HTML? Does the console in the developer tools display any errors?

The interesting parts of a Bluesky HTML document look like this:
<script defer="defer" src="/static/js/792.36c6a7b2.js"></script>
Now, what can we expect to happen if a local HTML file contains such a line?

Drumroll – nothing. Nada. Zilch. The script is supposed to be located at /static/, and that location does not exist on the local machine. OTOH, if you look at the document when it’s coming from Bluesky, this URL will be resolved to
https://bsky.app/static/js/792.36c6a7b2.js
which is a perfectly fine location – the browser knows how to load that. But the (expletive) programmers over at BS didn’t think to set the base element in their HTML. Stupidity? On purpose?
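
To make the resolution step concrete, here is a minimal Python sketch using the standard library's `urljoin`. The post URL and the local file path are made-up placeholders; only the script path and the bsky.app host come from the HTML above. Browsers follow the same general rule: a path starting with `/` replaces the whole path of the document's base URL.

```python
from urllib.parse import urljoin

# Script reference copied from the saved Bluesky HTML.
SCRIPT_SRC = "/static/js/792.36c6a7b2.js"

# Hypothetical locations of the same document (placeholders, not real paths).
REMOTE_PAGE = "https://bsky.app/profile/example/post/123"
LOCAL_PAGE = "file:///Users/me/Databases/post.html"


def resolve(page_url: str, reference: str) -> str:
    """Resolve a reference against the URL the page was loaded from."""
    return urljoin(page_url, reference)


remote = resolve(REMOTE_PAGE, SCRIPT_SRC)
local = resolve(LOCAL_PAGE, SCRIPT_SRC)

print(remote)  # points at bsky.app, where the script really lives
print(local)   # points at /static/... on the local disk, which does not exist
```

With a base element pointing at bsky.app, the second case would resolve like the first. Whether the page would then actually render from a local file is a separate question that this sketch does not answer.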

In any case, no browser has a chance to display this HTML locally. I tried with Safari and FF – the important thing, namely the BS logo, appears, of course. Marketing rulez.
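
If you want to check whether some other saved page has the same problem, something like the following rough sketch could help. It only looks for root-relative `src`/`href` values on `script`, `link`, and `img` tags and for a base element; the class and function names are my own, and a real page can break locally for other reasons too.

```python
from html.parser import HTMLParser


class LocalLoadCheck(HTMLParser):
    """Collects root-relative resource URLs and notes whether <base href> exists."""

    def __init__(self):
        super().__init__()
        self.has_base = False
        self.root_relative = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "base" and attrs.get("href"):
            self.has_base = True
        if tag in ("script", "link", "img"):
            for name in ("src", "href"):
                value = attrs.get(name)
                # "/x" is root-relative; "//host/x" is protocol-relative, so skip it.
                if value and value.startswith("/") and not value.startswith("//"):
                    self.root_relative.append(value)


def check(html_text):
    """Return (has_base, list of root-relative resource references)."""
    parser = LocalLoadCheck()
    parser.feed(html_text)
    return parser.has_base, parser.root_relative


sample = (
    '<html><head>'
    '<script defer="defer" src="/static/js/792.36c6a7b2.js"></script>'
    '</head><body></body></html>'
)
has_base, broken = check(sample)
print(has_base, broken)
```

A page that reports root-relative references and no base element will look for those files at the root of the local disk when opened from a file.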

That’s another good illustration why trying to use HTML as a local archiving format is not a good idea.


All good! “Link document” is not a term used anywhere in DEVONthink – and more generally, it is a vague term that could mean many different things :slight_smile: Communication is much easier with precise terms.

I’ve never visited Bluesky and just made a quick reply from my phone this morning… I just gave it a go to see what result I would get, but chrillek beat me to it. He has a much deeper technical understanding than me in this area anyways. I did notice it was a pretty short HTML document that seemed to load content dynamically with a script, but the missing base element didn’t immediately jump out to me.

May I ask why you choose HTML as the capture format? I would probably prefer PDF here.

If HTML is a priority to you, here are some possible solutions that work on my end:

  1. Load the URL in DEVONthink’s built-in web browser and capture it from there as a Formatted Note, either through the menu (Tools > Capture > Formatted Note) or through the Action Menu (the cogwheel) in the Navigation Bar.
    • Since the site is dynamic, this ensures the content is loaded in DT before capture. (A bare HTML Page still doesn’t work.)
    • Web Archive is also an option, but this captures much more than you need, and the size difference is substantial. 328,6 KB vs. 8,1 MB (!) for a random example (a post with a link card to an article, including an image, with 12 replies).
  2. Select the relevant part of the page, and use the “DEVONthink 3: Capture Web Archive” System Service.
    • The file size here is much smaller than capturing the whole page as a Web Archive. Formatted Note is still smaller, though – 328,6 KB vs. 533,8 KB.
    • I’m not sure this works in browsers other than Safari.
  3. Use the SingleFile browser extension. It might be possible to optimize the configuration for Bluesky, but without adjusting anything I get a file size of 1,6 MB for the same example.

For PDF, I get the best result with Safari’s File > Export as PDF. The potential downside is that it loses a lot of the embedded links (like the timestamped links to individual posts/replies). Printing as PDF keeps most (all?) of the links, but messes up the layout somewhat.

I also recommend reading this post by @chrillek:


Hi,

It appears that when I try to preview this specific file in DTP or open it in a window, it’s blank and points to a file on disk that doesn’t seem to load. If I open it in a browser, same problem. If I tell it to open the URL (which is in the URL part of the record in DTP), it works fine because it goes to the actual BlueSky URL rather than the file.

I chose HTML as it was an HTML document and it seemed a good way to capture a web page or post for use later. I’m not tied to using HTML at all, but not sure how PDF would improve it. I went back to the BlueSky post in a browser, used the sorter to grab the URL and this time selected save as PDF. It’s still thinking it over, meaning most likely it failed. Not sure if it’s the website that doesn’t like being printed, or the sorter, or DTP. (Revisiting DTP before hitting send here, it finally did complete, however the resulting file has no content, looks like a “loading” icon on a solid background only. So saving to PDF in this case won’t work).

I have, as mentioned earlier, mainly just left things in the format I found them, so my database has a lot of PDFs, Excel spreadsheets, Word Documents, Keynote presentations, plain and rich text documents, as well as URLs to interesting or relevant articles I come across and want to save to integrate into a presentation, blog post or otherwise use.

Based on what I read in “Take Control of DEVONthink 3”, this appeared to be the recommended approach. If I should instead be converting the web pages I find online into PDFs, I can of course do that.

This specific BlueSky post had a few images as part of it, which I wanted to ensure get included. I’d rather not use a Web Archive as that would take far too much space.

Thanks.