Using DEVONthink to generate a blog directly?

mhucka · August 21, 2021, 6:20pm

I have a dream of restarting a blog. The fact that DEVONthink now supports MathJax and citations/references (two critical features for an academic like me) leads to the idea of using DEVONthink as the blogging tool itself: I could organize and write postings in DEVONthink, and use File ▹ Export ▹ as Website… to generate the pages of the blog.

I’m also sure I’m not the first person to think of doing this. Have others among you tried this? How are you doing it? Would you care to share your workflow and tips (and lessons about what not to do)?

chrillek · August 22, 2021, 9:39am

I for one never thought about doing this. But that might be due to a lack of imagination or some other limitation on my side.

I had a blog running on WordPress, which made me feel uncomfortable (WordPress, that is). After I converted to Hugo (a static website generator), I never looked back. WP is an unwieldy beast with far too many dependencies (and in parts still using jQuery, which always made me shudder). Hugo unfortunately suffers from bad documentation and a community that is far less helpful than the one here.

But it is fast, very easy to customize, and easy to make do what you want. You have (or at least I seem to have) far more control over styles, JavaScript, etc. than with WP. I use Visual Studio Code to write my MD files, which allows me to integrate seamlessly with GitHub and work on the same stuff on different machines. Also, to update the site, I can run a git pull on the server followed by a call to hugo and am done. All that via ssh, so safe enough.

I apologize that this was no direct answer to your question but only another point of view. Just PM me if you want to know more about the details.

mhucka · August 22, 2021, 6:38pm

Thanks for your reply and ideas. As it turns out, I have some experience with other options too. I currently use Hugo for a site that I created and maintain for a long-running project (complete with custom theme, custom Hugo shortcodes, custom icons, etc.). I also experimented with Pandoc for a simple note/blogging system that I developed; originally that was partly to create a site to summarize some research done at the time, and partly with an eye towards using it for personal blogging … which I never ended up doing. More recently, I started writing documentation for software projects using a framework developed by the Executable Book Project; this uses a Python-based document generator called Sphinx together with a Markdown flavor called MyST. I’ve found this to have an excellent combination of features, to the point where I’m basically trying to decide whether to use that or figure out a way to generate blog output directly from DEVONthink.

As I’m sure you know, a lot of the options have limitations that one doesn’t encounter until actually trying to use them for real. For instance, Hugo looks nice and powerful, but when you want to customize something, you discover it’s an uphill battle to figure out how to do it. After creating Pangolin notebook, I decided in the end that my own simple Pandoc-based system is too simple and needs more work to add features that other blogging frameworks provide already – and does the world really need another blogging framework? And finally, JupyterBooks+MyST looks great, but its Markdown syntax for citations and other things is different from the MultiMarkdown syntax that DEVONthink uses, so I can’t simply take documents I write in DEVONthink and use them without modification with MyST or vice versa.

Currently, I put all my notes in DEVONthink. I’d like to share some of what I learn and write, but also want to (1) avoid having to keep two versions of the same content and (2) make the process require as little of my time as possible. That led to the hope that I write everything only once, in DEVONthink, and use that to produce the blog output. After considerable exploration, I’ve narrowed my options to two main approaches:

1. Use DEVONthink’s ability to convert/export to HTML (including its built-in ability to customize the website template it uses to do that), maybe post-process the results slightly, and use that as the final blog pages.
2. Use Sphinx/MyST to generate the blog pages, but still keep the written text in DEVONthink by storing content in a folder that is indexed by DEVONthink. This presents two suboptions:

a) Write in MultiMarkdown-flavored markdown and post-process those markdown files to convert them to MyST flavor.

b) Write in MyST-flavored markdown directly, and accept that some constructs won’t be interpretable when viewed in DEVONthink.

Option #1 would produce a less feature-full blog, but would (maybe?) be fastest overall. I’m curious if anyone has tried that and what their experiences are.

meta · January 7, 2023, 7:37pm

I had this idea too.

In an ideal world, I’d keep all my web site content as Markdown in a DEVONthink database. I’d then use some sort of API to synchronize the DEVONthink items into the filesystem, with the metadata from DEVONthink as TOML frontmatter, followed by the body. Then I’d run Hugo to build the actual web site.

It feels like this ought to be possible, but it doesn’t quite seem feasible at present. I can export Markdown files to the filesystem, but the metadata isn’t included (not even as a sidecar file). Maybe I could try scripting? Not sure if it’s possible to fetch documents in reverse date order – I’ve got about 3,000 pages right now, so exporting them all every time through a script probably wouldn’t work very well.
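One way to avoid re-exporting every page on each run would be to remember when the last export happened and pick up only what changed since then, newest first. A minimal sketch in plain JavaScript, assuming each document has already been reduced to a plain object; the field names `name` and `modified` and the function `changedSince` are made up for illustration, and reading them out of DEVONthink is not shown:

```javascript
// Sketch only: `records` holds plain objects, not DEVONthink records.
// The fields `name` and `modified` are assumptions for illustration.
function changedSince(records, lastExport) {
  return records
    .filter(r => r.modified > lastExport)      // keep only records touched after the last run
    .sort((a, b) => b.modified - a.modified);  // newest first
}

const records = [
  { name: "Old post", modified: new Date("2022-11-01T10:00:00Z") },
  { name: "Fresh post", modified: new Date("2023-01-06T09:00:00Z") },
  { name: "Edited post", modified: new Date("2023-01-05T18:30:00Z") },
];
const lastExport = new Date("2023-01-01T00:00:00Z");
const toExport = changedSince(records, lastExport);
// toExport contains "Fresh post", then "Edited post"
```

With a filter like this, the cost of a run depends on how much was edited, not on the size of the whole database.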

chrillek · January 8, 2023, 11:13am

As much as I like DT, I don’t follow this path. I have my blog/web pages in the filesystem under git control, creating the HTML with Hugo. Also, because the blog hosts a lot of photos that have no point in being in DT at all. And I find VS Code a nicer environment for editing MD than DT. Especially with the plug-ins for spell-checking, linking and such.

Having said that, it should be neither too difficult nor too slow to export your MD files into the filesystem while adding front matter with a script. Can you provide a sample of the “before” and “after export” states that shows which metadata you’d want in the front matter?

meta · January 8, 2023, 10:38pm

I don’t think I want to go this route either. A typical Hugo post with TOML frontmatter would look like:

+++
title = "Why I use DEVONthink"
description = "A sub-heading"
date = 2022-12-03T16:47:32-06:00
url = "/2022/12/something/"
category = "Technology"
keywords = ["apple", "macos"]
+++

Delectus consequatur qui unde aut aut tempore aut et. 
Ad aliquid eos ipsam. Expedita ex aut alias aut non 
fugiat consequatur. Perferendis sint sed vel voluptatem 
officiis.

Other than the URL slug, I think it ought to be possible to pull all that info from DEVONthink metadata easily. The last component of the slug could maybe be made from the basename of the path as reported in DEVONthink, downcased and with spaces replaced with dashes?
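The slug rule described here (basename, downcased, spaces replaced with dashes) is small enough to sketch. This is only an illustration: `slugFromPath` is a hypothetical helper, and the sample path is invented.

```javascript
// Hypothetical helper: derive the last URL component from a path such as
// "/Blog/2022/Why I use DEVONthink.md" (the path itself is a made-up example).
function slugFromPath(path) {
  const base = path
    .split("/")
    .pop()                      // keep only the last path component
    .replace(/\.[^.]+$/, "");   // drop a trailing file extension, if any
  return base.trim().toLowerCase().replace(/\s+/g, "-");
}

const slug = slugFromPath("/Blog/2022/Why I use DEVONthink.md");
// slug is "why-i-use-devonthink"
```

The sketch does exactly what the paragraph describes and nothing more: punctuation is left untouched, and two documents with the same name would produce the same slug.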

If the file were placed in a bundle directory, so content/post/2022/12/something/index.md in this example, it would theoretically be possible to pull associated images into the same directory, at the cost of scanning the body of the post. That would make it possible to use DEVONthink as pretty much a complete Hugo content management environment.
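The "scanning the body" step could look roughly like this. It is a sketch under assumptions: only inline Markdown images of the form `![alt](target)` are recognized, anything with a URL scheme is treated as remote and skipped, and `localImages` and `bundleDir` are hypothetical names. The directory layout is the one from the example path above.

```javascript
// Sketch: collect local image targets from a Markdown body.
// Reference-style images, HTML <img> tags and targets containing spaces are not handled.
function localImages(markdown) {
  const pattern = /!\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)/g;
  const found = [];
  for (const m of markdown.matchAll(pattern)) {
    const target = m[1];
    // Skip anything that starts with a scheme, e.g. https: or x-devonthink-item:
    if (!/^[a-z][a-z0-9+.-]*:/i.test(target)) found.push(target);
  }
  return found;
}

// Assumed layout: content/post/YYYY/MM/slug (year and month taken in UTC).
function bundleDir(date, slug) {
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, "0");
  return `content/post/${year}/${month}/${slug}`;
}

const body = 'Intro ![A cat](cat.jpg) and ![remote](https://example.com/x.png)\n' +
             '![Chart](img/chart.png "Sales")';
const images = localImages(body);
const dir = bundleDir(new Date("2022-12-03T16:47:32-06:00"), "something");
// images is ["cat.jpg", "img/chart.png"], dir is "content/post/2022/12/something"
```

Copying the listed files into `dir` next to `index.md` would then be a separate step.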

I’m still not sure if this is a great idea or a terrible one.

BLUEFROG · January 9, 2023, 2:36am

Only one way for you to find out…

chrillek · January 9, 2023, 11:29am

No. It’s a crutch. Excuse my French. DT is not a content management system, since it even lacks the simplest necessities like automatic versioning (yes, I know that one can do versioning somehow. It’s still not a versioning system).

What’s the point of having your files in DT if you have to pipe them through a script only to have them in Hugo then?
And not only the text files, but the images, too?
What’s the point of trying to use DT’s minimalistic MD editor if you can have a fully functional one with Hugo integration by simply using VS Code?
And as you’ll certainly have noticed, there are real CMS out there supporting Hugo.

No. Not at all.
Having a description that is the first 2nd level heading is a bad idea (SEO-wise). The path in DT is mostly meaningless, and its last component need not be unique. So grabbing the Hugo folder name from it is a bad idea. Scanning the body of the post to get at the images to then copy them over to the page bundle in Hugo is a bad idea, too: Copying large amounts of data is costly (in terms of time), and why, oh why, would you want to duplicate the images to another part of your disk?

Regardless, I append the stump of a JavaScript script that illustrates how you might go about what you’re trying to do. It should also illustrate that it’s not a process that will lead to flawless front matter in every case. The script currently does nothing except write the title it found in the MD file(s) to DT’s log. It’s meant to be used in the “Execute script” action of a smart rule. I didn’t bother to code the output part nor the “scan image” stuff. Those are not too complicated, but in my mind the whole exercise is futile. Using VS Code with Git and the plug-ins for MD and Hugo templates should do the trick nicely.

function performsmartrule(records) {
  const app = Application("DEVONthink 3");
  app.includeStandardAdditions = true;
  records.forEach(r => {
    const title = getTitle(r);
    const description = getDescription(r); // not used yet: the output part is not written
    app.logMessage(title); // for now, only log the title
  });
}


function getTitle(r) {
  // If the text contains a '# title'  line, return everything after '# '
  // Else, if a `title` metadata field exists, return its value
  // Otherwise, return the record's name
  const level1heading = r.plainText().match(/^# (.*?)$/m);
  if (level1heading) return level1heading[1];
  const title = r.metaData() && r.metaData()['kMDItemTitle'];
  if (title && title.length) return title;
  return r.name();
}

function getDescription(r) {
  // If the text contains a 2nd level heading, return the first one
  // Else, if a 'description' metadata field exists, return its value
  // Else, if the text contains a '<!--more-->' marker, return everything before it
  // Otherwise, return the empty string
  const level2heading = r.plainText().match(/^## (.*?)$/m);
  if (level2heading) return level2heading[1];
  const description = r.metaData() && r.metaData()['kMDItemDescription'];
  if (description && description.length) return description;
  // [\s\S]*? also matches line breaks, so the marker may appear on any line
  const moreMarker = r.plainText().match(/^([\s\S]*?)<!--more-->/);
  if (moreMarker) return moreMarker[1];
  return '';
}
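The output part that the stub leaves out could start from something like the following. It is a sketch in plain JavaScript that has not been run inside DEVONthink; `tomlValue` and `frontMatter` are hypothetical helpers, only strings, string arrays and dates are handled, and keys are assumed to be valid bare TOML keys.

```javascript
// Sketch: turn a flat object into Hugo-style TOML front matter.
function tomlValue(v) {
  if (v instanceof Date) return v.toISOString();  // unquoted date-time, written in UTC
  if (Array.isArray(v)) return "[" + v.map(tomlValue).join(", ") + "]";
  // JSON string escaping is used as an approximation of TOML basic strings;
  // that is an assumption which holds for ordinary text.
  return JSON.stringify(String(v));
}

function frontMatter(fields) {
  const lines = Object.entries(fields)
    .filter(([, v]) => v !== undefined && v !== null && v !== "") // skip empty fields
    .map(([k, v]) => `${k} = ${tomlValue(v)}`);
  return ["+++", ...lines, "+++", ""].join("\n");
}

const fm = frontMatter({
  title: "Why I use DEVONthink",
  description: "",   // empty, so it is left out
  date: new Date("2022-12-03T16:47:32-06:00"),
  keywords: ["apple", "macos"],
});
```

In a smart rule, the values would come from `getTitle(r)` and `getDescription(r)`, and the result would be written in front of the record's text when exporting.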

meta · January 10, 2023, 1:01am

What’s the point of having your files in DT if you have to pipe them through a script only to have them in Hugo then?

DT provides great search facilities, Hugo provides much more flexibility in turning the content into a web site.

If there was no value in being able to dump out a DEVONthink database as a web site, it wouldn’t have that feature.

What’s the point of trying to use DT’s minimalistic MD editor if you can have a fully functional one with Hugo integration by simply using VS Code?

Well, for starters I can edit anywhere via DEVONthink To Go, and there’s better search. I also think that to count as “Hugo integration” you need more than just syntax highlighting and the ability to run Hugo. Like, a GUI for creating pages that doesn’t require knowing where to save them and how to name them, for example, and a form to fill out instead of editing TOML.

And as you’ll certainly have noticed, there are real CMS out there supporting Hugo.

Well, DatoCMS is €199 a month, CloudCannon is $45 a month, ButterCMS is $83 a month. Kontent is so expensive that they don’t list prices on their web site. Forestry is being shut down, replaced by Tina which only works fully with React-based sites. Strapi would apparently require starting again from scratch. Netlify and Sanity look like they only deploy to their servers. Hokus is still alpha status.

In all seriousness, I’d love to know what options I’m missing that are reasonable for a large personal web site or a tiny business.

chrillek · January 10, 2023, 9:35am

First off, I didn’t really evaluate the usage of a CMS for Hugo – no need for me, given the small number of pages I’m dealing with. So, I can’t comment on that.

“Dumping” DT records as a website is something I never tried, and I have no idea how useful that really is.

Yes, Hugo integration is more than syntax highlighting. But DT does not even have that. It also includes the possibility to drag an image onto your post to get a usable link, to get autocompletion when typing a link, to use snippets, or to have everything in one place: If I mistype a template’s name in VSC, I will immediately see that in Hugo’s output. If I mistype it in DT, I don’t see anything before exporting. After that, I’d have to re-import the changed file into DT… Maybe indexing would be a better approach to avoid round-tripping?

I seem to remember that there was a Hugo CMS plug-in for VSC that attempted to ease creation and management of pages/bundles, but that was far too much for me, too. And too intrusive (I’m more a keyboard person).

Back to DT: I’d suggest that you try your approach with a subset of pages. I.e. write MD in DT and then export it to your Hugo repository with a script. Perhaps it works well enough for you, and I stand corrected.