MultiMarkdown files can have metadata. MultiMarkdown supports a subset of YAML metadata, and does not seem to complain about other YAML. YAML metadata in Markdown documents is used by prominent Markdown-using applications like the Pandoc converter and static site generators like Hugo and Jekyll. GitHub Pages files use it. In other words, there are lots of reasons one might write Markdown with metadata.
For example, one might use the following sort of metadata block (with YAML’s “---” delimiters being optional for MultiMarkdown):
---
title: "Document 2"
date: 2019-11-18T09:58:00-05:00
draft: false
tags: research, notebook
---
# Main Section
body material
Happily, DEVONthink’s MultiMarkdown processing respects such metadata by hiding it in previews, so it doesn’t clutter up the screen. My question is whether there is a way to expose some or all of these metadata fields to DEVONthink, so that it can read and act on them.
An application of this metadata visibility would be the following sort of automation:
A Markdown file is imported to DEVONthink.
Instead of processing the first line as the title (which gives it the title “---”), DEVONthink recognizes that the title value is “Document 2” and assigns that as the title.
Perhaps other fields are processed, or visible for scripting, like date or tags?
I imagine it might be possible to script this by calling a separate YAML processor like yq from within a shell script in a DT smart rule. I’m wondering whether there might be any more direct way to do it within DEVONthink.
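To get a feel for what such an external step would have to do, here is a minimal sketch in Python using only the standard library. It is not DEVONthink functionality and not a real YAML parser: it assumes flat `key: value` lines at the top of the file, and the function name `read_front_matter` is made up for illustration.

```python
def read_front_matter(text):
    """Return a dict of flat key/value pairs from a leading metadata block.

    Assumption: only simple `key: value` lines are handled; nested YAML
    structures are not understood.
    """
    lines = text.splitlines()
    # The "---" delimiters are optional for MultiMarkdown, so skip one if present.
    if lines and lines[0].strip() == "---":
        lines = lines[1:]
    fields = {}
    for line in lines:
        stripped = line.strip()
        # A closing delimiter or a blank line ends the metadata block.
        if stripped in ("---", "...", ""):
            break
        # Split at the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            break  # not a key/value line, so the body has started
        fields[key.strip().lower()] = value.strip()
    return fields


sample = """---
title: "Document 2"
date: 2019-11-18T09:58:00-05:00
draft: false
tags: research, notebook
---
# Main Section
body material
"""

meta = read_front_matter(sample)
print(meta["title"])  # the value still carries its YAML quotation marks
print(meta["tags"])
```

A smart rule could in principle call a script like this and hand the values back, but that wiring is left out here because it depends on how the rule is set up.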
FYI, if you don’t use the --- blocks, DT still hides metadata. Also it does recognize some MMD metadata, such as css: some-css-file.
That said, I too would love to be able to use MMD metadata better within DT. I have fiddled with scripting it, but the result is fragile: it depends on frequently reading and parsing the file, even if the headers haven’t changed. I guess that’s really the only way to access this data, but I wonder if it could be better if DT were more aware of it somehow.
-- Use MultiMarkdown metadata for record properties
property theKeys : {"title", "tags"}
tell application id "DNtp"
    try
        set windowClass to class of window 1
        if {viewer window, search window} contains windowClass then
            set currentRecord_s to selection of window 1
        else if windowClass = document window then
            set currentRecord_s to content record of window 1 as list
        end if
        repeat with thisRecord in currentRecord_s
            set theText to plain text of thisRecord
            set theValues to {}
            repeat with thisKey in theKeys
                set lineStart to thisKey & ": " as string
                set foundValue to false
                repeat with thisLine in paragraphs of theText
                    if thisLine starts with lineStart then
                        set end of theValues to my replaceString(thisLine, lineStart, "")
                        set foundValue to true
                        exit repeat
                    end if
                end repeat
                if foundValue = false then set end of theValues to ""
            end repeat
            set theName to item 1 of theValues
            if theName ≠ "" then set name of thisRecord to theName
            set theTags to item 2 of theValues
            if theTags ≠ "" then
                if theTags contains "," then
                    set theTagList to my createList(theTags, ",")
                else if theTags contains ";" then
                    set theTagList to my createList(theTags, ";")
                else
                    set theTagList to {theTags}
                end if
                set tags of thisRecord to (tags of thisRecord) & theTagList
            end if
        end repeat
    on error error_message number error_number
        if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
        return
    end try
end tell
on replaceString(theText, oldString, newString)
    local ASTID, theText, oldString, newString, lst
    set ASTID to AppleScript's text item delimiters
    try
        considering case
            set AppleScript's text item delimiters to oldString
            set lst to every text item of theText
            set AppleScript's text item delimiters to newString
            set theText to lst as string
        end considering
        set AppleScript's text item delimiters to ASTID
        return theText
    on error eMsg number eNum
        set AppleScript's text item delimiters to ASTID
        error "Can't replaceString: " & eMsg number eNum
    end try
end replaceString
on createList(theText, theDelimiter)
    set d to AppleScript's text item delimiters
    set AppleScript's text item delimiters to theDelimiter
    set TextItems to text items of theText
    set AppleScript's text item delimiters to d
    return TextItems
end createList
@pete31 Incredible work! Thanks very much indeed, as this works beautifully from my toolbar.
I am now working on a short script that removes surrounding quotation marks (in various formats), should they be present in the YAML. YAML values are often quoted, because colons break them, and authors love colons in titles.
This helps a lot. How would you modify this script if the list of tags in the Markdown header itself contains colons? This is a legacy of an old data management structure that I unfortunately cannot change. Is there a straightforward fix?
EDIT: Sorry, my problem was actually a different one (some of my exported metadata used tabs and some used spaces after the colon so I had to account for that). Now all works well.
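For anyone hitting the same two issues (tabs versus spaces after the colon, and colons inside the values), the general idea is to split each line at its first colon only and then trim whatever whitespace follows. The Python sketch below illustrates that idea outside DEVONthink; the helper names `split_field` and `split_tags` are hypothetical, and unlike the AppleScript above it splits tags on commas and semicolons alike, which is a simplification.

```python
import re


def split_field(line):
    """Split a metadata line at its first colon, tolerating tabs or spaces.

    Returns a (key, value) tuple, or None if the line contains no colon.
    """
    key, sep, value = line.partition(":")
    if not sep:
        return None
    return key.strip(), value.strip()


def split_tags(value):
    """Split a tag string on commas or semicolons, dropping empty entries."""
    return [tag.strip() for tag in re.split(r"[;,]", value) if tag.strip()]


# A tab after the colon, and a colon inside one of the tags.
field = split_field("tags:\tprojects:2019, inbox")
print(field)
print(split_tags(field[1]))
```

Because only the first colon acts as the separator, any later colons stay part of the value.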
Removing quotation marks from titles (where they wrap titles in YAML, but aren’t part of the names) just became a lot easier (for me) in version 3.5. It can now be done in a Smart Rule using regular expressions.
I’ve scanned the name for the regular expression ^[\"\'](.+)[\"\']$ which means: “Look for a name with single or double quotes at the beginning and end, and if you find that, capture the text between them.” Then I replace the name with the captured text, which is just \1, or in other words, the first capture-group.
So, this looks like:
One quirk is that renaming an item with @pete31’s AppleScript above doesn’t trip the “On Renaming” event in Smart Rules, though I think that would be the most natural trigger. So I’ve used the “On Moving” event to trigger it instead.
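The quote-stripping pattern can also be tried outside DEVONthink before it goes into a Smart Rule. This is only a sketch using Python's `re` module, whose regular expression dialect may differ in details from the one DEVONthink uses; the function name `strip_wrapping_quotes` is made up for illustration.

```python
import re

# Same pattern as described above: a straight single or double quote at each
# end, with the text in between captured as group 1.
QUOTED = re.compile(r"""^["'](.+)["']$""")


def strip_wrapping_quotes(name):
    """Return the name without one pair of surrounding quotes, if present."""
    return QUOTED.sub(r"\1", name)


print(strip_wrapping_quotes('"Document 2"'))
print(strip_wrapping_quotes("Notes on 'scare quotes'"))
```

Names that only contain quotes somewhere in the middle are left alone. Note that the pattern does not require the opening and closing quote to be the same character, and it does not match typographic (curly) quotes.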
In case you’re trying to run the script I posted in this thread you’ll find that it doesn’t work in DEVONthink 3.6.
That’s due to DEVONthink’s new handling of “invalid arguments”.
After the release of DEVONthink 3 I decided to continue to use “search window” in scripts so that DEVONthink 2 users could use them in, well, search windows. With version 3.6 that’s not possible anymore.
If you want to use the script you’ll have to replace this voluminous block …
set windowClass to class of window 1
if {viewer window, search window} contains windowClass then
    set currentRecord_s to selection of window 1
else if windowClass = document window then
    set currentRecord_s to content record of window 1 as list
end if
… with this neat line …
set currentRecord_s to selected records
… which does what the six lines have done. Wow, that’s great!
Is there a way to search for documents based on their Markdown front-matter? I have custom front-matter attributes (category and notetype) that I use to keep various “kinds” of notes separated and would (very much) like to be able to search for specific types of notes.
I enabled IndexRawMarkdownSource (using defaults write com.devon-technologies.think3 IndexRawMarkdownSource -bool TRUE) and got almost where I wanted to be. That is, about ⅓ of notes matching a given search showed up. After I edited and saved one of the notes that didn’t show up in the search results, it started to appear, so I thought that perhaps the index wasn’t up to date. After rebuilding the database and repeating the search, all notes matching the search now show up.
This is exactly what I was looking for. Thank you (!). I’ve been using DT for years now, and still have many (almost daily) moments where I think to myself, “How did I ever manage without DEVONthink?”. DT (v3, especially) is a superb product, and I, for one, am very grateful for its existence (and for the developers behind it).
I have many (~12K) “Zettelkasten-style” notes that use various custom Markdown front-matter attributes such as ‘category’ and ‘notetype’. For example, I might have a note with the following front-matter:
After enabling IndexRawMarkdownSource (as per previous post), I can now search for notes using queries such as category:howto, notetype:macos, or category:howto notetype:macos.
Note that you may have to rebuild your database(s) to get the Markdown front-matter fully indexed.
I seem to have run into a bug with DT (Pro, v3.9.4) with indexing done by IndexRawMarkdownSource. I regularly work on two machines (MacBook and iMac). Notes created directly on a particular machine (MacBook, say) are properly indexed based on Markdown front-matter. However, once those same notes sync to the other machine (iMac, in this case), searches based on Markdown front-matter don’t find the new notes.
The only solution I’ve found so far is to rebuild my “Notebox” database on the “other” machine. This “rebuilding” process is becoming tedious and this behaviour is, to my mind, a bug.
I’m willing to work with the DT folks to get to the bottom of this / see a fix provided, if needed.