Proof of concept: Merging a mixture of RTF/RTFD/MD files to MD and with links

ngan · April 26, 2020, 7:52pm

My objectives:

I take notes as snippets of information (using my stack script). In previous years those notes were saved in RTF format, and for the past few months they have been saved in markdown format. I want to find a way to do the following:

I want to consolidate snippets in different combinations and in a dynamic manner, and I want to merge snippets of different file formats into markdown format. DT can only merge files into RTFD when the selected files are in different formats.

I want to create a merged-view workflow that mimics the very basic features of Scrivener and Roam Research. That means I can consolidate the snippets in a single document and jump to edit each section (almost) directly from the merged document. I also want to see the tags of each section/snippet and be able to access the content of those tags conveniently. MM6 is not available in DT yet.

Even if MM6 becomes available in DT in the future, I would still need to convert thousands of RTF/RTFD files into MD format to solve the above problems, and I would probably lose all the important highlighting of keywords and sentences in the process. I am thankful to @pete31 for posting a wonderful script to convert RTF into MD, but I still want to find an alternative that doesn’t require converting thousands of files.

I discovered that the markdown editor of DT has a very interesting feature (at least to me) that will allow me to achieve my objectives without needing to convert any files.

Proof of concept

I select a list of notes in different formats.
Note: How the files are selected is a refinement I can add to the script later. For example, the selection could be the sections of a chapter, the items under some keyword tags, or the results of a search.

The script will produce this temporary view of merged files, in markdown format, that is saved in a pre-defined group.

Here’s the interesting feature of DT’s markdown editor (at least to me). If we look at the raw content of this merged view, we can see that some areas are not markdown text. In fact, they are the HTML source of the RTF notes, yet DT’s markdown editor is still able to preview the HTML source of RTF/RTFD and retain all links with a markdown look and feel.

The problem is that it is very unpleasant to edit in such a mess of HTML code! However, the script adds a [[…]] link at each section, pointing to the original snippet.

All I need to do is click the link and edit the original note directly regardless of the file format. And I can return to the merged view by clicking the back button of the document window.

Section that is in MD format:

Section that is in RTF/RTFD format:

At the bottom of each section/snippet, the tags of that snippet are also listed. I can command-click on the tags and keep generating new temporary merged views of different combinations of files by running the script again.

REMINDER

The first priority of this script is reviewing information. Dynamic editing is a complementary feature.
This is really just a proof of concept. I am showing that the current functions of DT can already produce a somewhat dynamic and consolidated view of multiple records.
Dynamic refresh of a merged view is programmable and not too challenging. That will be my next task.
Cons: a merged view built from notes in mixed formats will never look as nice as one composed of 100% markdown files - but it is time-saving.
Although the script is just a proof of concept, it is already working well for my purpose.

The script: A group named “MergeView” must exist at the root level to hold the merged files.

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

-- by ngan 2020.04.26
--v1b1


property MVGpLocation : "/MergeView"
property MVNameFormat : "YMDHNS"

-- don't change this
property rtfHeader : "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">"

tell application id "DNtp"
	
	set theRecords to selection
	set theRecordCount to length of theRecords
	
	set theContent to ""
	repeat with each in theRecords
		-- set temporary name for dynamic merged view
		set theRecordName to "" & my getDateString(each's modification date, "YMD", ".") & "   " & "[[" & each's name & "]]"
		
		-- get the tags of each record and wrap each one as a [[wikilink]]
		set eachRecordTags to name of (parents of each whose tag type is ordinary tag)
		repeat with i from 1 to length of eachRecordTags
			set eachRecordTags's item i to "[[" & (eachRecordTags's item i) & "]]  "
		end repeat
		if eachRecordTags is not {} then
			set eachRecordTags to my listToStr(my sortlist(eachRecordTags), "  ")
		else
			set eachRecordTags to ""
		end if
		--prepare the content of the dynamic merged view
		if type of each is in {rtf, rtfd} then
			set theRTFSource to my findAndReplaceInText(source of each, rtfHeader, "")
			set theContent to theContent & theRecordName & return & theRTFSource & return & return & "Tags " & eachRecordTags & return & return & "---" & return & return
		else if type of each is markdown then
			set theContent to theContent & theRecordName & return & return & plain text of each & return & return & return & "Tags " & eachRecordTags & return & return & "---" & return & return & return
		else
			set theRecordCount to theRecordCount - 1
		end if
	end repeat
	
	set theMVName to "MV - " & (my getDateString(current date, MVNameFormat, ".")) & " (" & theRecordCount & " Items)"
	set theNote to create record with {name:theMVName, source:theContent, type:markdown} in (get record at MVGpLocation)
	
	open tab for record theNote
end tell


on getDateString(theDate, theDateFormat, theSeperator)
	tell application id "DNtp"
		local y, m, d, h, n, s, T
		local lol, ds
		set lol to {{"y", ""}, {"m", ""}, {"d", ""}, {"h", ""}, {"n", ""}, {"s", ""}}
		
		set (lol's item 1)'s item 2 to get year of theDate
		set (lol's item 2)'s item 2 to my padNum((get month of theDate as integer) as string, 2)
		set (lol's item 3)'s item 2 to my padNum((get day of theDate) as string, 2)
		set T to every word of (get time string of theDate)
		set (lol's item 4)'s item 2 to T's item 1
		set (lol's item 5)'s item 2 to T's item 2
		set (lol's item 6)'s item 2 to T's item 3
	end tell
	
	set ds to {}
	set theDateFormat to (every character of theDateFormat)
	repeat with each in theDateFormat
		set ds to ds & (my lolLookup(each as string, 1, 2, lol))'s item 2
	end repeat
	
	return my listToStr(ds, theSeperator)
	
end getDateString
on padNum(lngNum, lngDigits)
	-- Credit houthakker
	set strNum to lngNum as string
	set lngGap to (lngDigits - (length of strNum))
	repeat while lngGap > 0
		set strNum to "0" & strNum
		set lngGap to lngGap - 1
	end repeat
	strNum
end padNum
on lolLookup(lookupVal, lookUpPos, getValPos, theList)
	--only for list of list with more than 1 items
	local i, j, k
	set j to lookUpPos
	set k to getValPos
	repeat with i from 1 to length of theList
		if (item j of item i of theList) is equal to lookupVal then return {i, item k of item i of theList, item i of theList}
	end repeat
	return {0, {}, {}}
end lolLookup
on findAndReplaceInText(theText, theSearchString, theReplacementString)
	set AppleScript's text item delimiters to theSearchString
	set theTextItems to every text item of theText
	set AppleScript's text item delimiters to theReplacementString
	set theText to theTextItems as string
	set AppleScript's text item delimiters to ""
	return theText
end findAndReplaceInText
on listToStr(theList, d)
	local thestr
	set {tid, text item delimiters} to {text item delimiters, d}
	set thestr to theList as text
	set text item delimiters to tid
	return thestr
end listToStr
on sortlist(theList)
	set theIndexList to {}
	set theSortedList to {}
	repeat (length of theList) times
		set theLowItem to ""
		repeat with a from 1 to (length of theList)
			if a is not in theIndexList then
				set theCurrentItem to item a of theList as text
				if theLowItem is "" then
					set theLowItem to theCurrentItem
					set theLowItemIndex to a
				else if theCurrentItem comes before theLowItem then
					set theLowItem to theCurrentItem
					set theLowItemIndex to a
				end if
			end if
		end repeat
		set end of theSortedList to theLowItem
		set end of theIndexList to theLowItemIndex
	end repeat
	return theSortedList
end sortlist
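
One caveat about getDateString: it splits the `time string` of the date into words, and that string follows the system's time format. With a 12-hour format the hour would come out unpadded and without AM/PM. The following is only a sketch of an alternative that derives the parts arithmetically from `time of theDate` (seconds since midnight). The handler names `padTwo` and `timeStamp` are hypothetical and not part of the script above.

```applescript
-- Hypothetical helper: left-pad a non-negative integer below 100 to two digits.
on padTwo(n)
	return text -2 thru -1 of ("0" & (n as integer))
end padTwo

-- Hypothetical helper: build "Y.M.D.H.N.S" (24-hour) without parsing the time string.
on timeStamp(theDate, sep)
	set secs to time of theDate -- seconds since midnight
	set h to secs div 3600
	set n to (secs mod 3600) div 60
	set s to secs mod 60
	set y to (year of theDate) as text
	set m to my padTwo((month of theDate) as integer)
	set d to my padTwo(day of theDate)
	return y & sep & m & sep & d & sep & my padTwo(h) & sep & my padTwo(n) & sep & my padTwo(s)
end timeStamp

my timeStamp(current date, ".")
```

Because the result is always zero-padded and in 24-hour form, names built this way should also sort chronologically.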

pete31 · April 26, 2020, 9:10pm

Want to try but it doesn’t compile

ngan · April 26, 2020, 9:15pm

Hmm… Not sure what’s happening.

I also use SD and the script compiles OK there. I have just re-pasted the script into the first post. There are many rough edges at this stage (my amateur work again!)

Or is eachRecordTags {}? My script didn’t check for an empty list… I have updated the script with a check for the empty list.

pete31 · April 26, 2020, 9:58pm

It’s a terminology conflict with Satimage scripting addition (Suite “Array and List Utilities”). After commenting out the line that stopped it from compiling I could see that the handler color was different:

Changing the handler name solved that conflict, but if you use Script Debugger it’s enough to add an underscore at the end of the handler name. Script Debugger then automagically resolves the terminology conflict by putting the name in pipes when you compile the script.
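
For anyone not using Script Debugger, the pipes can also be typed by hand. The sketch below assumes the conflicting name is `sortlist` (the thread does not say which handler it was) and uses a simplified stand-in handler rather than the one from the first post. The pipes tell the compiler to treat the name as a user identifier instead of scripting-addition terminology.

```applescript
-- Stand-in handler (an assumption for illustration): a simple insertion sort on text.
on |sortlist|(theList)
	set sortedList to {}
	repeat with anItem in theList
		set itemText to anItem as text
		-- find the first position whose item sorts after itemText
		set insertAt to (length of sortedList) + 1
		repeat with i from 1 to length of sortedList
			if itemText comes before (item i of sortedList) then
				set insertAt to i
				exit repeat
			end if
		end repeat
		if insertAt is 1 then
			set sortedList to {itemText} & sortedList
		else if insertAt > (length of sortedList) then
			set end of sortedList to itemText
		else
			set sortedList to (items 1 thru (insertAt - 1) of sortedList) & {itemText} & (items insertAt thru -1 of sortedList)
		end if
	end repeat
	return sortedList
end |sortlist|

-- Calls must use the piped name as well.
my |sortlist|({"[[work]]", "[[ideas]]", "[[notes]]"})
```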

I had read about this before, but this is the first time it has happened to me.

Very nice script!

ngan · April 26, 2020, 10:00pm

Thanks for the tips!