EDIT: For the latest working script and examples of converted files scroll to my latest posts.
Hello everyone!
I have a bash script which converts MD to PDF using pandoc. I also automated it for Path Finder using AppleScript and Keyboard Maestro. But I want to do it right in Devonthink. I’ve seen the script in DTPO, converting files to PDF, but don’t know how to make it work with the shell script I have. Here it is:
$ export PATH=/Library/TeX/texbin:$PATH
$ /usr/local/bin/pandoc "Path to MD file" -s -o "Path for output PDF file" --pdf-engine=xelatex --toc
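For anyone who wants to reuse the two lines above for arbitrary files, here is a minimal sketch of a wrapper. The function names pdf_path_for and md_to_pdf are hypothetical and only there for illustration; the pandoc flags are the ones from the command above. It assumes the PDF should land next to the Markdown file.

```shell
#!/bin/sh
# Make the TeX binaries visible, as in the original command.
export PATH=/Library/TeX/texbin:$PATH

# Derive the output path by swapping a trailing .md for .pdf
# (hypothetical helper; a file without .md just gets .pdf appended).
pdf_path_for() {
    printf '%s\n' "${1%.md}.pdf"
}

# Convert one Markdown file; quoting "$1" keeps spaces in the path intact.
md_to_pdf() {
    /usr/local/bin/pandoc "$1" -s -o "$(pdf_path_for "$1")" --pdf-engine=xelatex --toc
}
```

Called as md_to_pdf "$HOME/Notes/My Note.md", this would write "$HOME/Notes/My Note.pdf", provided pandoc and TeX are installed at the paths shown.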
I’ve uploaded my KM macro, if anyone would like to play with it. You can make it work from Finder or another app, but you need to install pandoc and TeX. Markdown to PDF.kmmacros.zip (1.6 KB)
So, I’d appreciate it if anyone could help me make this shell script work with the AppleScript we have for converting to PDF in DEVONthink.
I would like to incorporate the shell script above into the current DEVONthink AppleScript that converts files to PDF (it is in the Convert folder of the DTPO scripts).
I know how to make it work with Finder, but I need it to work with DTPO directly.
To use pandoc, wouldn’t you have to call Terminal?
Maybe you can work with something along these lines:
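tell application id "DNtp"
-- the selection is a list of records, so take the first one
set theSelection to the selection
set theItem to item 1 of theSelection
set itemPath to path of theItem
set itemName to name of theItem
end tell
tell application "Terminal"
-- quoted form of protects spaces and special characters in the paths
set currentTab to do script ("pandoc --wrap=preserve -s " & quoted form of itemPath & " -o ~/Desktop/" & quoted form of (itemName & ".pdf"))
delay 5
do script ("exit") in currentTab
end tell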
Using shell script / curl you could make use of e.g. Docverter, somewhere along these lines:
(https://docverter.com/api/)
do shell script "curl -F from=markdown -F to=pdf -F " & quoted form of ("input_files[]=@" & itemPath) & " http://c.docverter.com/convert --output " & quoted form of (itemName & ".pdf") & " && open " & quoted form of (itemName & ".pdf")
DTPO has no idea what to do with the received PDF, though. I don’t know how to fetch the file and place it inside the current group. Perhaps you can get the location of the file and use it to construct a path. I’m too unfamiliar with the architecture, though, sorry.
set itemLocation to location of theItem
set parentPath to id of current group
None of this is working code, merely some ideas.
Edit:
Using this thread (“how to get the path to a group”), I reckoned the Terminal approach might work, but in the end I’m prompted with some Unicode problem.
Finally I did this!
So, if anyone has all this pandoc stuff set up, with its extremely flexible, fine-tunable and beautiful-looking PDF output, you may use this script (thanks to Christian Grunenberg for the original script):
-- Convert Markdown documents to Pandoc PDFs (using XeTeX)
-- Created by Christian Grunenberg on Mon Dec 01 2008.
-- Copyright (c) 2008-2011. All rights reserved.
-- Slightly changed by Silverstone on March 18 2019,
-- All copyrights go to great DEVONtech Team ;)
tell application id "com.devon-technologies.thinkpro2"
try
set theSelection to the selection
if theSelection is not {} then
show progress indicator "Converting..." steps (count of theSelection)
set theWindow to missing value
repeat with theRecord in theSelection
set theName to (name of theRecord) as string
step progress indicator theName
if cancelled progress then exit repeat
set theType to type of theRecord
if theType is not group and theType is not feed and theType is not smart group then
if theWindow is missing value then
set theWindow to think window of (open tab for record theRecord)
else
set record of theWindow to theRecord
end if
repeat while loading of theWindow
delay 0.5
end repeat
set Path_to_MD to path of theRecord
-- Setup Your Temporary Folder Here:
set theOutput to "/Users/ilya/Documents/00_Temp/" & theName & ".pdf"
-- Construct your personal command line options here:
do shell script "export PATH=/Library/TeX/texbin:$PATH && /usr/local/bin/pandoc " & Path_to_MD & " -s -o " & theOutput & " --pdf-engine=xelatex --toc"
try
set theParents to parents of theRecord
set thePDF to import theOutput to (item 1 of theParents) name theName
repeat with i from 2 to (count of theParents)
replicate record thePDF to (item i of theParents)
end repeat
set URL of thePDF to URL of theRecord
set creation date of thePDF to creation date of theRecord
set modification date of thePDF to modification date of theRecord
set comment of thePDF to comment of theRecord
set label of thePDF to label of theRecord
end try
tell application "Finder" to delete theOutput as POSIX file
end if
end repeat
if theWindow is not missing value then close theWindow saving no
hide progress indicator
end if
on error error_message number error_number
hide progress indicator
if the error_number is not -128 then display alert "DEVONthink Pro" message error_message as warning
end try
end tell
Just a few words about tuning:
You need to set up your own temporary folder (the place is marked in the script), which is used for creating PDFs during conversion. They are deleted after the import. I just don’t know of other, maybe shorter or more effective, ways to do it.
You need to set up the Pandoc converter and create the PDF templates you like (unlimited possibilities: text, titles, graphics, table of contents inside the PDF and as an outline, notes, math, bibliography, pagination and all that pro typography stuff). All links and cross-links are fully preserved. Along with templates, you may use your favourite Pandoc command-line options (the place is marked in the script).
The script reproduces all replicants (if any) of the source MD, as well as the other important metadata (creation and modification dates, label, URL and comment).
Create fully functional and professional-looking PDFs from your working MDs, right in your groups, in one click. A good solution for those who rely heavily in their workflow on the nice ability of DTPO and DTTG to clip a web page as clutter-free Markdown.
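On the temporary folder: one possible alternative to a hard-coded path is to let the shell create a scratch directory with mktemp. This is only a sketch. make_scratch_dir is a hypothetical helper name, and the pandoc-pdf prefix is an arbitrary choice.

```shell
# Hypothetical helper: create a private scratch directory and print its path.
# mktemp -d replaces the XXXXXX suffix with random characters.
make_scratch_dir() {
    mktemp -d "${TMPDIR:-/tmp}/pandoc-pdf.XXXXXX"
}
```

Since do shell script returns whatever the command prints, the AppleScript side could capture that path, write the PDF into it, and remove the directory after the import. The script would then no longer need a per-user folder.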
Happy experimenting!
PS @cgrunenberg, could you please take a look and say whether all is good with this script? I tested it and it works fine and is stable. Just in case I didn’t take into account some deep DEVONthink matters.
Loading the document in a window is actually unnecessary in this case, as the rendered document isn’t used. In addition, the bundle identifier shouldn’t be used to script DEVONthink, so that scripts stay compatible with future versions/editions. Here’s a revised script:
-- Convert Markdown documents to Pandoc PDFs (using XeTeX)
-- Created by Christian Grunenberg on Mon Dec 01 2008.
-- Copyright (c) 2008-2011. All rights reserved.
-- Slightly changed by Silverstone on March 18 2019,
-- All copyrights go to great DEVONtech Team ;)
tell application id "DNtp"
try
set theSelection to the selection
if theSelection is not {} then
show progress indicator "Converting..." steps (count of theSelection)
repeat with theRecord in theSelection
set theName to (name of theRecord) as string
step progress indicator theName
if cancelled progress then exit repeat
set theType to type of theRecord
if theType is not group and theType is not feed and theType is not smart group then
set Path_to_MD to path of theRecord
-- Setup Your Temporary Folder Here:
set theOutput to "/Users/ilya/Documents/00_Temp/" & theName & ".pdf"
-- Construct your personal command line options here:
do shell script "export PATH=/Library/TeX/texbin:$PATH && /usr/local/bin/pandoc " & Path_to_MD & " -s -o " & theOutput & " --pdf-engine=xelatex --toc"
try
set theParents to parents of theRecord
set thePDF to import theOutput to (item 1 of theParents) name theName
repeat with i from 2 to (count of theParents)
replicate record thePDF to (item i of theParents)
end repeat
set URL of thePDF to URL of theRecord
set creation date of thePDF to creation date of theRecord
set modification date of thePDF to modification date of theRecord
set comment of thePDF to comment of theRecord
set label of thePDF to label of theRecord
end try
tell application "Finder" to delete theOutput as POSIX file
end if
end repeat
hide progress indicator
end if
on error error_message number error_number
hide progress indicator
if the error_number is not -128 then display alert "DEVONthink Pro" message error_message as warning
end try
end tell
A little update: the script can now handle long filenames with almost any symbols and spaces in them (I added quoting in the shell script).
New version is here:
-- Convert Markdown documents to Pandoc PDFs (using XeTeX)
-- Created by Christian Grunenberg on Mon Dec 01 2008.
-- Copyright (c) 2008-2011. All rights reserved.
-- Slightly changed by Silverstone on March 18 2019,
-- All copyrights go to great DEVONtech Team ;)
tell application id "DNtp"
try
set theSelection to the selection
if theSelection is not {} then
show progress indicator "Converting..." steps (count of theSelection)
repeat with theRecord in theSelection
set theName to (name of theRecord) as string
step progress indicator theName
if cancelled progress then exit repeat
set theType to type of theRecord
if theType is not group and theType is not feed and theType is not smart group then
set Path_to_MD to path of theRecord
-- Setup Your Temporary Folder Here:
set theOutput to "/Users/ilya/Documents/00_Temp/" & theName & ".pdf"
-- Construct your personal command line options here:
do shell script "export PATH=/Library/TeX/texbin:$PATH && /usr/local/bin/pandoc \"" & Path_to_MD & "\" -s -o \"" & theOutput & "\" --pdf-engine=xelatex --toc"
try
set theParents to parents of theRecord
set thePDF to import theOutput to (item 1 of theParents) name theName
repeat with i from 2 to (count of theParents)
replicate record thePDF to (item i of theParents)
end repeat
set URL of thePDF to URL of theRecord
set creation date of thePDF to creation date of theRecord
set modification date of thePDF to modification date of theRecord
set comment of thePDF to comment of theRecord
set label of thePDF to label of theRecord
end try
tell application "Finder" to delete theOutput as POSIX file
end if
end repeat
hide progress indicator
end if
on error error_message number error_number
hide progress indicator
if the error_number is not -128 then display alert "DEVONthink Pro" message error_message as warning
end try
end tell
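A side note on the quoting in the version above: wrapping a path in escaped double quotes copes with spaces, but the shell still interprets dollar signs, backticks, double quotes and backslashes inside double quotes, so a record name containing those could still break the command. AppleScript's built-in quoted form of avoids this by producing a single-quoted string. The sketch below emulates that escaping in plain shell so the effect can be tried in Terminal; shell_quote is a hypothetical helper name, not part of pandoc or DEVONthink.

```shell
# Hypothetical helper: wrap a string in single quotes, turning each embedded
# single quote into '\'' so the shell reads the result back as one literal word.
shell_quote() {
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}
```

In the AppleScript itself the equivalent would presumably be to concatenate quoted form of Path_to_MD and quoted form of theOutput instead of adding the escaped double quotes by hand.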
And here are some random PDFs, which are made using this script, after clipping web pages in a clutter-free markdown (without any formatting):
For those who don’t know how/want to use Pandoc (which is great) and would feel more comfortable with a GUI option, I definitely recommend Marked 2 by Brett Terpstra. There are a ton of features including custom CSS files and exporting to various outputs. I’ve used it in tandem with most of my multimarkdown editors for years and it works great with Devonthink via the “Open with” menu item.
If you include your own CSS in Devonthink MD files, then you can simply Print to PDF using the built-in Mac dialog.
A belated thank-you for this, @Silverstone and @cgrunenberg. I appreciate DT3’s built-in document type conversions, but the flexibility of pandoc is also welcome.
I do a lot of writing in Markdown but I have to send Word files to other people, typically with consistent styles. pandoc’s ability to copy Word styles from a reference document is very useful here.
If anyone happens to be playing around with reference documents, I have a query. pandoc’s --data-dir option works for me (it looks for a file named ‘reference.docx’ in a specified folder). A more flexible option is --reference-doc=, where you specify a filename and so you could choose between different sets of styles. The latter option works when I run it from the terminal, but within this script I get an error message about the reference file not being UTF-8. The shell script seems to find the reference file OK, so I don’t think it’s an issue with escaping, POSIX path or whatever. Suggestions appreciated.
PS: I find the escaped pandoc string hard to decipher, especially when you start adding more arguments! A tiny tweak is to build the string first: set myPandoc to "export PATH=..., and then: do shell script myPandoc. The AppleScript variable myPandoc is a bit easier to troubleshoot.
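The same build-it-first idea carries over to plain shell when testing the command in Terminal: assemble the argument list, print it one argument per line, and only then run it. A minimal sketch, assuming a Markdown-to-Word conversion with a reference document; pandoc_docx_args and its three parameters are hypothetical, while --reference-doc is the pandoc option mentioned above.

```shell
# Hypothetical helper: $1 = Markdown input, $2 = Word output, $3 = reference .docx.
# It only prints the argument list; nothing is executed.
pandoc_docx_args() {
    set -- /usr/local/bin/pandoc "$1" -s -o "$2" --reference-doc="$3"
    for arg in "$@"; do
        # Brackets make stray spaces or quotes inside an argument visible.
        printf '[%s]\n' "$arg"
    done
}
```

Each bracketed line is exactly one argument as pandoc would receive it, which makes it easier to see whether the reference file path arrives intact.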
BTW Typora also uses pandoc to export markdown files to various formats, including Word and PDF. So if you have this as your default Markdown editor, double-click the file and export.
If I understand you correctly, you want to massage the Markdown file before passing it to Pandoc so that it contains file references instead of DT3 references?
That might be possible by modifying the plaintext part of the record, finding all the x-devonthink-item references, replacing them by the path in the record the URL points to and then passing the modified text onto pandoc.
Personally, I’d not want to do that in AppleScript, because its string processing sucks. Oh, and while you’re at it: I’m not sure that spaces et al work ok in a filename like the one you mentioned. So maybe you’ll have to URL encode the path while you’re at it.
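As a rough illustration of that idea outside AppleScript, here is a shell sketch that percent-encodes a path and swaps one item link for a file URL. Both function names are hypothetical, the UUID and path are assumed to have been looked up in DEVONthink beforehand, and whether pandoc wants a file:// URL or a plain path here is an assumption. The encoder has only been reasoned through for ASCII input.

```shell
# Percent-encode everything except unreserved characters and "/".
# Runs in a subshell so the LC_ALL change does not leak to the caller.
urlencode() (
    LC_ALL=C
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # string without its first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [A-Za-z0-9._~/-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
)

# Filter Markdown on stdin: $1 = item UUID, $2 = filesystem path of that item.
# The encoded path cannot contain | or &, so it is safe inside the sed replacement.
rewrite_item_link() {
    sed "s|x-devonthink-item://$1|file://$(urlencode "$2")|g"
}
```

The modified text could then be piped straight into pandoc, with one rewrite_item_link call per referenced item.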
I am trying to re-implement the script to convert from docx to Markdown. Although @Bernardo_V’s script is great, it is way too sophisticated for my needs, as I just need this one conversion and want it to happen automatically on a folder via a smart rule in DT3.
So far I was not successful.
I adjusted the following part:
set theOutput to "/Users/USER/Downloads/" & theName & ".md"
do shell script "/usr/local/bin/pandoc --wrap=none --extract-media=images" & "Path_to_Docx" & "\" -o \"" & theOutput
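For comparison, here is a sketch of the plain shell command that the AppleScript string above presumably needs to expand to. Two things stand out in the attempt: "Path_to_Docx" sits inside quotes, so the literal text rather than the variable's value is concatenated, and there is no space or opening quote after --extract-media=images. The name docx_to_md and the PANDOC override are hypothetical additions for illustration; the pandoc flags are the ones from the post.

```shell
# Hypothetical wrapper: $1 = input .docx, $2 = output .md.
# Setting PANDOC=echo gives a dry run that just prints the arguments.
docx_to_md() {
    "${PANDOC:-/usr/local/bin/pandoc}" --wrap=none --extract-media=images "$1" -o "$2"
}
```

Testing the command this way in Terminal first, and only then translating it into the do shell script string, may make it easier to see where the quoting goes wrong.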