Script: sync Zotero to DTPro

Hi all, I’ve made a script which takes the JSON output from the BetterBibTeX plugin for Zotero, and uses it to build a set of groups inside a Library folder. Each group corresponds to one Zotero reference, and contains at least a summary file generated from a template. If Zotero knows about any attachments, the script indexes the attachment, and if there are any bookmarks, the script creates a DT bookmark.

It’s quite slow to run but seems to trawl through things.

The script should notice changes to existing items, as it keeps a dictionary to map from Zotero item IDs to DT UUIDs for subsequent runs. Originally I had that in a filesystem file, but I now keep it in a DT item (JSON serialised into a plain text file from an NSMutableDictionary).
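To make the idea concrete, here is a rough sketch of that ID-to-UUID map, written in JavaScript rather than AppleScript purely for brevity. The function names (`loadMap`, `rememberGroup`, `saveMap`) are made up for illustration, and the DEVONthink side (reading and writing the plain text record) is left out.

```javascript
// Sketch only: loadMap, rememberGroup and saveMap are hypothetical helpers.
// The map is a plain object: { "<zotero item ID>": "<DEVONthink UUID>", ... }

// Parse the stored text; fall back to an empty map if the record is
// empty or does not contain a JSON object (e.g. on the very first run).
function loadMap(text) {
  try {
    const parsed = JSON.parse(text);
    const isPlainObject =
      parsed !== null && typeof parsed === "object" && !Array.isArray(parsed);
    return isPlainObject ? parsed : {};
  } catch (e) {
    return {};
  }
}

// Record which group belongs to a Zotero item; keys are always strings.
function rememberGroup(map, zoteroID, uuid) {
  map[String(zoteroID)] = uuid;
  return map;
}

// Serialise the map as pretty-printed JSON, ready to store as plain text.
function saveMap(map) {
  return JSON.stringify(map, null, 2);
}
```

The fallback to an empty map matters on the first run, when the stored text is still empty and would not parse as JSON.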

This relies on some custom metadata fields in DTPro. I was also going to do some clever things with dates but got bored trying to mess around with dates in AppleScript. I was also originally going to check the modified date against the previous run of the script and only examine Zotero items with a later dateModified, but ran out of interest.
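For what it’s worth, the modified-date check I never got round to might look roughly like this. It is a sketch in JavaScript rather than AppleScript, and it assumes that each item’s `dateModified` is a string `Date.parse` understands (such as ISO 8601) and that the time of the previous run has been stored somewhere. `itemsChangedSince` is a made-up name.

```javascript
// Sketch only: assumes each item has a dateModified string that Date.parse
// understands (e.g. ISO 8601). itemsChangedSince is a hypothetical helper.
function itemsChangedSince(items, lastRunISO) {
  const lastRun = Date.parse(lastRunISO);
  // With no usable previous-run timestamp, process everything.
  if (Number.isNaN(lastRun)) return items.slice();
  return items.filter((item) => {
    const modified = Date.parse(item.dateModified);
    // Keep items whose date cannot be parsed, so nothing is silently skipped.
    return Number.isNaN(modified) || modified > lastRun;
  });
}
```

Erring on the side of reprocessing an item seems safer than missing a change, which is why unparseable dates are kept.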

The input file needs to be in BetterBibTeX JSON format, and also needs to be from a pre-released version of BetterBibTeX.

use AppleScript version "2.4"
use scripting additions
use framework "Foundation"

-- all the settings should be here
set bibJSONFile to "/Users/lyndon/repo/workflow/zot-export-bbt.json"
--set thePListFile to "/Users/lyndon/repo/workflow/zotero-to-devonthink.plist"
set theTemplateFile to "/Users/lyndon/Library/Application Support/DEVONthink 3/Templates.noindex/Education/Reference"
set theDTDBFile to "/Users/lyndon/DevonThink/Research.dtBase2"
set theDTLocation to "/Library"

set theModName to "_mod_datetime_zotero.txt"
set theDictionaryName to "_dictionary_zoteroid_uuid.txt"

set ca to current application
property NSJSONSerialization : a reference to current application's NSJSONSerialization
property NSJSONWritingPrettyPrinted : a reference to 1
property NSData : a reference to current application's NSData
property NSString : a reference to current application's NSString

-- a helper method to replace any substring with a space
on remove:remove_string fromString:source_string
	set s_String to NSString's stringWithString:source_string
	set r_String to NSString's stringWithString:remove_string
	return s_String's stringByReplacingOccurrencesOfString:r_String withString:" "
end remove:fromString:

set theJSONData to NSData's dataWithContentsOfFile:(bibJSONFile)
set theJSON to NSJSONSerialization's JSONObjectWithData:theJSONData options:0 |error|:(missing value)
set theCurrentTime to current date

set bibjson to theJSON as record
set therefs to |items| of bibjson

set mainDict to missing value
set theDictRecord to missing value

-- load up the previous data
tell application id "DNtp"
	set theDatabase to open database theDTDBFile
	set theLocation to create location theDTLocation in theDatabase
	-- set mainDict to current application's NSMutableDictionary's dictionaryWithContentsOfFile:thePListFile
	if mainDict is missing value then
		set mainDict to current application's NSMutableDictionary's new()
	end if
	tell theLocation
		set theDictPath to theDTLocation & "/" & theDictionaryName
		set theDictRecord to get record at theDictPath
		if theDictRecord is missing value then
			set theDictRecord to create record with {name:theDictionaryName, type:"txt"} in theLocation
		end if
		if theDictRecord is not missing value then
			set mainDictContents to plain text of theDictRecord
			set mdStr to (ca's NSString's stringWithString:mainDictContents)
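			set mdDataDict to item 1 of (ca's NSJSONSerialization's JSONObjectWithData:(mdStr's dataUsingEncoding:(ca's NSUTF8StringEncoding)) options:0 |error|:(reference))
			-- a newly created (empty) record does not parse as JSON, so only load the dictionary when parsing succeeded
			if mdDataDict is not missing value then mainDict's setDictionary:mdDataDict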
		end if
	end tell
	--	set theChildGroups to {}
	--	try
	--		set theChildGroups to children of theLocation
	--	end try
	--	repeat with theChild in theChildGroups
	--		if type of theChild is group then
	--			set theZoteroIDString to get custom meta data for "zoteroid" from theChild
	--			set theUUID to uuid of theChild
	--			(mainDict's setObject:theUUID forKey:theZoteroIDString)
	--		end if
	--	end repeat
end tell

set limitCounter to 0
repeat with theRef in therefs
	set limitCounter to limitCounter + 1
	--	if (limitCounter > 20) then exit repeat
	set {theKey, theTitle, theURI, theZoteroID, theDateModified} to {citationKey, title, uri, itemID, dateModified} of theRef
	-- optional fields: each lookup is wrapped in a try so a missing key keeps the default
	set theSelectURI to ""
	try
		set theSelectURI to |select| of theRef
	end try
	try
		set theTitle to |shortTitle| of theRef
	end try
	set theZoteroIDString to theZoteroID as string
	set theDOI to ""
	try
		set theDOI to DOI of theRef
	end try
	set theAbstract to ""
	try
		set theAbstract to abstractNote of theRef
	end try
	set theDate to ""
	try
		set theDate to |date| of theRef
		set theReferenceYear to theDate
	end try
	set theURL to ""
	try
		set theURL to |url| of theRef
	end try
	set theCreators to ""
	set multipleCreators to ""
	try
		repeat with theCreator in creators of theRef
			-- skip creators that lack a first name, last name or creator type
			try
				set theCreators to theCreators & multipleCreators & |firstName| of theCreator & " " & |lastName| of theCreator & " (" & |creatorType| of theCreator & ")"
				set multipleCreators to " and "
			end try
		end repeat
	end try
	set theTags to {}
	try
		repeat with theTagItem in tags of theRef
			set theTags to theTags & tag of theTagItem
		end repeat
	end try
	-- construct group name and look for a previous group in the dictionary	
	set theGroupFile to theKey & " " & theTitle
	set theGroupFile to (my remove:"/" fromString:theGroupFile)
	set theUUID to missing value
	set theUUID to (mainDict's objectForKey:theZoteroIDString)
	-- create or update the group
	if theUUID is missing value then
		-- no previous group for this Zotero item: create one
		tell application id "DNtp"
			set theGroup to create location theDTLocation & "/" & theGroupFile
			set theUUID to uuid of theGroup
		end tell
	else
		-- group seen on an earlier run: find it by UUID and rename it if needed
		set theUUID to theUUID as string
		tell application id "DNtp"
			set theGroup to get record with uuid theUUID
			if theGroup is missing value then set theGroup to create location theDTLocation & "/" & theGroupFile
			set the name of theGroup to ("" & theGroupFile)
			set theUUID to uuid of theGroup
		end tell
	end if
	-- create or update a summary file
	set theSummaryName to ("___" & theGroupFile & ".md") as text
	set theSummaryDate to theDate
	set theSummaryPlaceholders to {|%reference%|:theTitle, |%authors%|:theCreators, |%date%|:theSummaryDate, |%citation%|:theKey, |%doi%|:theDOI, |%abstract%|:theAbstract, |%zoteroselect%|:theSelectURI}
	tell application id "DNtp"
		set theTempRecord to import theTemplateFile placeholders theSummaryPlaceholders to theGroup
		set thePrevSummaryName to get custom meta data for "referencesummaryfile" from theGroup
		if thePrevSummaryName is missing value or thePrevSummaryName = "" then
			set theSummaryRecord to get record at (the location of theTempRecord) & theSummaryName
		else
			set theSummaryRecord to get record at (the location of theTempRecord) & thePrevSummaryName
		end if
		if theSummaryRecord is missing value then
			-- no existing summary: the freshly imported template becomes the summary
			set the name of theTempRecord to theSummaryName
			set theSummaryRecord to theTempRecord
		else
			-- existing summary: copy the new content across and discard the temporary record
			set theTempContent to the plain text of theTempRecord
			set the plain text of theSummaryRecord to theTempContent
			delete record theTempRecord
			set the name of theSummaryRecord to theSummaryName
		end if
	end tell
	(mainDict's setObject:theUUID forKey:theZoteroIDString)
	tell application id "DNtp"
		tell theGroup
			set aliases to theKey
			set tags to theTags
			set URL to theSelectURI
			set custom meta data to {referencesummaryfile:theSummaryName, DOI:theDOI, abstract:theAbstract, citekey:theKey, zoteroid:theZoteroID}
		end tell
	end tell
	-- add attachments and bookmarks to the group
	if theURL ≠ "" then
		tell application id "DNtp"
			set theBookmarkRecord to lookup records with URL theURL
			if theBookmarkRecord is missing value or (count of theBookmarkRecord) is less than 1 then
				create record with {name:theURL, type:bookmark, URL:theURL} in theGroup
			end if
		end tell
	end if
	set theAttachmentPath to ""
	set attachmentList to {}
	try
		set attachmentList to attachments of theRef
	end try
	repeat with theAttachment in attachmentList
		set theAttachmentPath to ""
		set theAttachmentURI to ""
		set theAttachmentURL to ""
		set theAttachmentLinkMode to ""
		try
			set theAttachmentPath to |path| of theAttachment
		end try
		try
			set theAttachmentURI to uri of theAttachment
		end try
		try
			set theAttachmentURL to |url| of theAttachment
		end try
		try
			set theAttachmentLinkMode to linkMode of theAttachment
		end try
		if theAttachmentPath ≠ "" then
			tell application id "DNtp"
				tell theGroup
					set theAttachmentRecord to lookup records with path theAttachmentPath
					if theAttachmentRecord is missing value or (count of theAttachmentRecord) is less than 1 then
						-- not in the database yet: index the file into the group
						set theAttachmentRecord to indicate theAttachmentPath to theGroup
					else
						-- lookup returns a list; use the first match
						set theAttachmentRecord to item 1 of theAttachmentRecord
					end if
					if theAttachmentRecord is not missing value then
						set custom meta data of theAttachmentRecord to {DOI:theDOI, abstract:theAbstract, citekey:theKey, zoteroid:theZoteroID}
					end if
				end tell
			end tell
		else if theAttachmentLinkMode = "linked_url" then
			tell application id "DNtp"
				set theBookmarkRecord to lookup records with URL theAttachmentURL
				if theBookmarkRecord is missing value or (count of theBookmarkRecord) is less than 1 then
					create record with {name:theAttachmentURL, type:bookmark, URL:theAttachmentURL} in theGroup
				end if
			end tell
		end if
	end repeat
end repeat

set theMainDictJSONData to (NSJSONSerialization's dataWithJSONObject:mainDict options:NSJSONWritingPrettyPrinted |error|:(missing value))
set theMainDictJSONString to (ca's NSString's alloc()'s initWithData:theMainDictJSONData encoding:(ca's NSUTF8StringEncoding))
set theMainDictJSONStringAS to (theMainDictJSONString as text)
tell application id "DNtp"
	tell theGroup
		set plain text of theDictRecord to theMainDictJSONStringAS
		--	(mainDict's writeToFile:thePListFile atomically:true)
	end tell
end tell

BTW this is a very very slow script. The slowness is in the main repeat loop iterating over the references. It takes several seconds for each item.

If I run it with a smaller JSON file (e.g. with 100 references), the iteration is much quicker.

I guess this means that the AppleScript data structures are very inefficient, but I can’t be bothered refactoring it into (I guess?) Foundation ones.

Very interesting - I will give this a try later this week

What do you mean by “pre-released version of BetterBibTeX”?

The plugin author made some changes so that I had access to enough fields for this script to work (e.g. guaranteed consistency of the Zotero item IDs, plus the Zotero select URLs). I’m using a test version of the plugin at the moment, but the changes will be released some time soon.

OK can you let us know when those changes are released please?

Sure, will do my best to remember. As I understand it, they’ll be in the next release.

If looping and JSON parsing is the bottleneck, you might get better results with JavaScript.
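As a rough illustration of the parsing half only, here is a minimal sketch. It assumes the export has the shape the AppleScript reads: a top-level `items` array whose entries carry `citationKey`, `title`, `itemID`, `creators`, `tags` and so on. `summariseItems` is a hypothetical name, and the DEVONthink calls are deliberately left out.

```javascript
// Sketch only: field names mirror those read by the AppleScript in this thread;
// summariseItems is a hypothetical helper, not part of any library.
function summariseItems(jsonText) {
  const bib = JSON.parse(jsonText);
  const items = Array.isArray(bib.items) ? bib.items : [];
  return items.map((ref) => {
    // Skip creators without first name, last name or type, as the script does.
    const creators = (ref.creators || [])
      .filter((c) => c.firstName && c.lastName && c.creatorType)
      .map((c) => `${c.firstName} ${c.lastName} (${c.creatorType})`)
      .join(" and ");
    // Prefer the short title when there is one.
    const title = ref.shortTitle || ref.title || "";
    return {
      zoteroID: String(ref.itemID),
      citekey: ref.citationKey,
      // Same naming rule as the script: citekey + title, "/" replaced by a space.
      groupName: `${ref.citationKey} ${title}`.split("/").join(" "),
      creators,
      tags: (ref.tags || []).map((t) => t.tag),
      doi: ref.DOI || "",
      abstract: ref.abstractNote || "",
      selectURI: ref.select || "",
    };
  });
}
```

Everything here works on plain arrays and objects, so the whole export is parsed once and each item is visited once. The automation calls would then only need to run per item on the already-extracted values.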

Yep I tried that - the problem was that without debugging, it’s sometimes quite tricky to figure out what’s happening with the automation side of things. I could probably port it across now from AppleScript (which I am beginning to detest) to JavaScript…

I’ve even wondered if I should try using Swift and the automation bridge.

BTW: huge thanks to @retorquere for the work on this - without your plugin Zotero wouldn’t be usable for me. Really appreciate it.

And also thanks to the DT team. Having now got 2,000 items from Zotero into DT with a sensible group structure, abstracts, and PDFs, the search functionality is immediately turning out to be very helpful.

OK I installed the required BBT plugin. I also tried to edit the first 4 settings, though I do not have a “repo” subfolder - am I supposed to create one, and am I supposed to download the .json file from Zotero myself?

I get this error - any help would be appreciated.

Well, you’ll need a folder somewhere for the .json file, which will come from a Zotero export. I have set my export up to automatically export on changes.

OK I see that now regarding the .json export

I think I am missing a template now - what is the format of the required Markdown template?

Here’s mine:

### %reference%


* Date: %date%
* Citekey: %citation%
* DOI: [%doi%](
* Zotero: [%zoteroselect](%zoteroselect)

### Abstract


(Although I see I made a mistake - the Zotero bit should be [%zoteroselect%](%zoteroselect%).)

Thanks - OK I have the md file but still I get this - ideas?

No idea, I’ve run out of Applescript knowledge at this point…

Any other dependencies or items to set it up you can think of?

Using Script Debugger it looks like there is an issue with the JSON file format - I know it is there and looks to be in reasonable JSON format.

Never mind… solved that… I was using the wrong BBT format

Looks like I am really close now - the script is running and seems to be creating items as below - but oddly every time it creates a new item the one before it disappears from view - though it is not in the trash so it appears the items are not actually created?

Plus even after I stop the script these changes in the destination Group keep going on?

That doesn’t sound like a script problem to me? Is it your view of the Group?

I know I am looking at the Group correctly - new items with correct titles fleetingly appear but then vanish without actually being created.

So close - but then they disappear.

Perhaps related - no .plist file is ever created either.