Get PDF annotations count with AppleScript?

I’m looking for a way to use, in an AppleScript, the count of PDF annotations that DEVONthink already maintains. I can’t see a record property for this count, however. Is it available, or do I have to count the annotations myself in the script?

Thanks!

Sorry but this is not a property of a record in AppleScript at this time.

@BLUEFROG Copy that. Count this as a feature request, then!

My use case: I’m trying to create automatic exports of what I’m calling “reading sessions.” Basically, if I read something and make some annotations, I want to extract just those new annotations to an exported summary.

My ultimate goal is automatically extracting annotation summaries (1) without overwriting any additions/edits made to previously-created summaries, and (2) without re-exporting the same annotations over and over again.

To aid my future self’s memory, and to see if anyone else has other suggestions, this is roughly what I was going to do:

  • Create a custom metadata field for “Previous annotation count.”
  • Create a smart rule that picks up files in a given group that have been modified in the last half-hour, but have not been modified in the last 15 minutes (the assumption is that if I do some highlighting, then leave the document alone for >15 mins, I have probably moved on to something else)
  • Compare the document’s new annotation count with the previous annotation count. If newAnnotationCount > previousAnnotationCount:
    • set addedAnnotationCount to (newAnnotationCount - previousAnnotationCount)
    • summarize highlights in an addedAnnotationsFile Markdown record
    • split the newly created summary record on \n> so that it yields an array with one item per annotation
    • take the last addedAnnotationCount annotations from that array and join them as text into addedAnnotations
    • rewrite the contents of addedAnnotationsFile with just addedAnnotations
    • rename addedAnnotationsFile to something like %DocumentName% Highlights from Reading Session %yyyyMMddhhmm%, where %DocumentName% and %yyyyMMddhhmm% are variables for the name of the PDF and a timestamp, respectively.
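
The split-and-slice steps in the list above can be tried in plain AppleScript without touching DEVONthink. This is only a rough sketch: the sample text, the linefeed-plus-"> " delimiter, and the handler names splitText and lastItems are assumptions for illustration, and a real summary may be formatted differently.

```applescript
-- Split theText on theDelimiter, restoring the previous delimiters afterwards.
on splitText(theText, theDelimiter)
	set prevTIDs to AppleScript's text item delimiters
	set AppleScript's text item delimiters to theDelimiter
	set theItems to every text item of theText
	set AppleScript's text item delimiters to prevTIDs
	return theItems
end splitText

-- Return the last n items of theList (hypothetical helper).
on lastItems(theList, n)
	if n ≤ 0 then return {}
	if n ≥ (count of theList) then return theList
	return items (-n) thru -1 of theList
end lastItems

-- Made-up stand-in for a summary record's text: a header plus three annotations.
set sampleSummary to "# Title" & linefeed & "> first" & linefeed & "> second" & linefeed & "> third"

-- The first text item is the header, so drop it before counting.
set allAnnotations to rest of (splitText(sampleSummary, linefeed & "> "))
set previousAnnotationCount to 1 -- assumed value from the custom metadata field
set addedAnnotationCount to (count of allAnnotations) - previousAnnotationCount
set addedAnnotations to lastItems(allAnnotations, addedAnnotationCount)
-- addedAnnotations is {"second", "third"}
```

Taking items from the end of the list is what ties this approach to annotating in reading order.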

The result of the above pseudocode would be that every time I sit down to highlight a reading, I automatically get a “reading session” annotation summary in a new file about 15 minutes after I put the reading away. This would make it easier to, e.g., create tasks in annotations and have them become something I can act on, or to create a new link blog post about the reading.

Thanks for sharing your thought process. It’s always interesting to see how people think about such things.

I can do better than a thought process: I got it working!

Edit: an important caveat: this assumes reading and annotating in sequential order, i.e., new annotations are only ever added after all existing ones in the PDF. It doesn’t check for new annotations added in between previously made annotations.
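
If out-of-order annotations ever matter, one option (not what the script below does) would be to compare annotation text instead of counts. A minimal sketch, assuming the text of previously exported annotations has been kept somewhere; newItems, seenAnnotations, and currentAnnotations are hypothetical names:

```applescript
-- Return the items of currentList whose text does not appear in seenList.
on newItems(currentList, seenList)
	set addedItems to {}
	repeat with anItem in currentList
		set itemText to contents of anItem
		if seenList does not contain itemText then set end of addedItems to itemText
	end repeat
	return addedItems
end newItems

-- Made-up sample data: two annotations were exported earlier, two are new.
set seenAnnotations to {"first highlight", "third highlight"}
set currentAnnotations to {"first highlight", "second highlight", "third highlight", "fourth highlight"}
set addedAnnotations to newItems(currentAnnotations, seenAnnotations)
-- addedAnnotations is {"second highlight", "fourth highlight"}
```

The trade-off is that identical highlight text appearing twice in a document would be treated as already seen.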

AppleScript:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

property databaseUUID : "D0CA3444-A862-4C99-9A20-3B93E6F24CA8" -- Switch this with the uuid of the database for your summary notes. (Not referenced below; the group is looked up by its own uuid.)
property summaryNotesGroupUUID : "E940D2EB-5B4A-4D29-B64F-E585AA756826" -- Switch this with the uuid of the group you'll use to store summary notes when they're first created.
property summaryNotesGroup : "Summary Notes" -- Placeholder only; replaced at run time by the group record found via summaryNotesGroupUUID.
property readingSessionNotePrefix : "∎ " -- This is a prefix I use to indicate summary notes. If you don't want to use a prefix, switch it to ""
property readingSessionNoteSuffix : " - Reading Session "

on performSmartRule(theRecords)
	set {year:yr, month:mn, day:dy, hours:hr, minutes:mins} to (current date)
	set dateandtimestamp to "20" & my pad(yr as integer) & my pad(mn as integer) & my pad(dy as integer) & my pad(hr as integer) & my pad(mins as integer) -- Got this from https://macscripter.net/viewtopic.php?id=44567 as a quick and dirty way of getting a Zk-style timestamp. It didn't include the "20" in "2022" so I prepended it manually. That'll become a problem in 87 years or so...
	set datestamp to "20" & my pad(yr as integer) & my pad(mn as integer) & my pad(dy as integer)
	set timestamp to my pad(hr as integer) & ":" & my pad(mins as integer)
	
	tell application id "DNtp"
		set summaryNotesGroup to get record with uuid summaryNotesGroupUUID
		
		repeat with eachRecord in theRecords
			set annotationNoteName to readingSessionNotePrefix & (eachRecord's (name without extension)) & readingSessionNoteSuffix & dateandtimestamp
			
			set highlightsSummary to summarize highlights of records eachRecord as list to markdown in summaryNotesGroup
			
			if highlightsSummary is not missing value then -- highlights were successfully summarized, now we have to clean the resulting syntax
				set highlightsSummaryText to plain text of highlightsSummary
				
				set highlightsArray to my splitText(highlightsSummaryText, ("
* "))
				
				set newAnnotationsCount to ((count of highlightsArray) - 1)
				set previousAnnotationCount to get custom meta data for "Previous annotation count" from eachRecord
				if previousAnnotationCount is missing value then
					set previousAnnotationCount to 0
				end if
				set numberOfNewAnnotations to newAnnotationsCount - previousAnnotationCount
				if (numberOfNewAnnotations ≠ 0) then
					set annotationIterator to 0
					set annotationFileOriginalHeader to the first item in highlightsArray
					set linesOfAnnotationFileHeader to my splitText(annotationFileOriginalHeader, "
")
					set annotationFileYAML to "---" & return & "annotation-status: new" & return & "---" & return & return
					set annotationFileHeader to annotationFileYAML & the first item in linesOfAnnotationFileHeader
					
					set newAnnotations to annotationFileHeader & return & "Reading session from [[" & datestamp & "]] at " & timestamp & return & return
					
					repeat with eachAnnotation in highlightsArray
						if annotationIterator > previousAnnotationCount then
							set newAnnotations to newAnnotations & "
* " & eachAnnotation
						end if
						set annotationIterator to annotationIterator + 1
					end repeat
					set newAnnotations to my replaceText(newAnnotations, "* {==", "- > ")
					set newAnnotations to my replaceText(newAnnotations, return & "* ", return & "- ")
					set newAnnotations to my replaceText(newAnnotations, "==}", "" & return)
					set newAnnotations to my replaceText(newAnnotations, "\\", "")
					set plain text of highlightsSummary to newAnnotations
					
					set name of highlightsSummary to annotationNoteName
					add custom meta data newAnnotationsCount for "Previous annotation count" to eachRecord -- same field name as used when reading the count above
				else
					-- no new annotations
					delete record highlightsSummary
				end if
				
			end if
		end repeat
	end tell
end performSmartRule

on pad(v) -- got this from https://macscripter.net/viewtopic.php?id=44567
	return text -2 thru -1 of ((v + 100) as text)
end pad

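on splitText(theText, theDelimiter)
	-- Save and restore the delimiters, as replaceText does, so callers aren't affected
	set prevTIDs to AppleScript's text item delimiters
	set AppleScript's text item delimiters to theDelimiter
	set theTextItems to every text item of theText
	set AppleScript's text item delimiters to prevTIDs
	return theTextItems
end splitText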

on replaceText(this_text, search_string, replacement_string)
	set prevTIDs to AppleScript's text item delimiters
	set AppleScript's text item delimiters to the search_string
	set the item_list to every text item of this_text
	set AppleScript's text item delimiters to the replacement_string
	set this_text to the item_list as string
	set AppleScript's text item delimiters to prevTIDs
	return this_text
end replaceText
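
A side note on the timestamp lines near the top of the script: the year can be coerced to text directly, which avoids prepending "20" by hand. A small sketch, assuming the same pad handler as above; zkTimestamp is a made-up name:

```applescript
on pad(v)
	-- Keep the last two digits, so 3 becomes "03" and 14 stays "14".
	return text -2 thru -1 of ((v + 100) as text)
end pad

on zkTimestamp(theDate)
	set {year:yr, month:mn, day:dy, hours:hr, minutes:mins} to theDate
	-- The year already has four digits, so it is used as-is.
	return (yr as text) & pad(mn as integer) & pad(dy) & pad(hr) & pad(mins)
end zkTimestamp

zkTimestamp(current date)
```

For 2 March 2022 at 14:27 this should give 202203021427, the same shape as the dateandtimestamp value in the script.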

Smart rule:
[Screenshot: Screen Shot 2022-03-02 at 2.25.32 PM]

A sample of the results:

---
annotation-status: new
---

# [A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research - Koo, Li - 2016.pdf](x-devonthink-item://8926222D-5BE7-4C63-8D0B-458A7A473173)
Reading session from [[20220302]] at 14:27

- > As a rule of thumb, researchers should try to obtain at least 30 heterogeneous samples and involve at least 3 raters whenever possible when conducting a reliability study. Under such conditions, we suggest that ICC values less than 0.5 are indicative of poor reliability, values between 0.5 and 0.75 indicate moderate reliability, values between 0.75 and 0.9 indicate good reliability, and values greater than 0.90 indicate excellent reliability.
- [ ] Integrate this rule into the granularity paper

Very nice work 🙂

The next release will add an annotation count property.

Since I built the hacky workaround above, I’m 33% mad, 166% grateful.

And I lost a percentage point somewhere, so let me know if you find it!
