This script creates TSV records from the “Summarize Highlights” output of PDFs for import into Tinderbox.
It works with the Markdown output that “Summarize Highlights” produces for one or more selected PDFs.
What’s included:
- Source PDF name
- Annotation
- Annotation details
- Page
- Page item link
Usage:
- Select one or more PDF(s).
- Choose menu Tools > Summarize Highlights > as Markdown
- Select the resulting Markdown record(s)
- Run the script
- Drag the TSV(s) to the desktop
- Drag the TSV(s) from the desktop into Tinderbox
- In Tinderbox, apply stamp 1 to the imported notes
How it works:
Stamp 1
Tinderbox doesn’t support new lines inside a TSV field. To keep the original layout of an annotation’s details, the script replaces each new line with the placeholder " ---LINEFEED--- ". In Tinderbox this placeholder has to be turned back into a linefeed, and doubled double quotes have to be reduced to single double quotes. To make this easier, the script automatically copies the necessary stamp code to the clipboard.
To rebuild the layout, use stamp 1:
$Name=$Name.replace('\"\"','\"');
$SourcePDF=$SourcePDF.replace('\"\"','\"');
$Annotation=$Annotation.replace('\"\"','\"');
$Details=$Details.replace('\"\"','\"').replace(" ---LINEFEED--- ","\n");
$Text=$Text.replace('\"\"','\"').replace(" ---LINEFEED--- ","\n");
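For illustration, the encoding half of this round trip can be sketched in AppleScript. This is a minimal sketch, not the script's exact code: the handler name replaceText and the sample text are made up for the example, and it only shows the linefeed placeholder, not the quote handling.

```applescript
-- Hypothetical helper: replace every occurrence of oldString in theText
-- using text item delimiters (same technique as the script's replaceString).
on replaceText(theText, oldString, newString)
	set savedDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to oldString
	set theItems to every text item of theText
	set AppleScript's text item delimiters to newString
	set theResult to theItems as text
	set AppleScript's text item delimiters to savedDelimiters
	return theResult
end replaceText

-- Sample annotation details spanning two lines (made-up text)
set theDetails to "First line" & linefeed & "Second line"

-- Encode: each linefeed becomes the placeholder that stamp 1 later reverses
set encoded to replaceText(theDetails, linefeed, " ---LINEFEED--- ")
-- encoded is now "First line ---LINEFEED--- Second line"
```

Stamp 1 then performs the reverse replacement inside Tinderbox, so the note text ends up with real line breaks again.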
Stamp 2 (optional)
The script can create TSVs from the “Summarize Highlights” output of one or more selected PDFs. If you ran “Summarize Highlights” with more than one PDF selected, the annotations are all imported, but they are not grouped into a Tinderbox container the way they would be if only one PDF had been selected. In that case you may want to move each annotation into a container named after its source PDF.
To group annotation notes, use stamp 2:
$Container=$Container+$SourcePDF;
Prototype (optional)
In case you want to apply a Prototype to the imported annotation notes, note this passage from A Tinderbox Reference File, “Spreadsheet (Tab-Delimited Text) Import”:
“The import process sets the $KeyAttributes locally for all notes it creates. Thus if you apply a prototype using key attributes to the newly created notes, the Key Attributes will not show in the inheriting notes as the local values unless/until you reset the $KeyAttributes for these notes.”
To reset KeyAttributes use this stamp:
$KeyAttributes=;
If you have questions please don’t hesitate to ask.
-- Create "Summarize Highlights" TSV for import into Tinderbox
-- In DEVONthink: With one or more selected PDF(s): menu "Tools > Summarize Highlights > as Markdown"
-- Select resulting Markdown record(s)
-- Run script
-- Drag resulting TSV(s) into Finder, e.g. desktop
-- In Finder: Drag TSV(s) into Tinderbox
-- In Tinderbox: Use at least the first stamp (it's copied automatically)
(* Use the first stamp to rebuild double quotes and linefeeds:
$Name=$Name.replace('\"\"','\"');
$SourcePDF=$SourcePDF.replace('\"\"','\"');
$Annotation=$Annotation.replace('\"\"','\"');
$Details=$Details.replace('\"\"','\"').replace(" ---LINEFEED--- ","\n");
$Text=$Text.replace('\"\"','\"').replace(" ---LINEFEED--- ","\n");
Optionally: Use the second stamp to group annotation notes:
$Container=$Container+$SourcePDF;
*)
tell application id "DNtp"
	try
		set theRecords to selected records
		if theRecords = {} then error "Please select some \"Summarize Highlights\" Markdown records"
		set theOutputGroup to display group selector "Choose output group:"
		show progress indicator "Creating TSV... " steps (count theRecords) as string with cancel button
		repeat with thisRecord in theRecords
			set theType to (type of thisRecord) as string
			if theType is in {"markdown", "«constant ****mkdn»"} then
				step progress indicator "... " & (name of thisRecord) as string
				set theText to plain text of thisRecord
				set theParagraphs to paragraphs of theText
				set theParagraphs_count to (count theParagraphs)
				set the_record to {}
				set thisAnnotation_Details to {}
				repeat with i from 1 to theParagraphs_count
					set thisParagraph to (item i in theParagraphs) as string
					set thisParagraph to my trimEnd(thisParagraph)
					if thisParagraph begins with "# " then
						set thisPDF_UUID to characters ((offset of "](x-devonthink-item://" in thisParagraph) + 22) thru -2 in thisParagraph as string
						set thisPDF to (get record with uuid thisPDF_UUID)
						set thisPDF_Name to my recordName(name of thisPDF, filename of thisPDF)
						if thisPDF_Name contains "/" then set thisPDF_Name to my replaceString(thisPDF_Name, "/", "-")
						set finish to false
					else if thisParagraph begins with "## [Page " then
						set thisPageRefURL to (characters ((offset of "](" in thisParagraph) + 2) thru -2 in thisParagraph) as string
						set thisPage to ((((characters ((offset of "?page=" in thisParagraph) + 6) thru -2 in thisParagraph) as string) as integer) + 1) as string
						set finish to false
					else
						if thisParagraph begins with "* " then
							set thisAnnotation to (characters 3 thru -1 in thisParagraph) as string
							set finish to true
						else
							set end of thisAnnotation_Details to thisParagraph
							set finish to true
						end if
					end if
					if finish = true then
						if i > 3 and i < theParagraphs_count then
							set lastParagraph to (item (i - 1) in theParagraphs) as string
							if lastParagraph does not start with "# " then
								set nextParagraph to (item (i + 1) in theParagraphs) as string
								if nextParagraph begins with "* " or nextParagraph begins with "## [Page " or nextParagraph begins with "# " then
									set end of the_record to {annotation_:thisAnnotation, details_:my tid(thisAnnotation_Details, linefeed), sourcepdf_:thisPDF_Name, page_:thisPage, rurl_:thisPageRefURL}
									set thisAnnotation_Details to {}
								end if
							end if
						else if i = theParagraphs_count then
							set end of the_record to {annotation_:thisAnnotation, details_:my tid(thisAnnotation_Details, linefeed), sourcepdf_:thisPDF_Name, page_:thisPage, rurl_:thisPageRefURL}
							set thisAnnotation_Details to {}
						end if
					end if
				end repeat
				set thisRecord_Name to my recordName(name of thisRecord, filename of thisRecord)
				if thisRecord_Name contains "/" then set thisRecord_Name to my replaceString(thisRecord_Name, "/", "-")
				set theColumns to {"Name", "SourcePDF", "Annotation", "Details", "Pages", "URL", "Text"}
				set theTSVRecord to create record with {name:thisRecord_Name, type:sheet, columns:theColumns} in theOutputGroup
				set theData_Dummy to {"Delete me", "Source PDF", "Annotation", "Details", "0", "URL", "Text"}
				set theCells to cells of theTSVRecord
				set end of theCells to theData_Dummy
				set cells of theTSVRecord to theCells
				repeat with this_record in the_record
					set theAnnotation to annotation_ of this_record
					set theDetails to details_ of this_record
					set theDetails_trimmed to my trimBoth(theDetails)
					set theDetails_replaced to my replaceString(theDetails_trimmed, linefeed, " ---LINEFEED--- ")
					if theDetails_replaced = "" then
						set theText to theAnnotation & " ---LINEFEED--- "
					else
						set theText to theAnnotation & " ---LINEFEED--- " & " ---LINEFEED--- " & theDetails_replaced
					end if
					set theSourcePDF to sourcepdf_ of this_record
					set thePage to page_ of this_record
					set theRefURL to rurl_ of this_record
					set this_record_data to {theAnnotation, theSourcePDF, theAnnotation, theDetails_replaced, thePage, theRefURL, theText}
					set theCells to cells of theTSVRecord
					set end of theCells to this_record_data
					set cells of theTSVRecord to theCells
					set this_record_data to {}
				end repeat
			else
				error "Please select some \"Summarize Highlights\" Markdown records"
			end if
		end repeat
		open window for record theOutputGroup
		activate
		set the clipboard to ("$Name=$Name.replace('\\\"\\\"','\\\"');" & linefeed & linefeed & "$SourcePDF=$SourcePDF.replace('\\\"\\\"','\\\"');" & ¬
			linefeed & linefeed & "$Annotation=$Annotation.replace('\\\"\\\"','\\\"');" & linefeed & linefeed & "$Details=$Details.replace('\\\"\\\"','\\\"').replace(\" ---LINEFEED--- \",\"\\n\");" & linefeed & linefeed & "$Text=$Text.replace('\\\"\\\"','\\\"').replace(\" ---LINEFEED--- \",\"\\n\");") as string
		hide progress indicator
	on error error_message number error_number
		hide progress indicator
		if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
		return
	end try
end tell
-- Split text into a list, or join a list into text, using theDelimiter
on tid(theInput, theDelimiter)
	set d to AppleScript's text item delimiters
	set AppleScript's text item delimiters to theDelimiter
	if class of theInput = text then
		set theOutput to text items of theInput
	else if class of theInput = list then
		set theOutput to theInput as text
	end if
	set AppleScript's text item delimiters to d
	return theOutput
end tid
on trimStart(str)
	local str, whiteSpace
	try
		set str to str as string
		set whiteSpace to {character id 10, return, space, tab}
		try
			repeat while str's first character is in whiteSpace
				set str to str's text 2 thru -1
			end repeat
			return str
		on error number -1728
			return ""
		end try
	on error eMsg number eNum
		error "Can't trimStart: " & eMsg number eNum
	end try
end trimStart
on trimEnd(str)
	local str, whiteSpace
	try
		set str to str as string
		set whiteSpace to {character id 10, return, space, tab}
		try
			repeat while str's last character is in whiteSpace
				set str to str's text 1 thru -2
			end repeat
			return str
		on error number -1728
			return ""
		end try
	on error eMsg number eNum
		error "Can't trimEnd: " & eMsg number eNum
	end try
end trimEnd
on trimBoth(str)
	local str
	try
		return my trimStart(my trimEnd(str))
	on error eMsg number eNum
		error "Can't trimBoth: " & eMsg number eNum
	end try
end trimBoth
-- Case-sensitive replacement of every oldString in theText with newString
on replaceString(theText, oldString, newString)
	local ASTID, theText, oldString, newString, lst
	set ASTID to AppleScript's text item delimiters
	try
		considering case
			set AppleScript's text item delimiters to oldString
			set lst to every text item of theText
			set AppleScript's text item delimiters to newString
			set theText to lst as string
		end considering
		set AppleScript's text item delimiters to ASTID
		return theText
	on error eMsg number eNum
		set AppleScript's text item delimiters to ASTID
		error "Can't replaceString: " & eMsg number eNum
	end try
end replaceString
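-- Return the record name without its filename extension
on recordName(theName, theFilename)
	set theSuffix to my getSuffix(theFilename)
	-- Only strip when the name really ends with "." plus the suffix
	if theName ends with ("." & theSuffix) and theName ≠ theSuffix then set theName to characters 1 thru -((length of theSuffix) + 2) in theName as string
	return theName
end recordName
-- Return the text after the last "." in thePath (the whole text if there is no ".")
on getSuffix(thePath)
	set revPath to reverse of characters in thePath as string
	set theSuffix to reverse of characters 1 thru ((offset of "." in revPath) - 1) in revPath as string
	return theSuffix
end getSuffix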
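
To see how the main loop reads a page heading, the extraction can be tried in isolation in Script Editor. This is a minimal sketch under assumptions: the sample heading line and its UUID "ABC-123" are made up to match the pattern the script looks for ("## [Page " followed by a link containing "?page="), and the variable names are only for this example.

```applescript
-- Made-up sample of a page heading as the script expects it
set samplePara to "## [Page 3](x-devonthink-item://ABC-123?page=2)"

-- The link starts right after "](" and ends before the closing ")"
set urlStart to (offset of "](" in samplePara) + 2
set pageRefURL to text urlStart thru -2 of samplePara

-- The number after "?page=" is zero-based, so the script adds 1
set pageStart to (offset of "?page=" in samplePara) + 6
set pageNumber to ((text pageStart thru -2 of samplePara) as integer) + 1

-- pageRefURL is "x-devonthink-item://ABC-123?page=2", pageNumber is 3
```

The sketch uses `text … thru …` instead of `characters … as string`, which gives the same result here without depending on the current text item delimiters.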