A question re "Summarize Highlights"

I want to ask this question even though I am guessing the answer is probably a “not available in the near future.”

Would DT consider releasing the “Summarize Highlights” function as a standard script in the script menu, or adding a function to the DT library that returns the list of highlighted texts and their page/paragraph links?

We can do a lot of useful stuff by working on those highlighted texts if such a script function is available in the DT dictionary or if we can fork from a standard script.

Thank you in advance

We’ll consider this for future releases. But you could actually already perform a script in a smart rule that is triggered On Creation and uses the conditions…

  • Name ends with Summary
  • Kind is Rich Text

Thanks for the advice. But I don’t quite understand this, because there is no script in “Summarize Highlights” to be triggered?

After choosing Tools > Summarize Highlights the smart rule should be triggered and could perform any actions, including scripts. However, I just noticed that this doesn’t work: the menu item doesn’t trigger smart rules. 3.0.2 will fix this.

Therefore instead of executing a script to create a summary, the creation of a summary would execute a script.


The major obstacle is that I know of no way (by script or otherwise in DT) to get the highlighted texts block by block in one go.
The Summarize function produces a single RTF of all highlighted texts. To split it into blocks, I have to manually add a separator between each block (the link and the cited text) in that RTF file, and then use the “Split Document” function to produce individual files, each containing one block.

Anyway, this is not a pressing need. I just think it would be good to have the ability to get all blocks, in separate files, in one go.

You could parse the attribute runs of the text. See e.g. Convert text links to Devonthink links for an example.

Thanks for the tip! I will try experimenting with it in my next pet project…

I have a similar suggestion in my recent post which would help DT users working with individual blocks of text (see Improvements to Summarize Highlights).

The two key suggestions are i) a way to distinguish individual highlights and associated notes (I’ve suggested a double blank line) and ii) the inclusion of the page link for each text block, rather than only where the page number changes.

My understanding is that attribute runs in AppleScript parse the text stream into chunks of text with the same properties (e.g. highlighted, URLs etc.), which can then be used to reshape the file. What would happen with two paragraphs of text (say highlighted text from a PDF) back to back within the summary file? Are these treated as one chunk or two?

If you only enumerate the attribute runs, then it’s one block. But you could enumerate the paragraphs first and within each paragraph the attribute runs.
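
A rough sketch of that two-level enumeration, assuming the summary is open in the frontmost think window. It treats every non-empty paragraph as one block and collects the text of each attribute run inside it. The variable names are made up for illustration, and I have not checked this against every kind of summary, so treat it as a starting point rather than a finished script.

```applescript
-- Sketch only: one inner list per non-empty paragraph,
-- holding the text of each attribute run in that paragraph.
tell application id "DNtp"
	set theBlocks to {}
	tell text of think window 1
		repeat with p from 1 to (count of paragraphs)
			set paraText to (get paragraph p)
			-- Skip empty paragraphs (blank lines between blocks)
			if paraText is not in {"", linefeed, return} then
				set runTexts to {}
				-- Enumerate the attribute runs of this paragraph only
				repeat with r from 1 to (count of attribute runs of paragraph p)
					set end of runTexts to (get attribute run r of paragraph p)
				end repeat
				set end of theBlocks to runTexts
			end if
		end repeat
	end tell
end tell
return theBlocks
```

Two highlighted paragraphs that sit back to back would end up as two entries in theBlocks, because the outer loop splits on paragraphs before the attribute runs are read.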

I see the corresponding paragraph AppleScript class in the dictionary. I’m assuming that the property

paragraph spacing (real) : Paragraph spacing of the text.

can be used to delimit the separations between blocks of text. Adding two spaces or some other distinction between blocks would help here when processing the summary files, as suggested in Improvements to Summarize Highlights.

It would be interesting to see a script version of Summarize Highlights. I’d love to be able to rearrange some of the elements (e.g. move the line numbers to the end of the highlighted text, or remove the “line” prefix).

For PDF documents such a script would require third-party apps; neither DEVONthink nor Preview nor Automator supports this.

How about for RTF?

I agree that custom formatting is an important feature.

As a stopgap, I threw together a script that converts summaries to my particular format. Editing the outputHighlight subroutine should allow you to customise the format.

As a note, the script supports joining highlights across pages: create a highlight note with the text “JOIN”.

Hope this script starts the conversation towards adding a feature to the software somehow.

Jason Virtue

Convert Summary.pdf (50.8 KB)

The attribute runs of RTF documents are scriptable and could indeed be used to retrieve the highlighted parts.

OK, yes, I see that now, thanks!

So if I want to make a table of contents for an RTF doc that uses bolded text for my headings, I can do something like this…

set boldedText to every attribute run whose (font contains "Bold")

…to give me an array of all of the bolded text, nice!

Now I just need to do something with it:

  • get the line number of each bolded text so I can create a link to it
  • create a new RTF doc in the same group as the selected doc.

Does anyone know of sample scripts that do these?

The easiest solution is probably to iterate the paragraphs and then to use the attribute runs of each paragraph.

If converting the function into a script is not possible, one suggestion is to add a dictionary item such as “get highlights” that returns a list of highlighted texts within a selection of files (PDF+text and RTF). However, a list of records is perhaps more desirable: for each record in the list, the first field is the individual piece of highlighted text and the second field is the page/paragraph URL of that text. This data structure would allow users to script the list for different purposes, such as breaking it down into multiple RTF files for simple qualitative analysis, or approximating the function of the wonderful script “annotation and pane v3.0” without needing to worry about future compatibility.
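
To make the idea concrete, here is a sketch of what such a list of records might look like and how a script could walk through it. Everything in it is hypothetical: there is no “get highlights” command today, and the record labels (highlightText, pageLink) and the example links are placeholders I made up.

```applescript
-- Hypothetical result of a "get highlights" command.
-- Labels and link values are placeholders, not part of the DT dictionary.
set theHighlights to {}
set end of theHighlights to {highlightText:"First highlighted passage", pageLink:"x-devonthink-item://EXAMPLE-UUID?page=0"}
set end of theHighlights to {highlightText:"Second highlighted passage", pageLink:"x-devonthink-item://EXAMPLE-UUID?page=3"}

-- With this shape, a script can handle each highlight on its own,
-- e.g. build one line of text per block before writing it to a file.
set theLines to {}
repeat with h in theHighlights
	set end of theLines to (highlightText of h) & " <" & (pageLink of h) & ">"
end repeat
return theLines
```

Keeping the text and its link together in one record means a script never has to guess which link belongs to which block.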

Just an idea. I don’t think this is a high priority request.

OK, I got curious, and had to see if I could script this! I started out with the idea to re-create the “summarize highlight” command, but make it work for any bolded text (since I use that for my headings). I ended up making a “jump to heading” script instead. It looks for all paragraphs that are bolded, and makes a list of them. You can then select one of the headings and your cursor will jump to it.

Here’s what it looks like running:

It’s pretty clunky code, and a slow script, so I don’t know how useful it is. But it was fun to figure out! Maybe someone else can do better…?

-- Test script: Jump to headings (bolded text) in document
-- Daniel Sroka, 2019-11-01

tell application id "DNtp"
	try
		set itemURL to get the reference URL of (item 1 of (selection as list))
		tell text of think window 1
			set theList to {} -- heading texts shown in the chooser
			set thePos to {} -- paragraph index of each heading
			set paraCount to 0
			repeat with thisParagraph in every paragraph
				set len to the number of characters of thisParagraph
				if (font of thisParagraph contains "Bold" and len > 1) then -- only finds paragraphs that start bold
					if len > 30 then
						-- truncate long headings (AppleScript indices start at 1)
						set thisHeading to characters 1 thru 30 of thisParagraph
					else
						set thisHeading to thisParagraph
					end if
					set end of theList to thisHeading as text
					set end of thePos to paraCount
				end if
				set paraCount to paraCount + 1
			end repeat
		end tell
		set theChoice to choose from list theList with prompt "Choose which heading to jump to: " OK button name "Go To"
		if theChoice is false then return -- user cancelled the dialog
		set choiceNum to my list_position(item 1 of theChoice, theList)
		set choicePos to item choiceNum of thePos as text
		set itemURL to itemURL & "?reveal=1&line=" & choicePos
		open location itemURL
	on error errMsg
		display alert "Jump to heading" message errMsg
	end try
end tell

-- Returns the 1-based position of this_item in this_list, or 0 if absent
on list_position(this_item, this_list)
	repeat with i from 1 to the count of this_list
		if item i of this_list is this_item then return i
	end repeat
	return 0
end list_position

But it was fun to figure out!

This is so much the point of learning AppleScript IMHO.

Maybe someone else can do better…?

Perhaps, but don’t let it diminish your good feelings. If the script does what you want it to do and you feel accomplished… ride the wave, my man!
:slight_smile: :wink: