Split-ter for Kindle's "My Clippings" file

The following AppleScript is one I found via this article for parsing Kindle’s “My Clippings” text file, which contains highlights, notes, etc.:

tell application "DEVONthink Pro"
	set theSelection to the selection
	if theSelection is {} then error "Please select some contents."
	display dialog "Enter the desired text delimiter (or nothing to break at each paragraph):" default answer "" buttons {"OK"} default button 1
	set SplitPointRegEx to text returned of the result
	if SplitPointRegEx is equal to "" then set SplitPointRegEx to ASCII character 10
	set OldDelimiters to AppleScript's text item delimiters
	repeat with CurrentItem in theSelection
		set AppleScript's text item delimiters to SplitPointRegEx
		set theSource to the plain text of CurrentItem
		set RepeatCount to 0 as integer
		set TotalCount to (count each text item of theSource) as integer
		repeat until RepeatCount is equal to TotalCount
			set RepeatCount to RepeatCount + 1
			set CurrentText to (text item RepeatCount of theSource)
			if length of CurrentText is greater than 0 then
				create record with {name:CurrentText, type:txt, plain text:CurrentText}
			end if
		end repeat
	end repeat
	
	set AppleScript's text item delimiters to OldDelimiters
end tell

The problem with the script, as you can see, is how it names things. Each record is named after its entire text, so filenames are enormous (proportional to the size of the highlighted quote, for instance). I know absolutely no AppleScript and am wondering if one of you kind people might be able to easily alter the script to name the txt file based on only the first two lines of the “CurrentText” object?

For instance, the highlight


Annals Of the Former World (John McPhee)
- Your Highlight on page 20 | Location 297-299 | Added on Saturday, September 20, 2014 1:41:46 PM

There seemed, indeed, to be more than a little of the humanities in this subject. Geologists communicated in English; and they could name things in a manner that sent shivers through the bones.

would return the filename:


Annals Of the Former World (John McPhee) - Your Highlight on page 20 | Location 297-299 | Added on Saturday, September 20, 2014 1:41:46 PM.txt
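A minimal sketch of the naming rule described above, assuming the first two non-empty lines of each clipping are the book line and the "Your Highlight" line. The handler name titleFromClipping is hypothetical and not part of the original script; it uses only plain AppleScript, so it can be tried in Script Editor without DEVONthink.

```applescript
-- Hypothetical helper: build a record name from the first two
-- non-empty lines of a clipping, joined by a single space.
on titleFromClipping(clippingText)
	set titleLines to {}
	-- "paragraphs" splits on CR, LF, or CRLF line endings
	repeat with aLine in paragraphs of clippingText
		set lineText to contents of aLine
		-- skip blank lines, such as the one that can lead each clipping
		if lineText is not "" then set end of titleLines to lineText
		if (count of titleLines) is 2 then exit repeat
	end repeat
	-- join the collected lines, then restore the previous delimiters
	set oldDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to " "
	set theTitle to titleLines as text
	set AppleScript's text item delimiters to oldDelimiters
	return theTitle
end titleFromClipping
```

Inside the tell block, the handler would be called as my titleFromClipping(CurrentText), and the result passed as the name when creating the record.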

Here’s one method:

tell application id "DNtp"
	set theSelection to the selection
	if theSelection is {} then error "Please select some contents."
	display dialog "Enter the desired text delimiter (or nothing to break at each paragraph):" default answer "" buttons {"OK"} default button 1
	set SplitPointRegEx to text returned of the result
	if SplitPointRegEx is equal to "" then set SplitPointRegEx to ASCII character 10
	set OldDelimiters to AppleScript's text item delimiters
	repeat with CurrentItem in theSelection
		set AppleScript's text item delimiters to SplitPointRegEx
		set theSource to the plain text of CurrentItem
		set RepeatCount to 0 as integer
		set TotalCount to (count each text item of theSource) as integer
		repeat until RepeatCount is equal to TotalCount
			set RepeatCount to RepeatCount + 1
			set CurrentText to (text item RepeatCount of theSource)
			if length of CurrentText is greater than 0 then
				if length of CurrentText is less than 20 then -- set "20" to whatever you want as max length
					set theTitle to CurrentText
				else
					set theTitle to texts 1 thru 20 of CurrentText
				end if
				create record with {name:theTitle, type:txt, plain text:CurrentText}
			end if
		end repeat
	end repeat
	
	set AppleScript's text item delimiters to OldDelimiters
end tell

The problem I have with this script is that it creates a separate file for each line in the source file, specifically each block of text terminated by \n (ASCII 10). Perhaps that’s just the case with the sample you posted?
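To get one file per clipping rather than one per line, the split has to happen on whatever separates whole clippings. The sketch below assumes the clippings file puts a line of ten equals signs between entries; that separator and the handler name splitClippings are assumptions, so check the actual file before relying on them. The same string could simply be typed into the script's delimiter dialog.

```applescript
-- Hypothetical helper: split the source text on a separator string
-- and return the non-empty pieces as a list.
on splitClippings(sourceText, separatorLine)
	set oldDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to separatorLine
	set rawItems to text items of sourceText
	set AppleScript's text item delimiters to oldDelimiters
	set clippings to {}
	repeat with anItem in rawItems
		set itemText to contents of anItem
		-- drops only truly empty items; a piece that holds nothing
		-- but line breaks is still kept, as in the script above
		if itemText is not "" then set end of clippings to itemText
	end repeat
	return clippings
end splitClippings

-- Usage, with the assumed separator:
-- set allClippings to splitClippings(theSource, "==========")
```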

Thanks, korm.

I tried changing the max length to “255”, which I believe is the character limit for filenames in OS X, so that each file would have a unique name. Unfortunately the clip I provided above resulted in a file with the title “Annals of the Form”, and so did all the other annotations from that book.

I wonder if the problem is that each individual text file produced by the script contains a blank first line? Perhaps the blank line is being counted as a certain number of characters?

Not likely

Did you use 255 in quotes?

That’s a very very very long file name. If nothing else, makes for difficult-to-read tooltips, and probably other problems.

That line is


if length of CurrentText is less than 255 then -- set "20" to whatever you want as max

I altered it to the max length to see how long a filename would have to be to ensure its uniqueness, and for some reason it didn’t alter the length of the filename at all.

Here is what resulted:

Aaargh :blush:

I forgot to tell you to change this term, also


else
    set theTitle to texts 1 thru 20 of CurrentText
end if

Sorry. The result was that the title length stayed hardcoded at 20 characters, no matter what the first number was changed to.
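One way to avoid having the limit in two places is to pass it in once and use it for both the comparison and the truncation. This is only a sketch; the handler name truncatedTitle is hypothetical, and it uses the standard text 1 thru n form on an ordinary string.

```applescript
-- Hypothetical helper: return sourceText unchanged if it fits,
-- otherwise only its first maxLength characters.
on truncatedTitle(sourceText, maxLength)
	if (length of sourceText) is less than or equal to maxLength then
		return sourceText
	else
		return text 1 thru maxLength of sourceText
	end if
end truncatedTitle
```

Inside the tell block this would be called as my truncatedTitle(CurrentText, 20), so changing the one number changes the resulting name length. Note that any line break at the start of CurrentText counts toward the limit.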

My fault - I should’ve seen that, as well.

Thanks.

I feel like a complete idiot but does this still work? I have no idea what I’m doing and just want to verify that success is possible.