Convert Evernote Formatted Notes Without Extra Spaces En Masse

I’ve convinced my daughter to import her Ph.D. research materials from Evernote into DEVONthink 3 Pro. She now has thousands of formatted notes in DEVONthink that, when converted to RTF, end up with unwanted blank lines in their bulleted lists. Is there a fast way to eliminate these lines?

I’ve shown her find and replace for the hidden line return character at the end of every row. That seems too labor-intensive for her workflow.

I’ve searched this forum and others and have seen suggestions of AppleScripts that must be applied to notes one by one. Is there a way to do this en masse?

Thank you for your help.

One added note: This problem seems to go away if you copy everything and paste it into a new RTF note. That is also pretty labor-intensive, but it does seem to work better than converting the Evernote-formatted notes.

Do you want to replace \n\n with \n in the whole RTF record?

Yes. Ideally in a collection of files.

Please test with duplicates.

-- Replace multiple linefeeds in RTF list with single linefeed

use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

tell application id "DNtp"
	try
		set theRecords to selected records
		if theRecords = {} then error "Please select some RTF records."
		
		repeat with thisRecord in theRecords
			set thisRecord_Type to (type of thisRecord) as string
			if thisRecord_Type is in {"rtf", "«constant ****rtf »", "rtfd", "«constant ****rtfd»"} then
				set thisRecord_Path to path of thisRecord
				my regexReplaceRTFD("(?<=\\t)\\p{P}(.*?)(\\n+)(?=\\t)", linefeed, thisRecord_Path)
			end if
		end repeat
		
	on error error_message number error_number
		if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
		return
	end try
end tell

on regexReplaceRTFD(theSearchPattern, theReplacementPattern, thePath)
	try
		set theURL to (current application's |NSURL|'s fileURLWithPath:thePath)
		set {theAttributedString, theError} to current application's NSAttributedString's alloc()'s initWithURL:(theURL) options:(missing value) documentAttributes:({NSDocumentTypeDocumentAttribute:(current application's NSRTFDTextDocumentType)}) |error|:(reference)
		if theError ≠ missing value then error (theError's localizedDescription() as string)
		set theMutableAttributedString to theAttributedString's mutableCopy()
		set theMutableAttributedStringChanged to my regexReplaceAttributedString(theMutableAttributedString, theSearchPattern, theReplacementPattern)
		set updateRecord to my writeAttributedString(theMutableAttributedStringChanged, theURL)
	on error error_message number error_number
		activate
		if the error_number is not -128 then display alert "Error: Handler \"regexReplaceRTFD\"" message error_message as warning
		error number -128
	end try
end regexReplaceRTFD

on regexReplaceAttributedString(theMutableAttributedString, thePattern, theReplacementPattern)
	try
		set {theRegex, theError} to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(reference)
		if theRegex = missing value then error theError's localizedDescription() as text
		set theRegex_numberOfCaptureGroups to theRegex's numberOfCaptureGroups()
		set theMutableAttributedString_string to theMutableAttributedString's |string|()
		set theMatches to (theRegex's matchesInString:theMutableAttributedString_string options:0 range:{0, theMutableAttributedString_string's |length|()}) as list
		set theMatches to reverse of theMatches
		repeat with thisMatch in theMatches
			(theMutableAttributedString's replaceCharactersInRange:((thisMatch's rangeAtIndex:theRegex_numberOfCaptureGroups)) withString:theReplacementPattern)
		end repeat
		return theMutableAttributedString
	on error error_message number error_number
		activate
		if the error_number is not -128 then display alert "Error: Handler \"regexReplaceAttributedString\"" message error_message as warning
		error number -128
	end try
end regexReplaceAttributedString

on writeAttributedString(theAttributedString, theURL)
	try
		set theAttributedString_Range to {location:0, |length|:theAttributedString's |length|()}
		if (theAttributedString's containsAttachmentsInRange:theAttributedString_Range) then
			set thisFileWrapper to (theAttributedString's RTFDFileWrapperFromRange:theAttributedString_Range documentAttributes:{NSDocumentTypeDocumentAttribute:(current application's NSRTFDTextDocumentType)})
			set {success, theError} to (thisFileWrapper's writeToURL:theURL options:(current application's NSFileWrapperWritingAtomic) originalContentsURL:(theURL) |error|:(reference))
		else
			set thisData to (theAttributedString's RTFFromRange:(theAttributedString_Range) documentAttributes:{NSDocumentTypeDocumentAttribute:(current application's NSRTFTextDocumentType)})
			set {success, theError} to (thisData's writeToURL:theURL options:(current application's NSDataWritingAtomic) |error|:(reference))
		end if
		if theError ≠ missing value then error (theError's localizedDescription() as string)
		return success
	on error error_message number error_number
		activate
		if the error_number is not -128 then display alert "Error: Handler \"writeAttributedString\"" message error_message as warning
		error number -128
	end try
end writeAttributedString

Thank you Pete!

I’ll send this to her. She’s more programming savvy than I am.

bd

Please tell her that this regex (which is escaped for use in AppleScript):

(?<=\\t)\\p{P}(.*?)(\\n+)(?=\\t)

which unescaped is this:

(?<=\t)\p{P}(.*?)(\n+)(?=\t)

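looks for any punctuation character (\p{P}) that follows a tab, then the rest of that line, then the run of linefeeds (\n+) that comes before the next tab. The script replaces only that run of linefeeds, the last capture group, with a single linefeed.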

It would be safer to use the actual bullet character she’s using in her bulleted lists, e.g.:

(?<=\t)•(.*?)(\n+)(?=\t)

or

(?<=\t)(•|⁃)(.*?)(\n+)(?=\t)

to also match nested lists. Add each bullet character, separated by |.
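If she wants to try the bullet-specific pattern on plain text before running the script on duplicates, something like the following rough sketch may help. It uses Python's re module only as a stand-in for the NSRegularExpression engine the script uses, and it shows only the literal-bullet variant because Python's re does not support \p{P}. The sample text and the function name collapse_list_linefeeds are made up for illustration, and it assumes her list items are tab-indented as in the pattern above.

```python
import re

# Hypothetical sample: a tab-indented bulleted list with extra blank lines,
# roughly what the converted notes are described as containing.
sample = (
    "Intro\n\n"
    "\t• First item\n\n"
    "\t• Second item\n\n\n"
    "\t⁃ Nested item\n"
    "\t• Last item\n"
)

# The literal-bullet pattern from this thread, unescaped.
pattern = re.compile(r"(?<=\t)(•|⁃)(.*?)(\n+)(?=\t)")


def collapse_list_linefeeds(text: str) -> str:
    """Replace the run of linefeeds after each bulleted item with one linefeed.

    Only the last capture group (the linefeeds) is changed; the bullet and
    the item text are put back unchanged.
    """
    return pattern.sub(lambda m: m.group(1) + m.group(2) + "\n", text)


cleaned = collapse_list_linefeeds(sample)
print(cleaned)
```

The blank line after "Intro" is left alone because it is not preceded by a tab and a bullet, and the final item is left alone because no tab follows it. The same pattern in the AppleScript should behave the same way on the text of an RTF record, but that is worth confirming on a duplicate first.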
