Help needed with a script to create RTF files with max width


I have been working on a script (see below) to get Microsoft Word to split a large RTF file into smaller files, splitting the document wherever it finds a date. Everything works except that I would like the resulting RTF files to fit the width of the view/window in DEVONthink (DT), as happens when you create a new RTF file in DT. At present the RTF files have a fixed width that doesn’t fill the space. I know I could manually drag the right margin marker to the right edge of the view in DT to make the text fit, but with a very large number of RTFs I would rather avoid this.

Ideally there would be a way to script MS Word to change the formatting of the RTF files before saving, but I have not found a way to do so, and it may not be possible.

So then I wondered whether it would be possible to adapt the script so that the new RTFs are created directly in DT, i.e. each section of text identified in the script is copied to the clipboard and a new RTF file is created in DT from the clipboard. Hopefully this way the RTF files would have the formatting I would like.

But I am open to any suggestions from those who are more in the know.



property monthNameList : "JanFebMarAprMayJunJulAugSepOctNovDec"
property isEmptyLineAfterDateLine : true

property destinationFolder : missing value

set sourceFile to choose file

set newDocCount to 0
tell application "Microsoft Word"
	open sourceFile
	set aDoc to active document
	set docName to name of aDoc
	set numberOfParagraphs to count paragraphs of aDoc
end tell
if docName ends with ".rtf" then set docName to text 1 thru -5 of docName
set destinationFolder to ((path to desktop as text) & docName)
do shell script "/bin/mkdir -p " & quoted form of POSIX path of destinationFolder
repeat with i from 1 to numberOfParagraphs
	tell application "Microsoft Word" to set textValue to content of text object of paragraph i of aDoc
	set {dy, mn, yr} to checkDate(textValue)
	if dy is not false then
		if newDocCount is not 0 then
			makeNewDocument(aDoc, low, i - 1, fileName)
		end if
		set low to i + 1 + (isEmptyLineAfterDateLine as integer)
		set newDocCount to newDocCount + 1
		set fileName to yr & "-" & mn & "-" & dy & ".rtf"
	else if i = numberOfParagraphs and newDocCount is not 0 then
		makeNewDocument(aDoc, low, i, fileName)
	end if
end repeat

on makeNewDocument(aDoc, fromParagraph, toParagraph, fName)
	tell application "Microsoft Word"
		set myRange to create range aDoc start (start of content of ¬
			text object of paragraph fromParagraph of aDoc) end (end of content ¬
			of text object of paragraph toParagraph of aDoc)
		select myRange
		copy object selection
		set newDoc to make new document
		paste object text object of newDoc
		save as newDoc file name (destinationFolder & ":" & fName) file format format rtf
		close front document
	end tell
end makeNewDocument

on checkDate(theString)
	set {TID, text item delimiters} to {text item delimiters, space}
	try
		-- A date line is expected to hold three space-separated items: day, month name, year
		set {dy, mn, yr} to text items of theString
		if (count mn) < 3 or (count yr) < 4 then error "Not a date line"
		-- The coercion fails (and control jumps to the error handler) if the day is not numeric
		dy as integer
		set dy to text -2 thru -1 of ("0" & dy)
		set monthOffset to offset of (text 1 thru 3 of mn) in monthNameList
		if monthOffset = 0 then error "Unknown month name"
		set mn to text -2 thru -1 of ("0" & ((monthOffset div 3) + 1))
		-- Drop anything after the four year digits (e.g. the paragraph's trailing return)
		if (count yr) > 4 then set yr to text 1 thru 4 of yr
		yr as integer
		set text item delimiters to TID
		return {dy, mn, yr}
	on error
		set text item delimiters to TID
		return {false, false, false}
	end try
end checkDate
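
The month lookup in checkDate depends on where the three-letter abbreviation falls inside monthNameList. A minimal sketch of that step on its own, assuming a hypothetical helper named monthNumber (it is not part of the script above). It also rejects matches that straddle two month names, such as "anF", which the plain offset test would accept as January:

```applescript
property monthNameList : "JanFebMarAprMayJunJulAugSepOctNovDec"

-- Hypothetical helper, for illustration only:
-- returns "01" .. "12" for a month name, or missing value if it is not recognised.
on monthNumber(monthName)
	if (count monthName) < 3 then return missing value
	set monthOffset to offset of (text 1 thru 3 of monthName) in monthNameList
	-- Real abbreviations start at offsets 1, 4, 7, ...; any other offset is a false match
	if monthOffset is 0 or (monthOffset mod 3) is not 1 then return missing value
	-- Offset 1 -> month 1, offset 4 -> month 2, and so on; then zero-pad to two digits
	return text -2 thru -1 of ("0" & ((monthOffset div 3) + 1))
end monthNumber

monthNumber("March") -- "03"
```

Whether the extra offset check matters depends on the kind of text that can appear in the paragraphs being scanned.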


I don’t have Word, but you might want to cosy up to “attribute runs” in AppleScript:

tell application "DEVONthink Pro"
	get properties of every attribute run of rich text of content 1 of database 1
end tell

There are only a limited number of RTF attributes that Apple/DTPO supports, and they are:

background color
baseline offset
first line head indent
head indent
line spacing
maximum line height
minimum line height
paragraph spacing
properties (i.e. all of these in record format)
tail indent

You probably want to look at the “indent” properties, which I believe control the margins.

You might have better luck if you open your DOC files in TextEdit and manipulate them from there, and I’d look at Matt Neuburg’s book on AppleScript for advice on dealing with RTF scripting.

HTH, Charles

It turns out that you can get the text to flow to fit the width by passing through an HTML phase, and then converting back to RTF.

property pstrTempFile : "tmp.rtf"

if not DT2Running() then
	display dialog "Start DT2"
	return
end if

tell application "Microsoft Word"
	set oDoc to active document
	set refParas to a reference to paragraphs of oDoc
	set lstText to text object of refParas
	set lngParas to length of lstText
	if lngParas < 1 then return
	set strDocName to name of oDoc
	set {oGroup, oWin} to my GetGroupWin()
	tell application "DEVONthink Pro"
		if (count of parents of oGroup) is 0 then
			set oLocn to (create location strDocName in database of oGroup)
		else
			set oLocn to (create location (location of oGroup & "/" & name of oGroup & "/" & strDocName) in database of oGroup)
		end if
	end tell
	tell application "Finder"
		set strTempFolder to (container of (path to me)) as string
		set strTempFile to strTempFolder & pstrTempFile
		set oFile to strTempFile as file specification
		set strPosixPath to POSIX path of strTempFile
	end tell
	repeat with iPara from 1 to lngParas
		set oText to item iPara of lstText
		copy object oText
		set dataRTF to the clipboard as «class RTF »
		-- 		set the clipboard to dataRTF
		-- 		set strRTF to (do shell script "pbpaste -Prefer rtf")
		tell application "Finder"
			open for access oFile with write permission
			write dataRTF to oFile as «class RTF »
			close access oFile
		end tell
		-- (HTML can be easier to read and transform,
		-- and TEXTUTIL can be used again later to convert from HTML back to RTF)
		set strHTML to do shell script "textutil -convert html -stdout " & quoted form of strPosixPath
		tell application "DEVONthink Pro"
			set oHTML to create record with {type:html, source:strHTML} in oLocn
			set oNewRec to convert record oHTML to rich
			delete record oHTML
			set name of oNewRec to "Para " & iPara as string
		end tell
	end repeat
end tell

on DT2Running()
	tell application "System Events"
		(count of (processes where creator type = "DNtp")) > 0
	end tell
end DT2Running

on GetGroupWin()
	tell application "DEVONthink Pro"
		set oGroup to missing value
		try
			with timeout of 1 second
				set oGroup to current group
			end timeout
		on error
			-- No current group, or no answer in time: fall back to the root of the first database
			set oGroup to (root of database id 1)
			set oWin to open window for record oGroup
			return {oGroup, oWin}
		end try
		if oGroup is missing value then
			set oGroup to (root of database id 1)
			set oWin to open window for record oGroup
			return {oGroup, oWin}
		end if
		set {oDb, strID} to {database, id} of oGroup
		set lstWins to viewer windows where id of its root is strID and name of its root is name of oDb
		if length of lstWins < 1 then
			set oWin to open window for record oGroup
		else
			set oWin to first item of lstWins
		end if
		{oGroup, oWin}
	end tell
end GetGroupWin

Simply converting an existing RTF record to HTML and back again will achieve the same thing (eliminating the right-hand margin, and setting the text to flow to fit).

-- Assuming an existing RTF Record in DT 2 ...

tell application "DEVONthink Pro"
	-- Remember the name before the original record is deleted
	set strName to name of oRTFRec
	set oHTMLRec to convert record oRTFRec to html
	set oNewRec to convert record oHTMLRec to rich
	delete record oRTFRec
	delete record oHTMLRec
	set name of oNewRec to strName
end tell

Thanks houthakker.

I amended your conversion script to work on a bunch of RTFs in DT. The script also changes the font to Optima 18.

Works a treat.

tell application id "com.devon-technologies.thinkpro2"
	try
		set these_items to the selection
		if these_items is {} then error "Please select some contents."
		repeat with this_item in these_items
			set theDate to the modification date of this_item
			set Name_Rec to name of this_item
			set oHTMLRec to convert record this_item to html
			set oNewRec to convert record oHTMLRec to rich
			delete record this_item
			delete record oHTMLRec
			set name of oNewRec to Name_Rec
			set theWin to open window for record oNewRec
			tell text of theWin to set {font, size} to {"Optima", 18}
			close theWin with saving
			set the modification date of oNewRec to theDate
		end repeat
	on error error_message number error_number
		if the error_number is not -128 then display alert "DEVONthink Pro" message error_message as warning
	end try
end tell