I have been working on a script (see below) to get Microsoft Word to split a large RTF file into smaller files (splitting the document where it finds a date). Everything works fine except that I would like the RTF files created to fit to the width of the view/window in DT, as happens when you create new RTF files in DT. At present the RTF files have a fixed width which doesn’t fill the space. I am aware that I could manually drag the right margin tab in the view to the right edge in DT to make the text fit the view, but because we are talking about a very large number of RTFs I would rather avoid this.
Ideally there would be a way to script MS Word to change the formatting of the RTF files before saving, but I have not found a way to do so, and it may not be possible.
So then I wondered whether it would be possible to adapt the script so that the new RTFs are created directly in DT, i.e. each section of text identified in the script is copied to the clipboard and a new RTF file is created in DT from the clipboard. Hopefully this way the RTF files would have the formatting I would like.
But I am open to any suggestions from those who are more in the know.
Thanks
Nick
property monthNameList : "JanFebMarAprMayJunJulAugSepOctNovDec"
property isEmptyLineAfterDateLine : true
property destinationFolder : missing value
set sourceFile to choose file
set newDocCount to 0
tell application "Microsoft Word"
	open sourceFile
	set aDoc to active document
	set docName to name of aDoc
	set numberOfParagraphs to count paragraphs of aDoc
end tell
-- Name the output folder after the source document, minus its extension
if docName ends with ".rtf" then set docName to text 1 thru -5 of docName
set destinationFolder to ((path to desktop as text) & docName)
do shell script "/bin/mkdir -p " & quoted form of POSIX path of destinationFolder
repeat with i from 1 to numberOfParagraphs
	tell application "Microsoft Word" to set textValue to content of text object of paragraph i of aDoc
	set {dy, mn, yr} to checkDate(textValue)
	if dy is not false then
		-- A date line closes the previous section (if any) and starts a new one
		if newDocCount is not 0 then
			makeNewDocument(aDoc, low, i - 1, fileName)
		end if
		set low to i + 1 + (isEmptyLineAfterDateLine as integer)
		set newDocCount to newDocCount + 1
		set fileName to yr & "-" & mn & "-" & dy & ".rtf"
	else if i = numberOfParagraphs and newDocCount is not 0 then
		-- Last paragraph: save the final section (skipped if no date line was ever found)
		makeNewDocument(aDoc, low, i, fileName)
	end if
end repeat
on makeNewDocument(aDoc, fromParagraph, toParagraph, fName)
	tell application "Microsoft Word"
		-- Range covering paragraphs fromParagraph thru toParagraph
		set myRange to create range aDoc start (start of content of ¬
			text object of paragraph fromParagraph of aDoc) end (end of content ¬
			of text object of paragraph toParagraph of aDoc)
		select myRange
		copy object selection
		-- Paste the section into a new document and save it as RTF
		set newDoc to make new document
		paste object text object of newDoc
		save as newDoc file name (destinationFolder & ":" & fName) file format format rtf
		close front document
	end tell
end makeNewDocument
on checkDate(theString)
	-- Expects "day month year" separated by spaces, e.g. "12 March 2009";
	-- returns {"12", "03", "2009"}, or {false, false, false} if the line is not a date
	set {TID, text item delimiters} to {text item delimiters, space}
	try
		set {dy, mn, yr} to text items of theString
		if (count mn) < 3 or (count yr) < 4 then error
		dy as integer -- raises an error if the day is not numeric
		set dy to text -2 thru -1 of ("0" & dy)
		set monthOffset to offset of (text 1 thru 3 of mn) in monthNameList
		if monthOffset = 0 then
			error
		else
			set mn to text -2 thru -1 of ("0" & (monthOffset div 3) + 1)
		end if
		-- Trim anything after the four-digit year, e.g. a trailing paragraph mark
		if (count yr) > 4 then set yr to text 1 thru 4 of yr
		yr as integer -- raises an error if the year is not numeric
		set text item delimiters to TID
		return {dy, mn, yr}
	on error
		set text item delimiters to TID
		return {false, false, false}
	end try
end checkDate
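For anyone puzzling over the month lookup in checkDate: it finds the first three letters of the month name inside the packed string "JanFebMar..." and turns the character offset into a month number. Here is a minimal standalone sketch of that idea. The handler name monthNumberFor is hypothetical (it is not part of the script above), and the extra alignment check is my own addition, so treat it as an illustration rather than the script's actual behaviour.

```applescript
-- Hypothetical standalone handler illustrating the packed-string month lookup
property monthNameList : "JanFebMarAprMayJunJulAugSepOctNovDec"

on monthNumberFor(monthName)
	-- Need at least three characters to compare against the packed string
	if (count monthName) < 3 then return missing value
	set monthOffset to offset of (text 1 thru 3 of monthName) in monthNameList
	-- 0 means no match; a match that does not start on a 3-character boundary
	-- (for example "anF") is a false hit, so it is rejected as well
	if monthOffset = 0 or (monthOffset - 1) mod 3 is not 0 then return missing value
	-- Offsets 1, 4, 7, ... map to months 1, 2, 3, ...; pad to two digits
	return text -2 thru -1 of ("0" & ((monthOffset div 3) + 1))
end monthNumberFor

monthNumberFor("March") -- "03"
```

The original handler skips the alignment check, which is fine for well-formed date lines but would treat a word starting with "anF" as January.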
I don’t have Word, but you might want to cosy up to “attribute runs” in AppleScript:
tell application "DEVONthink Pro"
	get properties of every attribute run of rich text of content 1 of database 1
end tell
There are only a limited number of RTF attributes that Apple/DTPO supports, and they are:
alignment
background color
baseline offset
class
color
first line head indent
font
head indent
line spacing
maximum line height
minimum line height
paragraph spacing
properties (i.e. all of these in record format)
size
superscript
tail indent
text
underlined
URL
You probably want to look at the “indent” properties, which I believe control the margins.
You might have better luck if you open your DOC files in TextEdit and manipulate them from there, and I’d look at Matt Neuburg’s book on AppleScript for advice on dealing with RTF scripting.
It turns out that you can get the text to flow to fit the width by passing through an HTML phase, and then converting back to RTF.
property pstrTempFile : "tmp.rtf"
if not DT2Running() then
	display dialog "Start DT2"
	return
end if
tell application "Microsoft Word"
	set oDoc to active document
	set refParas to a reference to paragraphs of oDoc
	set lstText to text object of refParas
	set lngParas to length of lstText
	if lngParas < 1 then return
	set strDocName to name of oDoc
	set {oGroup, oWin} to my GetGroupWin()
	-- CREATE A GROUP NAMED AFTER THE WORD DOCUMENT
	tell application "DEVONthink Pro"
		if (count of parents of oGroup) is 0 then
			set oLocn to (create location strDocName in database of oGroup)
		else
			set oLocn to (create location (location of oGroup & "/" & name of oGroup & "/" & strDocName) in database of oGroup)
		end if
	end tell
	-- THE TEMPORARY FILE LIVES NEXT TO THIS SCRIPT
	tell application "Finder"
		set strTempFolder to (container of (path to me)) as string
		set strTempFile to strTempFolder & pstrTempFile
		set oFile to strTempFile as file specification
		set strPosixPath to POSIX path of strTempFile
	end tell
	repeat with iPara from 1 to lngParas
		set oText to item iPara of lstText
		copy object oText
		set dataRTF to the clipboard as «class RTF »
		-- set the clipboard to dataRTF
		-- set strRTF to (do shell script "pbpaste -Prefer rtf")
		-- WRITE THE RTF OUT TO A TEMPORARY FILE
		tell application "Finder"
			open for access oFile with write permission
			-- empty the file first so a shorter paragraph does not keep stale bytes
			set eof oFile to 0
			write dataRTF to oFile as «class RTF »
			close access oFile
		end tell
		-- USE TEXTUTIL TO GET AN HTML VERSION OF THE RTF
		-- (HTML can be easier to read and transform,
		-- and TEXTUTIL can be used again later to convert from HTML back to RTF)
		-- quoted form keeps the command working when the path contains spaces
		set strHTML to do shell script "textutil -convert html -stdout " & quoted form of strPosixPath
		tell application "DEVONthink Pro"
			set oHTML to create record with {type:html, source:strHTML} in oLocn
			set oNewRec to convert record oHTML to rich
			delete record oHTML
			set name of oNewRec to "Para " & iPara as string
		end tell
	end repeat
end tell
on DT2Running()
	tell application id "com.apple.systemevents"
		(count of (processes where creator type = "DNtp")) > 0
	end tell
end DT2Running
on GetGroupWin()
	tell application "DEVONthink Pro"
		-- CURRENT GROUP, IF THERE IS ONE
		set oGroup to missing value
		with timeout of 1 second
			try
				set oGroup to current group
			end try
		end timeout
		-- ELSE CURRENT DATABASE, IF THERE IS ONE
		try
			oGroup
		on error
			set oGroup to (root of database id 1)
			set oWin to open window for record oGroup
			return {oGroup, oWin}
		end try
		if oGroup is missing value then
			set oGroup to (root of database id 1)
			set oWin to open window for record oGroup
			return {oGroup, oWin}
		end if
		-- ENSURE THAT A WINDOW IS OPEN FOR THIS GROUP
		set {oDb, strID} to {database, id} of oGroup
		set lstWins to viewer windows where id of its root is strID and name of its root is name of oDb
		if length of lstWins < 1 then
			set oWin to open window for record oGroup
		else
			set oWin to first item of lstWins
		end if
		{oGroup, oWin}
	end tell
end GetGroupWin
Simply converting an existing RTF record to HTML and back again will achieve the same thing (eliminating the right-hand margin, and setting the text to flow to fit).
-- Assuming oRTFRec refers to an existing RTF record in DT 2 ...
tell application "DEVONthink Pro"
	set strName to name of oRTFRec -- keep the name before the original is deleted
	set oHTMLRec to convert record oRTFRec to html
	set oNewRec to convert record oHTMLRec to rich
	delete record oRTFRec
	delete record oHTMLRec
	set name of oNewRec to strName
end tell
I amended your conversion script to work on a bunch of RTFs in DT. The script also changes the font to Optima 18.
Works a treat.
tell application id "com.devon-technologies.thinkpro2"
	try
		set these_items to the selection
		if these_items is {} then error "Please select some contents."
		repeat with this_item in these_items
			set theDate to the modification date of this_item
			set Name_Rec to name of this_item
			set oHTMLRec to convert record this_item to html
			set oNewRec to convert record oHTMLRec to rich
			delete record this_item
			delete record oHTMLRec
			set name of oNewRec to Name_Rec
			set theWin to open window for record oNewRec
			tell text of theWin to set {font, size} to {"Optima", 18}
			close theWin with saving
			set the modification date of oNewRec to theDate
		end repeat
	on error error_message number error_number
		if the error_number is not -128 then display alert "DEVONthink Pro" message error_message as warning
	end try
end tell