I have several very long documents divided into short notes by a delimiter, such as: ^^^. I’d like to be able to split the document at each delimiter (it would be nice to have the delimiter deleted at the same time). Is there any way to do this in DT3?
A couple of years ago I used Tinderbox’s Explode command. I added a delimiter at the end of each paragraph of a long Word doc with “find and replace” and then imported it into TBX. Now I find TBX to be pretty challenging, but without doubt, Explode was the easiest “slice and dice” in their armory.
Is there any similarly easy way to accomplish this in DT3? If not, could it be added as a feature request? My sense is that many researchers also work with delimited text (paragraphs, subsections, sections) and would appreciate such a feature if it’s not yet available.
DEVONthink has no built-in explode command but this is a perfect case for AppleScript's text item delimiters.
This script creates new records (and a version of the source record without delimiters). I don’t know how you’ve set up your delimiters, so you may have to change theDelimiter.
Edit: This script handles only plain text - if you want to split RTF(D) text use this instead.
-- Explode text into new text records (and create version of source text without delimiters)
property theDelimiter : linefeed & linefeed & "^^^" & linefeed & linefeed
tell application id "DNtp"
try
set windowClass to class of window 1
if {viewer window, search window} contains windowClass then
set currentRecord_s to selection of window 1
else if windowClass = document window then
set currentRecord_s to content record of window 1 as list
end if
set theRecord to item 1 of currentRecord_s
set theText to plain text of theRecord
set d to AppleScript's text item delimiters
set AppleScript's text item delimiters to theDelimiter
set TextItems to text items of theText
set AppleScript's text item delimiters to d -- always set them back
set theGroup to (parent 1 of theRecord)
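repeat with thisTextItem in TextItems
set thisText to contents of thisTextItem
-- Name each record after its first non-empty paragraph ("Untitled" is an assumed fallback)
set theName to "Untitled"
repeat with thisParagraph in (paragraphs of thisText)
if (contents of thisParagraph) ≠ "" then
set theName to contents of thisParagraph
exit repeat
end if
end repeat
set thisRecord to create record with {name:theName, plain text:thisText, type:text} in theGroup
end repeat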
set theTextWithoutDelimiters to my string_From_List(TextItems, linefeed)
set recordWithoutDelimiters to create record with {name:(name of theRecord & " (without Delimiters)"), plain text:theTextWithoutDelimiters, type:text} in theGroup
on error error_message number error_number
if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
return
end try
end tell
on string_From_List(theList, theDelimiter)
set theString to ""
set theCount to 0
repeat with thisItem in theList
set theCount to theCount + 1
set thisItem to thisItem as string
if theCount ≠ (count of theList) then
set theString to theString & thisItem & theDelimiter
else
set theString to theString & thisItem
end if
end repeat
return theString
end string_From_List
This script is a good example, if not the best, of why I will probably never understand why some people are so fond of AppleScript. Other programming languages have a simple split command that takes two arguments: the source text and the delimiter. One line of code and you have an array of items.
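For comparison, here is a minimal sketch of that one-line split in Python. It only illustrates the idea and does not talk to DEVONthink. The delimiter layout is copied from theDelimiter in the script above; the sample text, the function names explode and first_line, and the "Untitled" fallback are assumptions made up for this example.

```python
# Delimiter layout taken from the AppleScript above: "^^^" surrounded by blank lines.
DELIMITER = "\n\n^^^\n\n"


def explode(text, delimiter=DELIMITER):
    """Split text at each delimiter and drop pieces that are empty or whitespace only."""
    return [piece for piece in text.split(delimiter) if piece.strip()]


def first_line(piece, fallback="Untitled"):
    """Return the first non-empty line of a piece, to be used as the note's name."""
    return next((line for line in piece.splitlines() if line.strip()), fallback)


# Hypothetical sample document with three notes.
sample = "Note one\nmore text\n\n^^^\n\nNote two\n\n^^^\n\nNote three"

notes = explode(sample)
titles = [first_line(note) for note in notes]
```

The actual splitting is the single call to text.split(delimiter). Everything else is there only to skip empty pieces and to pick a name for each note, as the AppleScript does with the first paragraph.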
*
The script above handles only plain text which might be a problem.
*
Linn, you might do a search for “Kindle” or “Kindle Clippings” in this forum. Amazon’s Kindle creates a txt file out of highlighted text snippets and comments and divides them with delimiters. And because of that there are a number of topics here about splitting such files into single items which might be useful for you too. (But again: They handle just plain text.)
*
Another recommendation: Leave Word if you can and switch to Scrivener. It is the writing tool for anything more complex than a letter, and it is geared toward writers with lots of research material.
Also, Scrivener’s ‘chapters’ (or whatever the individual parts of the document are called) are by nature single files and can therefore be exported easily to, say, DEVONthink. For the opposite direction, importing and splitting a document that already has delimiters, Scrivener has an “Import and Split” feature which does exactly what its name says, and it is simple to use. Both directions work with rich text too.
By the way, I am not suggesting that you replace DEVONthink with Scrivener. While there is some overlap in functions, they are two different kinds of beasts that complement each other really well.
I ran this script about a year ago on a summary of about 500 abstracts. I just tried it with a small test file and it still works for RTF and plain text files, except for images in the file.
Many thanks for the help with the delimiter problem. I learned more than I anticipated from your replies. I’m not a programmer, so @pete31’s short AppleScript was interesting for learning how that problem could be solved in code. The back and forth with @ngan encourages me to give it a try next time.
@suavito, thanks for the suggestion to search here for Kindle clippings. It hadn’t occurred to me to export them into DT3 – it will encourage me to do more professional reading in Kindle. As for Scrivener, I use it until I get what I’m really writing about, and then finish off in Word, which is also required by many publishers. I knew about making a split or two in a Scrivener document, but not about the Import and Split – which immediately solved the problem and allowed me to pull out the pieces that I needed in DT3. Thanks!
In case you’re trying to run the script I posted in this thread, you’ll find that it doesn’t work in DEVONthink 3.6. That’s due to DEVONthink’s new handling of “invalid arguments”.
After the release of DEVONthink 3 I decided to continue to use “search window” in scripts so that DEVONthink 2 users could use them in, well, search windows. With version 3.6 that’s not possible anymore.
If you want to use the script you’ll have to replace this voluminous block …
set windowClass to class of window 1
if {viewer window, search window} contains windowClass then
set currentRecord_s to selection of window 1
else if windowClass = document window then
set currentRecord_s to content record of window 1 as list
end if
… with this neat line …
set currentRecord_s to selected records
… which does what the six lines have done. Wow, that’s great!