Stefan:
[1] I was glad to see Steven Johnson’s praise of DT’s contextual recognition “See Also” function in his NY Times essay. His practice of saving text fragments works well, and there is a logical case for the approach he recommends.
Personally, I don’t find it necessary to break up my existing documents into fragments in order to benefit from “See Also.” In fact, I’ve run into cases where fragmenting a large document would have reduced the serendipitous value of “See Also.”
[2] Capturing text from multicolumn PDFs is a bit tricky, as Preview won’t let you select a single column. The selection runs across the full width of the page, picking up parts of every column, and you can only select one page at a time. Expect to do substantial editing; at the least, you will need to delete extraneous material.
Eric suggested two methods for capturing text from Preview. Of the two, I like the Services route, which can capture plain text to DEVONthink. This method has the advantage that it places the PDF’s path into the Info panel, so that a link from DT to the original PDF file exists. Another major advantage is that the Services > DEVONthink > Append plain text operation is available, which lets you append further selections after the first capture. For example, you could first capture the title page, then append selections from page 14 and page 53 of the PDF file, all into a single DEVONthink text document. If you wish to add links, first change the document formatting to rich text, then set your links.
Disadvantages: Paragraph formatting will usually be lost; each column may become a single paragraph, for example. Styles will be lost. Sometimes words will ‘run together’ and may need editing before DT can search them correctly.
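If you have a lot of captured text to check, a rough script can point out likely run-together words before you edit by hand. This is only a sketch of my own, not anything DT or Preview does: the function name find_run_together and the two patterns it looks for are assumptions, and it will also flag legitimate tokens such as "e.g." or "iPod".

```python
import re

# Heuristic patterns (assumptions for illustration only):
#  - a lowercase letter directly followed by an uppercase one ("methodsResults")
#  - a lowercase letter, sentence punctuation, then a letter with no space ("found.The")
_RUN_TOGETHER = re.compile(r"[a-z][A-Z]|[a-z][.,;:!?][A-Za-z]")


def find_run_together(text):
    """Return the whitespace-delimited tokens that look like merged words."""
    return [token for token in text.split() if _RUN_TOGETHER.search(token)]


if __name__ == "__main__":
    sample = "The methodsResults were found.The end"
    for token in find_run_together(sample):
        print(token)
```

The output is just a list of suspects to review; deciding where the space belongs is still a manual edit.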
[3] As Eric noted, TextLightning is sometimes problematic, perhaps depending on the version of the PDF document to be converted. RTF conversion by TextLightning is much slower than plain text conversion using the pdftotext option, but it preserves the original document’s paragraph formatting and character styles. The converted text may look strange, but it is easier to read, so removing undesired material is much easier than editing the plain text captures from option [2] above.
I have TextLightning 3.1, running under OS X 10.3.7. I have no problems importing RTF text from, for example, PDF versions of articles from Science Magazine. So I can’t account for your difficulties in getting TextLightning to work for you. Note: I generally use TextLightning conversions only for such multicolumn PDF files. More often, I use File > Index import to capture the text of complex PDF files, as the result is easy to read (but not to edit).
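For anyone who would rather script the plain text conversion mentioned under [3], here is a minimal sketch that calls the standalone pdftotext command-line tool (from Xpdf or Poppler). I am assuming that tool is installed and on your PATH; whether TextLightning’s pdftotext option behaves identically is also an assumption. The helper names pdftotext_command and convert are mine. The -f and -l flags set the first and last page, and -layout asks the tool to keep the physical page layout.

```python
import subprocess


def pdftotext_command(pdf_path, txt_path, first_page=None, last_page=None,
                      keep_layout=False):
    """Build the argument list for a pdftotext call (hypothetical helper)."""
    cmd = ["pdftotext"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]   # first page to convert
    if last_page is not None:
        cmd += ["-l", str(last_page)]    # last page to convert
    if keep_layout:
        cmd.append("-layout")            # maintain the physical layout
    cmd += [pdf_path, txt_path]
    return cmd


def convert(pdf_path, txt_path, **options):
    """Run pdftotext; raises CalledProcessError if the tool reports failure."""
    subprocess.run(pdftotext_command(pdf_path, txt_path, **options), check=True)
```

For example, convert("article.pdf", "page14.txt", first_page=14, last_page=14) would write only page 14 to a text file (the file names are placeholders). The result is still plain text, so the disadvantages listed under [2] apply here too.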