There are a great many file formats in the Mac world, and many of them contain text, images and other information.
DT Pro depends heavily on OS X to capture text from, and even render, some of those file types. Which ones? First, many (but not all) text-based files such as plain text, RTF, RTFD, CSV and a good many more. For many of these DT Pro relies on Apple’s Cocoa text code. Second, HTML and WebArchive files, whose text DT Pro captures and which it renders pretty well using Apple’s WebKit code. Third, PDF files, whose text DT Pro can read and which it renders very well using Apple’s PDFKit code.
Some file types such as Microsoft Word .doc files can be partially captured, but not rendered in their original form. OS X can read and capture RTF text from Word, but not all the attributes of a Word document. So DT Pro can’t render the “full” appearance and content of a Word file. To do that would require, in essence, building a Word application (or fully compatible Word application) into DT Pro.
DT Pro can’t even capture the text of Pages, Mellel, PowerPoint, Keynote, OmniOutliner 3 and many other proprietary file types at this time, much less render their files.
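To make the three tiers above concrete, here is a minimal sketch of how one might tabulate them. It is only an illustration of the categories described in this post, not how DT Pro works internally; the file extensions, the `Support` names and the `support_for` helper are all my own assumptions.

```python
from enum import Enum
from pathlib import PurePosixPath


class Support(Enum):
    TEXT_AND_RENDER = "text captured and document rendered"
    TEXT_ONLY = "text partially captured, original form not rendered"
    NONE = "neither captured nor rendered"


# Illustrative mapping only. The extensions are assumptions chosen to
# match the file types named in the post, not a list taken from DT Pro.
SUPPORT_BY_EXTENSION = {
    ".txt": Support.TEXT_AND_RENDER,         # Cocoa text code
    ".rtf": Support.TEXT_AND_RENDER,
    ".rtfd": Support.TEXT_AND_RENDER,
    ".csv": Support.TEXT_AND_RENDER,
    ".html": Support.TEXT_AND_RENDER,        # WebKit
    ".webarchive": Support.TEXT_AND_RENDER,
    ".pdf": Support.TEXT_AND_RENDER,         # PDFKit
    ".doc": Support.TEXT_ONLY,               # Word: text, not full appearance
    ".pages": Support.NONE,
    ".key": Support.NONE,
}


def support_for(filename: str) -> Support:
    """Look up the support tier by extension; unknown types count as NONE."""
    ext = PurePosixPath(filename).suffix.lower()
    return SUPPORT_BY_EXTENSION.get(ext, Support.NONE)
```

For example, `support_for("letter.doc")` returns `Support.TEXT_ONLY`, while any extension missing from the table falls through to `Support.NONE`.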
I expect that future versions of DT Pro will be able to capture text from, and perhaps even render reasonably well, additional file types. To the degree that’s possible, it will help users by expanding the present universe of file types recognized by DT Pro and DT PE, so that the information content of their databases can grow.
At the moment, PDF files offer the most universal way to capture information from unrecognized file types. DT Pro includes a script that uses the “print to PDF” capability of OS X to capture the information content of “unrecognized” documents, including text, images, format and layout. So one’s Pages, Mellel, Keynote, Excel or even Word documents can be captured into a database in that way. That does, of course, require a conversion, hence a duplication of files. But it does permit one to capture important information into a database.
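The decision behind that fallback can be sketched in a few lines. This is a hypothetical helper of my own, not the script that ships with DT Pro: it only plans the step (import directly, or print to PDF first) and does no conversion itself. The `RECOGNIZED` set and the `plan_import` name are assumptions for illustration.

```python
from pathlib import PurePosixPath
from typing import FrozenSet, Tuple

# Assumed set of extensions that can be imported as they are.
# Word files are left out on purpose, since only their text is captured.
RECOGNIZED: FrozenSet[str] = frozenset(
    {".txt", ".rtf", ".rtfd", ".csv", ".html", ".webarchive", ".pdf"}
)


def plan_import(path: str, recognized: FrozenSet[str] = RECOGNIZED) -> Tuple[str, str]:
    """Return (action, file to import) for a document.

    Recognized types are imported directly. Anything else is marked for
    "print to PDF", and the PDF copy (a second file) is what gets imported.
    """
    p = PurePosixPath(path)
    if p.suffix.lower() in recognized:
        return ("import", str(p))
    return ("print-to-pdf", str(p.with_suffix(".pdf")))
```

Calling `plan_import("notes/draft.pages")` yields the `"print-to-pdf"` action with `notes/draft.pdf` as the file to import, which is exactly the duplication of files mentioned above.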
Until very recently there simply has been no word processor above the level of TextEdit that could produce files fully compatible with DT Pro. By fully compatible I mean not only that the text and other content such as layout and images could be viewed in DT Pro just as designed in the word processor, but that the file viewed in DT Pro is completely editable by and synchronized with the word processor, so that changes to the file become available to the database when the file is saved.
Papyrus 12 is the first word processor that meets my “fully compatible” criteria, as it can produce fully editable PDF files. It’s a very competent word processor with good capabilities for outlining and lists using styles, footnotes and endnotes, bibliographic cites (using a built-in database), useful layout options, hyperlink management and capable table and spreadsheet features.
I still like to do most of my writing and editing inside DT Pro using rich text, as my information resources are right at my fingertips. For a while I’ve been using Pages to polish the final version. Now I copy my rich text notes and paste them into Papyrus 12. When images are involved, copy/paste into Papyrus isn’t as convenient as into Pages, as I have to make a second pass to copy/paste in an image.
But that inconvenience is more than made up for by the much more powerful capabilities of Papyrus and by the fact that every feature of the resulting document – text, images, layout, hyperlinks, etc. – is there in my database and is completely editable with the click of a button in the database.
One of my volunteer projects involves generating dozens of PDFs, which are then reviewed by health care professionals who are generous with suggestions for revisions. I started that project using Pages to produce the PDFs. To make a revision I had to find the original Pages file, make and save revisions, then export a new PDF, import it into a database and delete the older version. I quickly changed all the documents from Pages to Papyrus. Now when I get a revision request I find the PDF by title, open it in Papyrus, make the revisions and save the document. There it is in my database: just one file!
By the fifth edit of a document, one’s appreciation for editable PDFs as opposed to the previous Pages workflow has grown very considerably.
When all the PDFs are finished they will be distributed on CD. I’ll use Papyrus to generate a hyperlinked Table of Contents PDF that will link to all of the individual PDFs. And because PDFs are platform-independent, users can choose either to print out the documents or use them on a computer.