Unfortunately, it’s beyond the scope of DT development to capture the text content of .doc Word documents and also render them directly with full editing capabilities. Even if Microsoft were to release the code for that to developers (they have not), the DT applications would become much larger. And then there are all the other applications in this Tower of Babel world for which we would like to provide similar features.
DT doesn’t convert .doc documents. Instead, DT uses a built-in feature of OS X to “read” the text content of .doc files, without the images and without the full formatting and layout of the original.
This does allow searching and analysis of that text content within a database that may also include related information in other file formats such as PDF, HTML, and plain or rich text. That can be very valuable.
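For anyone curious what that kind of text-only “reading” looks like in practice, here is a rough sketch using `textutil`, the command-line converter that ships with OS X. To be clear, this is only an illustration: I’m assuming `textutil` as a stand-in, it is not necessarily the mechanism DT itself uses, and the function names `textutil_command` and `extract_text` are made up for this example.

```python
import shutil
import subprocess
from typing import List, Optional


def textutil_command(doc_path: str) -> List[str]:
    """Build a textutil call that prints a document's plain text to stdout."""
    # -convert txt : ask for plain text (images and layout are dropped)
    # -stdout      : write the result to standard output instead of a file
    return ["textutil", "-convert", "txt", "-stdout", doc_path]


def extract_text(doc_path: str) -> Optional[str]:
    """Return the plain text of doc_path, or None if textutil is not installed.

    Illustration only: textutil is assumed here as a stand-in for the
    OS X text-reading feature; it exists only on the Mac.
    """
    if shutil.which("textutil") is None:
        return None
    result = subprocess.run(
        textutil_command(doc_path),
        capture_output=True,
        text=True,
        check=True,  # raise if textutil reports an error (e.g. missing file)
    )
    return result.stdout
```

On a Mac, `extract_text("report.doc")` would hand back just the words of a (hypothetical) `report.doc`, which is the sort of content that can then be searched and analyzed alongside other documents.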
I agree that images, formatting and layout often provide important information beyond the “raw” text of a document. It’s possible, of course, to save such a document as PDF or HTML to enhance the information content as seen in the database. But that requires extra work on the part of the user, additional storage space, and extra steps to edit the original file and save the changes to the database.
My preferred ‘heavy-duty’ word processor is Papyrus 12, simply because it has a hybrid PDF format that allows me to see the document exactly as it was created, with images, layout, special formatting and so on. That’s because the file is read as PDF in the database (with working links, e.g. to endnotes), but remains fully editable within Papyrus, with edits immediately visible in the database. Wouldn’t it be wonderful if Microsoft were to do something like that?
DEVONtechnologies hopes to add more “known” file types in the future, so as to allow at a minimum text capture from additional file types, and perhaps improved rendering of some of them as well. To the extent that developers “wall off” their products with proprietary file types, that remains a difficult task.