Import MS Word Documents without RTF conversion

I want to import a lot of PDF and MS Word files, keep them inside the database, and delete the original files. However, when an MS Word file is imported it is also converted to an RTF file and loses its tables and images (this is a very destructive import!). How do I transfer the file into the database (not indexing) so I can open it with Word (Open with…) in its original form? Does DEVONthink modify the content of any other file type when it is imported?

As we also explain in our FAQ, the Word file format is proprietary to Microsoft, and even Apple Pages and OpenOffice.org do just a 90 percent job of reading it. Importing Word documents without converting them to RTF is not possible at the moment (maybe Christian has some ideas?). Alternative: “Print” them to the database as PDF using our PDF Services script. This keeps the layout, but the file is no longer editable.

DEVONthink 2.0 will store the original Word file as-is and use RTF only for displaying it.

Eric.

As Eric noted, it is not currently possible to store your Word files inside the database package. They will remain externally linked in the Finder and should not be deleted.

At this time, Word files are the only file type that cannot be copied into the database or the database Files folder when one uses the File > Import > Files & Folders capture mode. (Note, though, that one must check an option in Preferences > Edit for “unknown” file types to be copied into the database Files folder.)

As Eric noted, the option of Importing (copying) Word files into the database Files folder will become available in version 2.0. The Index mode will also continue to be available in version 2.0, leaving Indexed files external to the database.

Please note the following consequences of how Word files are captured into your database:

If you have Imported Word files using File > Import > Files & Folders (or drag & drop to DT/DT Pro), you should use Launch Path to open the original .doc file in Word, in order to edit the file and save it. Note that the edits will NOT be reflected in the database unless the modified document is imported into DT/DT Pro again.

If you have Indexed Word files using File > Index (or Command-Option-drag & drop), you should likewise use Launch Path to edit the original .doc file in Word. In this case, the changes to the edited and saved Word document will be shown in the DT/DT Pro database the next time that document is displayed in the database.

One-way synchronization from edited and saved Indexed (externally linked) files to their document display in the database is automatic. And of course the modified text content is available for searching and analysis.

The same one-way synchronization also applies to the file types that are copied into the database Files folder by the Import mode. For example, one can select a PDF document in the database, choose Actions > Launch Path, annotate the PDF in Preview, and then save the PDF. The next time that PDF document is selected in your database, the annotation will be visible. The same is true of an image file: you may use Open With, select an image editing application, and make changes. The next time the image is displayed in your database, the changes will be visible.
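Because edits to an Imported Word file only show up after the file is imported again, it can help to list which originals have changed since they were captured. The following is a minimal sketch in Python and not a DEVONthink feature: the function name `stale_word_files` is hypothetical, and it assumes you have noted the time of your last import yourself (as seconds since the epoch). It only compares modification times of .doc files on disk.

```python
from pathlib import Path


def stale_word_files(folder, imported_at):
    """Return .doc files under `folder` modified after `imported_at`.

    `imported_at` is an assumption of this sketch: a timestamp (seconds
    since the epoch) that you recorded when you last imported the folder.
    """
    stale = []
    for path in sorted(Path(folder).rglob("*")):
        # Only look at regular files with a .doc extension (case-insensitive).
        if path.is_file() and path.suffix.lower() == ".doc":
            # A newer modification time means the file was edited after import.
            if path.stat().st_mtime > imported_at:
                stale.append(path)
    return stale
```

The returned paths are the candidates to import again. Indexed files would not need this check, since their changes are picked up automatically.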

Thanks for the fast support and answers. From your discussion, my best options before DT 2.0 are:

  1. Convert to PDF those Word files that I won’t be editing and would like to import into the database.

  2. Index those Word files that I will be editing, and keep them outside the database.
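For a large folder, the two options above could be planned with a short script before anything is captured. This is only a sketch in Python under stated assumptions: the function name `plan_word_capture` and the `will_edit` list of file names are made up for illustration, and the script does not talk to DEVONthink or convert anything. It only sorts .doc files into the two groups.

```python
from pathlib import Path


def plan_word_capture(folder, will_edit):
    """Split .doc files under `folder` into two groups.

    `will_edit` is a hypothetical list of file names you still expect to
    edit in Word; those go to "index", everything else to "convert_to_pdf".
    """
    will_edit = {name.lower() for name in will_edit}
    plan = {"convert_to_pdf": [], "index": []}
    for path in sorted(Path(folder).rglob("*")):
        # Skip folders and anything that is not a .doc file.
        if not path.is_file() or path.suffix.lower() != ".doc":
            continue
        key = "index" if path.name.lower() in will_edit else "convert_to_pdf"
        plan[key].append(path)
    return plan
```

The "convert_to_pdf" group would then be printed to the database as PDF by hand, and the "index" group captured with File > Index so the originals stay outside the database.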