change annoying naming conventions when converting

svsmailus · February 16, 2013, 6:02pm

I cannot seem to find how to change DT's naming convention when converting. If I convert an HTML document, it is renamed “docname.html doctype”. So if I convert “file.html” to text, I get “file.html text”.

How do I stop DT doing this? It is not useful for me to have the file renamed as well as converted. This is a real problem when batch converting.

Bill_DeVille · February 16, 2013, 6:52pm

DEVONthink is behaving properly in the example you gave. The converted text file is a text file, not an HTML file.

My solution is to avoid adding filetype extensions to document Names, by choosing the option “Filename without extension” in DEVONthink’s Preferences > Import - Title. (This, however, doesn’t affect the Names of items previously captured under another Preferences setting. DEVONthink will report their Kind as different: .HTML and .text.)

Regardless of my Preferences settings in DEVONthink, if your example files are exported to the Finder their filenames will be differentiated by the filename suffix. That’s an OS X convention.

svsmailus · February 16, 2013, 6:58pm

I have now done this; however, on conversion it still adds the filetype name to the new file. So if I have a file called “Leadership Essentials”, it becomes “Leadership Essentials text”. How do I stop “text” being added to the filename?

Bill_DeVille · February 16, 2013, 8:23pm

You can delete the word “text” from the Name, if you wish. DEVONthink will allow that.

However, I suspect most users prefer to be able to differentiate between versions of a file that are produced by the Data > Convert to… feature. In the past, DEVONthink didn’t change the Name of a document (where the option to include filename extensions was not chosen) that produced a converted or duplicate copy. People complained about that. The developers responded, as in fact there are real differences between, for example, a PDF, a text file containing the text only of that PDF, and an HTML version of that PDF.

My databases contain a lot of PDFs produced by scanning paper copy and running OCR on it. In some cases, blemishes or markups on the paper copy result in OCR text conversion errors that reduce the effectiveness of searches. In those cases I may create a text-only version of the PDF as a companion of the PDF in my database, with the text version edited to correct errors. For example, the string “Ludwig Holstein” might be found only in the edited text conversion of the PDF, and having the Name of the search result the same as that of the PDF (with the addition of “text”) is a cue that I’ve got a PDF with that Name. Suppose it’s a contract or tax-related document. The searchable PDF (even with text recognition errors in the text layer) would be acceptable for legal or documentation purposes, but the text-only document would not be acceptable.

svsmailus · February 16, 2013, 8:31pm

Although I appreciate the usefulness for some users, it causes me difficulties, as DT is part of my workflow and not the only app I use. When exporting items, they have this “extra” in the filename that then needs removing. This is fine with individual documents, but with batches it becomes a real problem. Besides, this feature is fairly superfluous, with the “Kind” column telling you exactly what you are looking at, as well as the change in icon for the document.

Is there a way to turn this feature off?

Bill_DeVille · February 16, 2013, 9:58pm

Not currently.

Note that in the example I cited of an edited text file companion of an OCRed PDF that contains conversion errors, the Kind column in the Search results view would not give a sufficient cue to the existence of the PDF (at least in my workflows).
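
Since the suffix cannot be turned off, one possible workaround for batches is to clean up the filenames after export, outside DEVONthink. The following is only a rough sketch in Python, not a DEVONthink feature: the function names `strip_kind_suffix` and `rename_exports` are made up for illustration, and it assumes the exported files look like “Leadership Essentials text.txt”, i.e. the word “text” sits at the end of the name just before the extension. Other conversion kinds may add different words, so the list of suffixes would need adjusting.

```python
from pathlib import Path


def strip_kind_suffix(filename, suffixes=(" text",)):
    """Return filename with a trailing kind word removed from the stem.

    The default suffix " text" is an assumption based on the examples
    in this thread; add others as needed.
    """
    path = Path(filename)
    stem = path.stem
    for suffix in suffixes:
        # Require something to remain in front of the suffix.
        if stem.endswith(suffix) and len(stem) > len(suffix):
            stem = stem[: -len(suffix)]
            break
    return stem + path.suffix


def rename_exports(folder, suffixes=(" text",), dry_run=True):
    """Rename exported files in folder; return a list of (old, new) names.

    With dry_run=True nothing is renamed, so the list can be checked first.
    """
    changes = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        new_name = strip_kind_suffix(path.name, suffixes)
        if new_name == path.name:
            continue
        target = path.with_name(new_name)
        if target.exists():
            # Skip rather than overwrite an existing file.
            continue
        changes.append((path.name, new_name))
        if not dry_run:
            path.rename(target)
    return changes
```

Calling `rename_exports("/path/to/exported/folder")` (a placeholder path) lists the proposed renames without touching anything; passing `dry_run=False` applies them. Files whose cleaned name would collide with an existing file are left alone.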