I am new to using DEVONthink, but am finding it very useful for storing information. However, I would find it even more useful if it could handle two-byte languages better.
Although DT can find Japanese text in RTF and TXT files, it is unable to do so for PDF files. Are there any plans to provide improved support for two-byte languages soon?
The most likely reason is that neither TextLightning nor pdftotext is able to retrieve the Japanese text from the PDF documents. If you could send us some example documents, we’ll check if there’s a way to improve this (but it’s actually not that likely).
I am testing the new version 1.9, and PDF2text works excellently. I can now search all Japanese PDF files as if they were written with Roman characters.