Hi
I have a PDF document with non-English characters. When I copy part of the text and paste it into a new plain-text document, the non-English characters change.
This happens only when the source document was OCR’ed inside DEVONthink, not with documents born as PDF/A.
A technical note: before OCR, the words you see in the document don’t exist as text, except in your mind. The page is only a picture of words. No OCR engine is 100% accurate.
That being said, what are your OCR preferences in DEVONthink?
Try this Shortcut to “extract” text from a selected area of the screen, using one of macOS’s built-in services. In my experience working with Chinese text, this shortcut has consistently given better results than the text layer of the PDF itself.
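If it helps to pin down exactly which characters change, a small script can list the code points of the pasted text next to what you expected. Here is a minimal Python sketch, assuming you paste both strings in by hand. The function name and the sample strings are made up for illustration and are not a claim about what DEVONthink's OCR actually produces.

```python
import unicodedata


def describe_differences(expected, pasted):
    """Return (position, expected_info, pasted_info) for each mismatching position."""

    def info(ch):
        # Describe a character by code point and Unicode name.
        if ch is None:
            return "(missing)"
        return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"

    diffs = []
    for i in range(max(len(expected), len(pasted))):
        a = expected[i] if i < len(expected) else None
        b = pasted[i] if i < len(pasted) else None
        if a != b:
            diffs.append((i, info(a), info(b)))
    return diffs


# Hypothetical sample: the same visible word, encoded in two different ways.
expected = "caf\u00e9"   # "é" as a single code point
pasted = "cafe\u0301"    # "e" followed by a combining accent

for pos, want, got in describe_differences(expected, pasted):
    print(pos, want, "->", got)
```

Posting the output for a short passage from the OCR'ed PDF and the same passage typed by hand would show whether the characters are truly misrecognized or only encoded differently.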