Two OCR Questions in DT3

lutefish · May 13, 2019, 9:23pm

I have two vaguely related questions about the OCR functions of DT3 (which, as I wasn’t using DT2 Pro Office, is new to me).

1. I have an indexed folder containing PDFs. Occasionally, I’ll add a new file to that folder and see in DT3 that it’s identified as a “PDF”, not a “PDF+Text.” I can OCR the item without issue by right-clicking and choosing OCR->to searchable PDF. Doing so, however, creates a new version of the file (with a “-1” appended). I’ve checked the “Move to Trash” option in preferences, but the original file remains (with its new twin) in the Finder. Am I missing something on how to automatically delete the original un-OCR’d file and have only the PDF+Text file remain?
2. I sometimes drag images of text from a web browser to notes in DT3. Ultimately, I’d like to OCR the text of these images, so they obviously need to be converted from “Dragged Image.tiff” to something else first. It may be because I’m dragging them into existing RTF notes that I’m getting stuck at this point in the process. Is there a smarter/better/internal way to do the conversion within DT3, rather than opening the image in Preview, exporting it as a PDF, re-importing it to DT3, and running OCR on it?

Many thanks for your help - all of the new features in DT3 are so very welcome.

mr_drlove · May 14, 2019, 1:52pm

Maybe I can help with question 1:

I work the same way with some PDFs. I often store my scanned paper as PDFs, import them into my DTPO structure, and start the OCR. That works for me without producing an additional file.
Sometimes it does happen, though, for one of two reasons:

The PDF is a PDF/A document, which cannot be edited, so a copy is created instead.
The PDF has security settings or is simply “read-only”.

So take a closer look at your PDFs; perhaps one of these cases fits your scenario.

best wishes
mr_drlove

aedwards · May 14, 2019, 3:52pm

The current beta doesn’t delete the original file when OCRing indexed files.
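
In the meantime, the leftover originals can be spotted from outside DT3. The following is only a rough sketch, assuming the OCR’d copy sits in the same folder and is named with the “-1” suffix described above; the function name `find_ocr_twins` is made up for illustration, and the script only lists pairs without deleting anything.

```python
from pathlib import Path


def find_ocr_twins(folder):
    """Return (original, twin) pairs where both 'name.pdf' and 'name-1.pdf' exist.

    Assumption: the OCR'd copy is the original file name with '-1' appended
    before the extension, as described in this thread.
    """
    folder = Path(folder)
    pairs = []
    for pdf in sorted(folder.glob("*.pdf")):
        # Build the expected twin name, e.g. 'scan.pdf' -> 'scan-1.pdf'.
        twin = pdf.with_name(f"{pdf.stem}-1{pdf.suffix}")
        if twin.exists():
            pairs.append((pdf, twin))
    return pairs
```

Calling `find_ocr_twins` with the path of the indexed folder returns the pairs to review by hand. It is worth checking in DT3 that the “-1” file really is the PDF+Text version before removing anything in the Finder.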

We are unable to OCR images embedded within an RTF file; however, we can OCR a TIFF image to PDF, RTF, or Word format.

lutefish · May 14, 2019, 4:20pm

Many thanks for the help on both questions.