Is it possible to 1) view the Concordance for one specific item (an OCRed PDF, in this case) and 2) edit the OCRed text for that PDF? I’ve got some PDFs that didn’t get OCRed quite right and I’d like to correct a couple mistakes.
I’ve been wishing for many years for a practical way to edit existing PDFs to correct OCR errors. I’m still waiting.
It is, however, possible to produce an editable version of the text content of a PDF in DEVONthink Office: Data > Convert > to rich (or plain) text. This generates a new, searchable text document.
You might then create a new Annotation note associated with a PDF named “X”. The name of the Annotation document will be “X Annotation”, which you might change to “X Annotation OCR Corrections”. Copy and paste the converted text into the Annotation note; the note will be linked to the PDF, and also linked from the PDF.
Scenario: The PDF was a contract, but OCR failed to capture several terms correctly. Of course, the PDF image layer is faithful to the original scanned contract, but if those terms are important for searches, information has been lost. An Annotation note containing the edited and corrected text will allow searches to find the “lost” terms.
I’m filling out PDF service orders on my iPad and emailing them to the customer, with a copy to me. Mail saves the attachment and deletes the email. A folder action imports the PDF into DTO and OCRs it. I want to be able to search by serial number or equipment type, but those two items invariably get garbled.
You’re right, yours is not a perfect solution, but with a little work it fits into my workflow. I think, though, I’ll take a look at creating a template that has those items – and then link that to each service order.
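For what it's worth, here is a rough sketch of what such a template could produce, written in Python purely for illustration. It does not talk to DEVONthink at all: the function names, the field labels, and the example values are all hypothetical, and the resulting text file would still have to be imported and linked to the service order by hand (or by whatever script already drives the folder action).

```python
from pathlib import Path


def build_correction_note(pdf_name, serial_number, equipment_type):
    """Return (title, body) for a plain-text note holding hand-checked search terms.

    The title follows the "X Annotation OCR Corrections" naming suggested
    earlier in the thread; the field labels are hypothetical.
    """
    title = f"{pdf_name} Annotation OCR Corrections"
    body = "\n".join(
        [
            f"Service order: {pdf_name}",
            f"Serial number: {serial_number}",
            f"Equipment type: {equipment_type}",
        ]
    )
    return title, body


def write_correction_note(folder, pdf_name, serial_number, equipment_type):
    """Write the note as a UTF-8 .txt file in `folder` and return its path."""
    title, body = build_correction_note(pdf_name, serial_number, equipment_type)
    path = Path(folder) / f"{title}.txt"
    path.write_text(body + "\n", encoding="utf-8")
    return path
```

Calling write_correction_note with a folder, the service order's name, and the two values typed in by hand would leave a small searchable text file next to the PDF, so a search for the serial number finds the note even when the OCR layer has garbled it.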