When OCRing a paper document with Readdle’s Scanner Pro, the resulting electronic document is of the kind PDF+Text.
The text part is not always perfect. Is it possible to edit this text component? If there is no way to do this in DTP, does anybody know of external (command-line) tools to edit or inject other text?
Not a perfect solution, but you can convert the PDF+Text document to an RTF or plain text document (Data > Convert). The converted document can then be corrected so that accurate text is available for See Also, etc. (and of course you can embed a link to the original).
Not ideal, but perhaps worth it for some documents.
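If you go the plain-text route, the repetitive corrections can be scripted outside DTP. Here's a rough Python sketch, assuming you've already exported the text to a file. The correction table, function names and file names are made up for illustration, and this only fixes the exported text; it doesn't touch the text layer inside the PDF.

```python
import re

# Hypothetical correction table: OCR misreading -> intended word.
# Fill this with the mistakes you actually see in your own scans.
CORRECTIONS = {
    "lnvoice": "Invoice",
    "Tota1": "Total",
}


def correct_ocr_text(text, corrections=CORRECTIONS):
    """Replace whole-word OCR misreadings using the correction table."""
    if not corrections:
        return text
    # Longest keys first, so a longer misreading wins over a shorter prefix.
    keys = sorted(corrections, key=len, reverse=True)
    # The lookarounds keep matches from starting or ending inside a longer word.
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    return pattern.sub(lambda m: corrections[m.group(1)], text)


def correct_file(src_path, dst_path, corrections=CORRECTIONS):
    """Read the exported text file, correct it, and write a new file."""
    with open(src_path, encoding="utf-8") as src:
        fixed = correct_ocr_text(src.read(), corrections)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(fixed)
```

Calling `correct_file("exported.txt", "corrected.txt")` would write a corrected copy next to the export, which you could then bring back into DTP alongside the original PDF.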
PDF Expert for the Mac is not cheap, but it allows you to edit the text of a PDF. I’ve used this in a surprisingly large number of instances and it’s basically like magic. If you find yourself needing to edit the text of PDFs often, it’s worth the investment for that feature alone.
Actually PDF Pen from Smile allows you to edit the OCR layer (that is, the “hidden” layer of recognized text). I haven’t done a ton of that, so the sample is small, but I’ve yet to encounter a PDF with an OCR layer it can’t edit. (I have encountered some where the OCR layer is hopelessly garbled, but that’s not PDF Pen’s problem!)
Here’s an invoice scanned and OCR’d with Scanner Pro, viewed in PDF Pen Pro using the “View OCR Layer” view.
The scan is obviously black and white and the OCR’d text is represented in blue; you can just barely see the ghost of the actual image below the OCR layer. All of that blue text is editable, but changes made to the OCR layer obviously won’t be reflected in the image layer (but, of course, they will be reflected in searches!).
Now if only we could get all those features into a single app!