That's the rub, and unfortunately the answer is no.
During the OCR process the image layer is recreated.
The default dpi/image quality settings in DTPO2 Preferences > OCR represent a compromise to save the searchable PDF with reasonably readable image quality, but without a huge increase in file size.
Annard discovered that with the ABBYY OCR code he could have a very sharp PDF image and a very compact file size, but at the expense of completely throwing the original image away and substituting a PDF image of the recognized text only. That would not be acceptable, as the image of the original should be considered a faithful representation, whereas the recognized text image could contain errors, leave out images, etc.
If I scan a contract into my database, it's important to me that the image of the contract should be faithful to the paper copy and contain, for example, any handwritten initials and signatures. If there's a dispute about that contract, the image layer is the "master" for resolving the dispute. But the OCR'd text might contain errors resulting from the proximity of handwritten initials overlaying text, or a coffee stain. The image rules.
I've got Preferences > OCR set to 150 dpi and 75% image quality. So I can read that contract comfortably and a printout is readable. Not as sharp as the original, but I can live with it.
I do my ScanSnap scans usually set in ScanSnap Manager for black & white at 600 dpi, and in the "Compression" tab of ScanSnap Manager Settings I've got the Compression slider all the way to the left. That increases the size of the file sent to DTPO2 for OCR (good for recognition accuracy of color content), but doesn't pose much of a penalty in the final searchable PDF stored in my database.
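To put rough numbers on why a 600 dpi scan doesn't carry over into the stored file, here is a minimal sketch of the pixel arithmetic. It assumes a US Letter page (8.5 x 11 inches), which is my assumption and not something from the settings above, and it only counts pixels; the 75% image quality setting and the actual compression are not modeled. The function names are just for illustration.

```python
# Rough pixel arithmetic for a page at different resolutions.
# Assumption: US Letter page, 8.5 x 11 inches. Compression is not modeled.

def page_pixels(dpi, width_in=8.5, height_in=11.0):
    """Return (width_px, height_px) for a page sampled at `dpi`."""
    return round(width_in * dpi), round(height_in * dpi)

def pixel_count(dpi, width_in=8.5, height_in=11.0):
    """Total number of pixels on the page at `dpi`."""
    w, h = page_pixels(dpi, width_in, height_in)
    return w * h

scan_dpi = 600    # resolution of the scan sent to OCR
stored_dpi = 150  # resolution of the recreated image layer

ratio = pixel_count(scan_dpi) / pixel_count(stored_dpi)
print(f"{scan_dpi} dpi page: {page_pixels(scan_dpi)} pixels")
print(f"{stored_dpi} dpi page: {page_pixels(stored_dpi)} pixels")
print(f"The scan has {ratio:.0f}x as many pixels as the stored image")
```

Since pixel count grows with the square of the resolution, going from 600 dpi down to 150 dpi cuts the pixel count by a factor of 16 before any compression is applied. The OCR engine gets the detailed scan, while the database only has to hold the smaller recreated image.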
The view/print image quality of ABBYY's output is better than the output of the IRIS OCR we used in DTPO 1.x, and the accuracy of ABBYY OCR is much better; yet my searchable PDFs produced by ABBYY are considerably smaller than those produced by IRIS in DTPO 1.x.
One of these days the technology will advance so that the sharpness and file size of OCR'd copy will be as good as I can produce by exporting a Pages document as PDF. And (something I've been hoping for for years) one will be able to correct errors in the text layer without modifying the image layer.