Degraded texts after OCR

When I re-OCR articles (PDF+text files downloaded from websites) from within DT3.01 (contextual menu OCR -> to searchable PDF), the quality of the recognized text always improves, but the readability of the fonts is almost always degraded to a “poor” level (on a 4K 32" display). For reading literature that typically uses a smaller font size, the degraded text causes a lot of eye strain. Is there a way to change this behavior?

EDITED: it seems that the degradation differs from page to page; some pages are more degraded than others.

Thanks in advance

See the attached example for the quality before and after the re-OCR:

The difference in quality between two consecutive pages:

My OCR setting:

Hi, I was just about to post exactly the same question. OCR really decreases image quality in my PDFs. And exactly like you, I already unchecked the “Compress PDF” checkbox, but I can’t see any difference. Looking forward to a solution or hints on where to set a better image quality for the OCR’d PDFs.

If “Compress PDF” is off, the ABBYY OCR will use the maximum quality setting, and if metadata is added or transferred, the PDF is saved without compression. Do you have a sample file that I can use to investigate this issue?

I just sent the before and after files to you via PM. Thanks.
BTW, this OCR quality issue doesn’t happen with just one particular file but with all re-OCR’d files, at least in my case.

Hi, I just sent you two other example files to examine via PM. Thanks! And unlike ngan’s case, it doesn’t happen to all files in my case.

You might want to check a few documents on a page-by-page basis. The text quality differs from page to page… FYI only.

Thanks for the files.

I can confirm that the issue appears after the ABBYY OCR stage. I have contacted their support and am awaiting their response.

Thanks. Looking forward to their reply.

Hi @aedwards, is there any news from ABBYY?

They have reproduced the issue and we are currently working through different options with them to improve the output quality of the image.

Thanks for the feedback!

I just tried FineReader 12. When you turn off MRC compression and choose High Quality, the image becomes more comfortable to look at (edges are blurred at higher magnification). But the file becomes 3 times larger.

You can send me your file; I’ll OCR it with these settings and send it back to you.

Instead of just a “compress PDF” toggle, could there be a slider to choose how much compression (sort of like JPG options)?

The request is noted, but no promises. :)

is solved