I am trying to use OCR to convert an image-based PDF into a searchable PDF. In the settings, I unchecked “Compress PDF” and set DPI to 300. The original file is 14 MB; after conversion it is only 4 MB and the image inside is a little blurry.
I don’t want any quality loss, I just want to make the file searchable. How can I do that?
Thanks for sending the file. It looks like there are some additional artefacts around the text that could have been introduced during the page extraction. I have added a change that should reduce this.
But why does the image quality seem to drop a little after conversion (file size from 12 MB → 4 MB)? Is it possible to keep the image quality and only extract the text?
The original file was generated with macOS PDFKit. The OCR’d file is smaller because ABBYY OCR applies significantly better compression than PDFKit.
ABBYY will always apply some compression. If the “Compress PDF” option is off, this relates to the final PDF size in two ways:
If metadata is added or transferred from the original file, the saved file will not be compressed.
When ABBYY OCR generates the PDF, priority is given to quality over size; however, it is ABBYY that determines the actual amount of compression applied to the final file.
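For anyone who wants to see what changed rather than guess from file size alone, here is a minimal sketch (not part of the app or of ABBYY) that counts the stream compression filters named in a PDF. It uses only the Python standard library and scans the raw bytes for the standard PDF filter names, so treat the counts as a rough indication: they cover all streams, not only images, and a name could in principle appear by chance inside binary data. The function names and the filenames `original.pdf` and `ocr.pdf` are placeholders.

```python
import re
from collections import Counter

# Standard stream filter names defined by the PDF specification.
FILTER_NAMES = (
    b"DCTDecode",        # JPEG (lossy)
    b"JPXDecode",        # JPEG 2000
    b"JBIG2Decode",      # bi-level images
    b"CCITTFaxDecode",   # bi-level fax encoding
    b"FlateDecode",      # zlib/deflate (lossless)
    b"LZWDecode",        # LZW (lossless)
    b"RunLengthDecode",  # run-length (lossless)
)
_FILTER_RE = re.compile(rb"/(" + b"|".join(FILTER_NAMES) + rb")\b")


def count_filters(pdf_bytes: bytes) -> Counter:
    """Count how often each filter name appears in the raw PDF bytes."""
    return Counter(m.group(1).decode("ascii") for m in _FILTER_RE.finditer(pdf_bytes))


def size_ratio(original_size: int, converted_size: int) -> float:
    """Return the converted size as a fraction of the original size."""
    return converted_size / original_size


def report(path: str) -> None:
    """Print the file size and filter counts for one PDF (hypothetical helper)."""
    with open(path, "rb") as f:
        data = f.read()
    print(f"{path}: {len(data) / 1_000_000:.1f} MB")
    for name, count in count_filters(data).most_common():
        print(f"  {name}: {count}")
```

Calling `report("original.pdf")` and `report("ocr.pdf")` and comparing the two lists shows whether the converted file relies on a lossy filter such as `DCTDecode` where the original used a lossless one such as `FlateDecode`. That would account for both the smaller size and the softer image.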