PDF size blown up after OCR?

Hi,

I am new to DT and am testing the software. If I scan (Fujitsu ScanSnap) directly to DT, the PDFs get recognised (up to the test version's limit of 20 a day). Today, when I ran OCR in DT on the PDFs that were not recognised yesterday, I saw a considerable increase in PDF size after OCR. A small receipt of 74 KB was blown up to 1 MB after OCR. Is this normal?

Best regards

SmartDust

Yes, as has been discussed in different threads on this forum. The way we set up the OCR process, the original image gets converted to a colour image, hence the size increase. And before you ask: no, we're not planning to customise it. You could use an Automator workflow to decrease the size; there is an example of how to do this on the disk image.

When you OCR an existing PDF, the original image is not retained; a new one is created instead. The resulting PDF will generally be larger.

DTPO Preferences > OCR provides user options for the dpi and image quality of this new image. The default settings are 150 dpi and 50%. These settings are a compromise that reduces file size while keeping the PDF files readable and printable.
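To give a rough feel for why the dpi setting and the conversion to colour matter so much, here is a back-of-the-envelope sketch in Python. It only estimates the raw, uncompressed pixel data of a page image; the function name is made up for illustration and has nothing to do with DT itself. Real PDF sizes will be far smaller because the image is compressed (that is what the image quality setting controls), so treat the numbers as relative, not as predictions.

```python
def raw_image_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the uncompressed size in bytes of a scanned page image.

    Hypothetical helper for illustration only; it ignores compression,
    so it shows relative growth rather than actual PDF file sizes.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# An 8.5 x 11 inch page as 1-bit black & white and as 24-bit colour.
bw_150 = raw_image_bytes(8.5, 11, 150, 1)
colour_150 = raw_image_bytes(8.5, 11, 150, 24)
colour_300 = raw_image_bytes(8.5, 11, 300, 24)

print(f"B&W    @ 150 dpi: {bw_150:>12,} bytes")
print(f"Colour @ 150 dpi: {colour_150:>12,} bytes")
print(f"Colour @ 300 dpi: {colour_300:>12,} bytes")
```

Before compression, the colour image holds about 24 times as much data as the 1-bit one at the same dpi, and doubling the dpi quadruples the pixel count. So a raised dpi setting combined with the colour conversion could plausibly account for a large jump in file size.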

Most of my ScanSnap scans are done in black & white using the ‘Best’ image quality setting. Searchable PDFs average about 150 to 200 KB per 8.5 × 11 inch page, depending on content. Color scans will be larger.

I would not expect a PDF size of 1 MB for a small receipt; perhaps 100 KB would be more typical. Did you change those Preferences > OCR settings?

Bill,

Thanks for the reply… you’ve pointed me in the right direction. I had fiddled with the dpi settings. I will reset them to the defaults and see what happens.

Best Regards

SmartDust