A fix for which: the OCR inflating file size, or annotations increasing PDF size by 10x?
I haven’t personally seen an annotation increase a PDF’s file size by 10x… and other than this years-old thread, I’ve seen no one else complain about it. Have you been seeing this regularly?
As for OCR, it is well-documented on this forum that the ABBYY OCR engine in DTPO increases file size rather dramatically unless you apply fairly aggressive compression (with all of the side effects that come with that). While all OCR (other than Adobe’s ClearText) will increase the file size of a PDF, sometimes rather dramatically, the ABBYY engine used here seems to be on the high end of this spectrum, for better or worse.
I tend not to use the OCR feature in DTPO for this reason… While I prefer Adobe ClearText, I can’t be bothered to buy a full Acrobat subscription, so I use PDFpenPro for larger things. It isn’t as accurate as DTPO and does increase the file size rather dramatically, but not quite as badly (though I’ll use DTPO if PDFpenPro does a bad job, which is sometimes the case).
If it’s just a receipt or some other small document, I scan it with Scanner Pro, which has OCR built in and keeps file size fairly low. It is likely considerably less accurate than what DTPO/ABBYY is doing, but that isn’t a huge deal on a well-scanned receipt, which isn’t too challenging from an OCR standpoint.
Thanks Jim and Scott for the quick replies. I ran into precisely this issue a couple of times when annotating files I had OCR’d with ABBYY FineReader Pro 12 prior to indexing them with DTPO. I was aware of the OCR issues within DTPO, but had not thought they would impact annotations to this extent. The other day, I made a couple of annotations on a 20 MB file and it would not stop saving until it had grown to 1 GB. And yes, I applied a generous resolution to the PDF, which meant less compression. If I understand correctly, DTPO saves any changes to the PDF in a way that does not match the OCR settings of the document, regardless of where you performed the OCR. It seems I’ll have to open and annotate PDF files externally.
The OCR issue (not really an issue, just the reality of OCR that isn’t ClearText) is unrelated to the annotation issue.
I imagine a support ticket with an example PDF that reliably reproduces this could help them out. I’ve personally never seen this happen, and I just tried to reproduce it with a couple of PDFs and couldn’t, so it’s hard to say whether this is a very isolated issue or potentially widespread.
It is possible that it is linked to issues with PDFKit in Sierra, which would be an Apple/macOS issue (and it wouldn’t be the first time that Apple totally borked PDFKit).
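If you do file a ticket, it may help to put actual numbers on the growth. Here is a minimal Python sketch, assuming you keep a copy of the PDF from before annotating; the function names and both file paths are hypothetical placeholders and have nothing to do with DTPO itself.

```python
import os


def growth_factor(before_bytes: int, after_bytes: int) -> float:
    """Return how many times larger the file became after the change."""
    if before_bytes <= 0:
        raise ValueError("original size must be positive")
    return after_bytes / before_bytes


def describe_growth(original_path: str, annotated_path: str) -> str:
    """Summarise the size change between two files on disk.

    Both paths are placeholders: point them at the untouched copy
    and at the annotated copy of the same PDF.
    """
    before = os.path.getsize(original_path)
    after = os.path.getsize(annotated_path)
    factor = growth_factor(before, after)
    return f"{before} bytes -> {after} bytes ({factor:.1f}x)"
```

For the case described earlier in the thread, `growth_factor(20 * 1024**2, 1024**3)` works out to roughly 51, so a 20 MB file ballooning to 1 GB is closer to 50x than 10x.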