Question about PDF compression

A raw PDF of about nn MB at 300 DPI from a ScanSnap iX1500, sent to ABBYY FineReader PDF with MRC, comes out at 380 KB, while sending it to DT 3.9 with compression enabled and the resolution reduced to 200 DPI results in 800 KB. In ABBYY I can choose the picture compression in three levels and enable MRC (which requires OCR) separately, while in DT I can only enable compression and set the DPI. So I am not sure whether MRC is used with the OEM library. If it is included, could DT make these options similarly controllable from the settings, please?

May I ask why you’d need three levels of compression as well as MRC?

Please define what “MRC” is. Thank you!

It’s Mixed Raster Content, a specialized type of compression ABBYY can use on some PDFs.

I intend to digitize all my paper-based documents and develop a mostly paperless personal workflow, and I would like to keep those documents as small as possible while maintaining acceptable quality, which I get with medium picture compression and MRC. That choice fits for me, although for some documents I might accept a larger size. For other users another choice might fit better. As a comparison, a letter with some colored imprint is somewhere above 800 KB at 200 DPI within DT, but 380 KB at 300 DPI with MRC and medium picture compression in standalone ABBYY. So I wondered: if you use the same library, why not open these choices to your users?

In another post I wondered why the imprinter adds another 100 KB for a small line of text like ‘Scanned: dd.mm.yyyy’. For me the size of documents still matters, and I am really missing a single professional PDF and OCR solution on macOS. I first hoped that a ScanSnap and DT would be sufficient, but on several occasions I found that you sometimes need PDF Expert, sometimes PDFpenPro or standalone ABBYY to get the desired result, which makes a workflow much harder to automate. This is why I ask for some of these options to be included in DT.

I hope this answers your question sufficiently?
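To put the figures quoted above in perspective, here is a rough back-of-the-envelope sketch. It only uses the two sizes and resolutions from this thread (380 KB at 300 DPI, 800 KB at 200 DPI); the document count is a purely hypothetical assumption, and real files will vary with content.

```python
def pixel_ratio(dpi_a: int, dpi_b: int) -> float:
    """Ratio of pixel counts for the same page scanned at dpi_a versus dpi_b."""
    return (dpi_a / dpi_b) ** 2


def archive_size_mb(per_doc_kb: float, doc_count: int) -> float:
    """Total size in MB (1 MB = 1000 KB) for doc_count documents of per_doc_kb each."""
    return per_doc_kb * doc_count / 1000


ABBYY_KB = 380    # 300 DPI, MRC + medium picture compression (from the thread)
DT_KB = 800       # 200 DPI, compression enabled (from the thread)
DOC_COUNT = 5000  # hypothetical number of documents, an assumption for illustration

# The 300 DPI scan holds 2.25 times as many pixels as the 200 DPI one.
more_pixels = pixel_ratio(300, 200)

# Despite holding fewer pixels, the DT file is roughly twice as large.
size_factor = DT_KB / ABBYY_KB

# Projected difference over the hypothetical archive.
saved_mb = archive_size_mb(DT_KB - ABBYY_KB, DOC_COUNT)

print(f"pixel ratio 300 vs 200 DPI: {more_pixels:.2f}")
print(f"DT file is {size_factor:.2f}x the ABBYY file")
print(f"difference over {DOC_COUNT} documents: {saved_mb:.0f} MB")
```

In other words, the smaller file also carries the higher resolution, which is why the difference adds up over a large archive.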

Adding options to already crowded controls and preferences isn’t always an optimal idea. Development would have to assess the request.

I’m not a DEVONthink developer and can’t speak for them, but as a software developer of other things, I can answer in general terms why a given software tool might not provide access to all the controls and settings available in a software library it uses.

It often seems like it shouldn’t be too hard to add a few fields or sliders or buttons to a preference panel. Unfortunately, the “cost” of implementing those interfaces includes more than just the one-time developer effort to (e.g.) create GUI elements in a preferences panel. It also covers things like debugging the additional code and behavior, writing test cases for every new feature, continually updating the field/slider/button interfaces as the library is updated (a task made worse when the library changes its interface), documenting the user interface (and keeping the docs updated over time), answering user support questions, and so on. Product developers have to balance those costs against hard-to-predict gains in (say) number of users, or they have to count on increasing the price of the software to pay for developing and maintaining those features.

Perhaps you already know all these things; I hope this doesn’t come across as lecturing, because I don’t mean it that way. But the question “why doesn’t software X have feature Y” comes up a lot on these forums, and I feel enough sympathy for DEVONtechnologies to want to say something about how complicated it is to do things like expose more features of a 3rd-party library.

The ABBYY engine available to third-party developers neither offers the same features as the standalone app nor is it updated at the same time.

Which engine are you currently using?

11.x, and as Criss mentioned, there isn’t parity in features between what developers can use and what ABBYY offers in its standalone application, so it’s not a 1:1 comparison. Also, the Mac version always lags behind the Windows version, e.g., v16 of the FineReader app for Windows is available while Mac is still at v15. They do this with their developer frameworks too.

I verified with ABBYY support that MRC has been available on macOS (Intel) since engine 10. So this is about exposing a few more options in order to use this feature, and I kindly ask you to consider exposing them. It would ease the workflow of scanning with a ScanSnap via ScanSnap Home directly into DT, allowing metadata to be entered and the file to be compressed with MRC right away, and it would reduce the footprint of the thousands of documents still to be scanned.

Excerpt from the engine 11 API:

Please note that depending on the chosen scenario some inner export settings can change. This will have influence on the value of the next PDFExportParams properties:

The interesting question would be: which of these options is available to third party licensees and at which cost?

I put that question to ABBYY support; let’s see what they come up with.
Fundamentally, of course, it depends on the contract between ABBYY and DT and can only be answered reliably between them.

The more interesting question to me is a technical one: can I set those options somewhere in the OCR helper app, in some settings file, to check whether they would work in general, even though they are not yet exposed in the DT GUI?

This is ABBYY FineReader PDF and the options it exposes, which are not much more complicated for compression. While DT only allows compression true/false, i.e. on/off, ABBYY lets you specify three picture quality grades (low, medium, high) and, in addition, whether MRC should be used (only in combination with OCR, as it depends on OCR).

The PDF format options in yellow are not that important to me but might be to others. Since we are discussing exposing some engine options in the DT GUI, they might be considered as well.