Odd PDF Behavior

Hello everyone,

I’ve noticed certain PDF documents of mine will not render correctly within DTPO, but will render fine in DTtG 2. More specifically, they’re documents I OCR’d using ABBYY FineReader Pro 12 and saved using MRC compression. It’s not the whole document either, just random pages, probably less than 1% of the pages in a document.

I run into the same problem when opening the documents in Preview and Skim. But they open fine in Adobe Acrobat Reader, Chrome, and Firefox.

On iOS, DTtG 2 and Adobe Acrobat are the only apps that render the documents correctly. All others, like GoodReader and PDF Expert, fail to render them correctly.

Anyone know what could be causing this?

Likely more problems from Apple’s broken PDFKit in Sierra, a break that has indeed affected ABBYY OCR’d documents - but usually destroying the text layer.

What does “not render correctly” mean? Have you tried DEVONthink’s OCR (ABBYY, but not the same)? If the documents also have problems sometimes in other apps then it’s very doubtful this is a DEVONthink issue. Have you contacted ABBYY and sent them before- and after-OCR samples?

Bluefrog,

That’s what I was thinking too. The problem seems to happen both on El Capitan and Sierra. I’m gonna see what older operating systems do with it. I think I have Lion on an old laptop somewhere. I assume DEVONthink uses PDFKit then?

Interestingly enough, I sent sample documents to ABBYY and they were able to open them in Preview without any problems. So that has me stumped! Every Mac I’ve tried them on (iMac, MacBook Pro, Mac mini, iMac at the Apple Store) wasn’t able to render them. I’ll include an image of their tests.

So either the problem is limited just to my own little world, which is ridiculous, or maybe the ABBYY people fixed their PDFKit on their Macs. I know there’s a way to transplant an older PDFKit into Sierra.

Korm,

I haven’t tried DT’s version of OCR yet. The problem lies in rendering anything with MRC compression. When I don’t use that setting, everything comes out fine. But I get a PDF about ten times larger as a result, which isn’t preferred.

I’m including an attachment for what I mean about the PDFs not rendering correctly. The left image is a JPEG, the right a PDF with OCR and MRC compression. Even though most of the text isn’t visible, the OCR text is still selectable.

So if DT uses Apple’s PDFKit, then that’s probably where the problem lies. Kind of annoying, but I can do a few workarounds until Apple hopefully fixes the issue.
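
In case it helps anyone sort out which files are affected, here is a rough Python sketch that counts image-related markers in a PDF's raw bytes. It rests on an assumption I haven't confirmed for FineReader specifically: that MRC output shows up as layered images, meaning a bitonal mask (JBIG2 or CCITT) combined with JPEG or JPEG 2000 layers. The filter and key names are from the PDF spec; `count_markers` and `scan_file` are just names I made up for this sketch.

```python
import re
from collections import Counter

# Filter and dictionary key names from the PDF spec. Assumption: MRC-style
# pages are built from mask layers plus compressed image layers, so these
# names tend to show up in files saved with MRC compression.
MARKERS = [
    b"/JBIG2Decode",
    b"/JPXDecode",
    b"/DCTDecode",
    b"/CCITTFaxDecode",
    b"/SMask",
    b"/ImageMask",
]


def count_markers(data: bytes) -> Counter:
    """Count how often each marker name appears in raw PDF bytes."""
    counts = Counter()
    for marker in MARKERS:
        # The lookahead avoids matching a longer name that merely starts
        # with the marker text.
        pattern = re.escape(marker) + rb"(?![A-Za-z0-9])"
        counts[marker.decode("ascii")] = len(re.findall(pattern, data))
    return counts


def scan_file(path: str) -> Counter:
    """Read a PDF from disk and count the markers in it."""
    with open(path, "rb") as handle:
        return count_markers(handle.read())
```

Running `scan_file` on a PDF saved with MRC and on the same scan saved without it should show different counts if the assumption holds. This is only a byte-level heuristic: it won't work on encrypted files, and it says nothing about why PDFKit fails on a given page.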

So I just opened up an old MacBook running Lion and tried to open the problematic PDFs…and they worked!

So it looks like PDFKit, at least in Lion, was working better than it is now in El Capitan and Sierra.

Not ridiculous at all. Computers are dynamic environments and it’s not unusual for one machine to have issues that aren’t reproducible on other machines.