IRIS 50 page limit

I just learned that there is a 50-page limit when importing PDF documents with text recognition. That came as quite a surprise! :frowning:

It limits the usefulness of DEVONthink Pro Office significantly. Many of the documents I intend to OCR are around 150 to 200 pages.

Is there a way around it? I could imagine a hack using a script. However, I would prefer a more elegant solution. Would it be possible to have a Pro-Pro version that does not have the 50-page limit?

Best regards,

Hi, Markus. That’s a restriction of the OCR engine license from IRIS. They have the same 50-page restriction on their ReadIRIS program, but sell another version with no page limit for hundreds of dollars more. If DEVONtechnologies considers your suggestion for a no-page-limit version of the OCR engine, IRIS would indeed raise the license fee, making DTPO more expensive with that option.

When using my ScanSnap it’s difficult to stack more than 50 pages into the sheet feeder, so the need to split up long documents tends to get imposed for that reason, anyway.

I agree that the 50-page limit is inconvenient when OCR’ing a long PDF image-only file. One must split it into segments and then recombine them.
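For anyone scripting the split, here is a rough sketch of the arithmetic only: given a page count, it works out which page ranges to cut so that each segment stays within the limit. The function name `chunk_ranges` and its `limit` parameter are made up for illustration; this is not part of DTPO or ReadIRIS, and it doesn't touch the PDF itself. You would still feed these ranges to whatever tool you use to do the actual splitting.

```python
def chunk_ranges(total_pages, limit=50):
    """Return 1-based, inclusive (first, last) page ranges of at most `limit` pages.

    Hypothetical helper for planning the split; it does not read or write PDFs.
    """
    if total_pages < 0 or limit < 1:
        raise ValueError("total_pages must be >= 0 and limit must be >= 1")
    return [
        (first, min(first + limit - 1, total_pages))
        for first in range(1, total_pages + 1, limit)
    ]


if __name__ == "__main__":
    # Example: a 180-page scan split for a 50-page OCR limit.
    for first, last in chunk_ranges(180):
        print(f"pages {first}-{last}")
```

A 180-page document comes out as four segments (three full ones and a 30-page remainder), which are OCR'd separately and then recombined in the same order.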

The Extras folder on the DTPO download image does contain a workflow that can be used to reassemble the segments of a long PDF. I expect we’ll see some ingenious users develop automated procedures for splitting and later recombining long PDFs.

A few of us were surprised by this limit but it’s not imposed by the folks at Devon.

I didn’t even realize that the full version of ReadIRIS Pro that I purchased has this limitation (until recently).
I would have never purchased ReadIRIS had I known.

So, we actually have the capability of the full version of IRIS Pro built in to DTPO.

I also own a full version of Adobe Acrobat. When I run up against the 50 pg limit, I use “open with” to open the file from DTPO in Acrobat, OCR it and save it to my HD.
I then re-import it to DTPO and delete the original. Not as elegant as an all-in-one solution but still very workable.

The alternatives seem to be the full version of Adobe Acrobat, perhaps OmniPage Pro, or splitting the document into 50-page chunks before using the OCR function.

Hi, milhouse. I think I’ve got every OCR application for Mac that’s been released. Most of the early ones were really awful.

My favorite OCR app in OS 9 days was ABBYY FineReader Pro, which had very good accuracy for the times. They aren’t selling a version for OS X, as of the last time I checked a couple of months ago.

I’ve got OmniPage Pro for OS X, although I just checked and found that I haven’t opened it since August, 2005, when I got ReadIRIS 9. I hadn’t been impressed with the accuracy of the OmniPage Pro version I was using. OmniPage Pro for OS X is currently priced at $499.95. To be fair, perhaps they’ve improved it since I stopped using it.

No OCR program available at the consumer level (or any other level, for that matter) is 100% accurate. That’s why I prefer the PDF+Text conversion, which has a built-in accuracy check – the 100% accurate image layer.

If I were to purchase the expensive, no-page-limit version of IRIS, could DTPro be set to utilize that engine instead?

No, but you could set up a workflow through AppleScript to import the results in DEVONthink Pro (Office). I doubt that they support Automator.

John, that page limit no longer exists in DT Pro Office.