OCR - Unable to create PDF export file

I have a recurring error that seems to be due to a memory leak. I have indexed a directory with 9500 PDF files, of which ~950 have no text attached. I have a smart group that selects those files. When I select a small number, say 5, and OCR them to searchable PDF, all goes well the first time. However, when I repeat this, it eventually fails with the error shown above. Freeing memory between batches lets it run further, but the same error eventually recurs, until every OCR job fails. Quitting and restarting DT3 just brings me back to the beginning: the first conversion batches work, but eventually DT3 reports the same error again. I would like to select all the documents and do them as one batch.

Hardware: 2019 MBP, 16 GB RAM, 400 GB of 2 TB SSD available.
Software: DT 3.8 Pro Edition; deleted and reinstalled the ABBYY engine as suggested in this forum (for the licenses:0 error).

  • So the issue only occurs when doing large batches?
  • Are you running the trial edition (I’m guessing no but have to ask)?
  • Multiple small batches also cause the error. If I monitor memory, available memory decreases with the number of papers processed, whether they are in small batches or one large one.

  • Here’s the About window:

Also, these are all professional-level math/statistics papers, with lots of equations and Greek mixed in with text (e.g. on the same line).

Please hold the Option key and choose Help > Report bug to start a support ticket and attach a problematic PDF for us to test. Thanks!

Bill, which macOS are you using? I’m reading an increasing number of reports on memory leaks in Monterey (search, e.g., on DuckDuckGo.)

There is not one problematic paper; it’s the volume.
@Bluefrog:

  • If I start DT3 and do a single paper, it works even for large books (500 pages), even if that paper was the one that failed in a previous run.
  • If I start DT3 and do multiple single papers, one at a time, it eventually fails.
  • If I start DT3 and do a batch of, say, 10-15 papers, DT3 usually fails partway through the batch.

I can send you several papers if you wish.

@Blanc: I’m using Monterey 12.0.1. Thanks for the tip. I’ll check it out.

As suggested, I just filed a bug report (#699397) and attached a single paper.

The bug appears to be in the FineReader engine, and support has opened a ticket with them. In the meantime, OCR works if you run single OCR jobs, not batches. Hardly fun with 950 documents. It does give me a chance to add tags, though.

Thanks for reporting back :slight_smile:

I am having the same problem… Any update?

Do you have a support ticket open?