OCR Helper causes DT3 to freeze

I began converting a scanned PDF into searchable text, but then realised the document had far more pages than I thought, so I decided to do it another time.

I clicked the X next to the activity in the bottom left of the main window and the conversion stopped.

When I tried to convert a smaller document I was then greeted by the spinning beachball and had to force quit DT3.

Going back into DT3 and trying to OCR another PDF gave me the beachball again.

In Activity Monitor, DTOCRHelper and DT3 were both marked in red, and the helper was using 98% of the CPU. After force quitting the helper, DT3 came back to life and I was able to continue working on some smaller files without getting the beachball again.

Which version of DEVONthink and of the helper (see ~/Library/Application Support/DEVONthink 3/Abbyy) do you use? How much RAM did they use according to the Activity Monitor? It’s possible that virtual memory caused the beach balling.

DT3 - 3.5.1
OCR Helper - 1.1.2
macOS - 10.15.6

This is what I tried just now:

  • Start OCR on a file
  • Cancel OCR on the file
  • Start OCR on another file
  • DT3 is stuck in ‘Adding document’
  • Quit DT3
  • In Activity Monitor, DTOCRHelper is still active and is consuming more and more memory (1.3GB+)
  • Machine shows 5GB used out of 8GB in Activity Monitor
  • Start DT3 again, DTOCRHelper is still running at this point
  • Start OCR on another file
  • DT3 becomes unresponsive in Activity Monitor (Beachball appears)
  • DTOCRHelper also unresponsive in Activity Monitor
  • Force quit both in Activity Monitor

In one test, if you wait long enough, DT3 and OCRHelper become responsive again.

In another test, if you quit DT3 and stay in Activity Monitor and wait long enough, eventually DTOCRHelper disappears. DT3 can then be started again without issue and the next OCR process works fine.

Might be a coincidence, but the size of the original PDF file seems to be a factor. It’s almost as though the OCR Helper becomes a zombie and keeps processing the file even after you cancel and quit DT3.

I’m also wondering whether, if you start DT3 while the OCR Helper is in this zombie state, things get completely stuck.
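
For anyone who wants to check for the lingering helper without opening Activity Monitor, here is a minimal Python sketch. It assumes the process is named DTOCRHelper, as shown in Activity Monitor above, and parses the output of the standard `ps` command. The function names `find_helper` and `helper_status` are made up for this example and are not part of DEVONthink.

```python
import subprocess

# Process name as reported in Activity Monitor in this thread (assumption).
HELPER_NAME = "DTOCRHelper"


def find_helper(ps_output, name=HELPER_NAME):
    """Return (pid, cpu_percent, rss_kb) for each process whose executable is `name`.

    `ps_output` is the text printed by `ps -axo pid,pcpu,rss,comm`.
    """
    matches = []
    for line in ps_output.splitlines()[1:]:  # first line is the column header
        parts = line.split(None, 3)  # the command path may contain spaces
        if len(parts) < 4:
            continue
        pid, cpu, rss, comm = parts
        # Compare only the last path component of the command.
        if comm.strip().rsplit("/", 1)[-1] == name:
            # Some locales print the CPU value with a decimal comma.
            matches.append((int(pid), float(cpu.replace(",", ".")), int(rss)))
    return matches


def helper_status():
    """Run `ps` and report any helper processes that are still alive."""
    result = subprocess.run(
        ["ps", "-axo", "pid,pcpu,rss,comm"],
        capture_output=True, text=True, check=True,
    )
    return find_helper(result.stdout)
```

Calling `helper_status()` after cancelling an OCR returns an empty list once the helper has exited, which, going by the tests above, is the point at which the next OCR runs normally.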

Can you send me a copy of the OCRLog.txt file that is located in the folder ~/Library/Application Support/DEVONthink 3/Abbyy?

Hi,

I only see two files in that folder:

  • DTOCRHelper (Application)
  • languages.plist (Property List)

Is the PDF document that you are trying to OCR sharable?

You can download this file, although it seems to affect any PDF file I’ve tried so far.

https://1drv.ms/b/s!Ah7fj9CZV0zyrgDzYNC5oQrpNkEI?e=JaEV3r

I think as long as you can cancel the process and then try to OCR something else while the helper is still busy, it will pause.

Likewise, if you cancel an OCR, exit DT3, go back into DT3 and try to OCR while the OCR Helper is still running in Activity Monitor, then I get the beachball.

Can you turn on OCR logging? To do that:

  • Quit DEVONthink
  • In Finder select the menu Go->Go to Folder, copy and paste the line below and press Go.
    ~/Library/Application Support/DEVONthink 3/Abbyy
  • Copy the following file to this folder: OCR.plist (274 Bytes)
  • Restart DEVONthink.

When you OCR a document and you encounter the issue, could you send me a copy of the OCRLog.txt file that will be generated in the ~/Library/Application Support/DEVONthink 3/Abbyy folder?
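
To confirm the logging steps above took effect, a small Python sketch like the following can report whether OCR.plist is in place and whether OCRLog.txt has been generated yet. The folder and file names are the ones given in this thread. The function name `logging_state` is hypothetical, and the sketch only checks that the files exist, not what they contain.

```python
from pathlib import Path

# Folder named in the instructions above.
ABBYY_DIR = Path.home() / "Library" / "Application Support" / "DEVONthink 3" / "Abbyy"


def logging_state(folder=ABBYY_DIR):
    """Report which of the two logging-related files exist in `folder`."""
    folder = Path(folder)
    return {
        name: (folder / name).is_file()
        for name in ("OCR.plist", "OCRLog.txt")
    }
```

Right after copying OCR.plist and restarting, one would expect `logging_state()` to show OCR.plist present and OCRLog.txt absent until an OCR has actually been run.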

Log file can be found here:

https://1drv.ms/u/s!Ah7fj9CZV0zyrgHW3Ksn_ULcRInF?e=V8mf4j

This was the sequence:

  • In DT3, start OCR of a larger PDF file
  • Abort the OCR process from DT3 (lower left panel)
  • Try and OCR another smaller document, DT3 says ‘Adding Document’ but nothing happens
  • Exit DT3
  • Load DT3
  • Try and OCR smaller document again, this time get beachball
  • In Activity Monitor, DTOCRHelper and DT3 are not responding
  • After a long wait, they both begin responding
  • DT3 starts to OCR the smaller document

From memory, the delay with the beachball is on par with the time it would have taken to OCR the first big document had I not tried to stop the activity.

This issue has been fixed and the fix will be included in the next update. In v1.1.2, an OCR is only cancelled at certain stages, and in a large document reaching one of those stages can take some time. In the next update, the OCR should be cancelled within a few seconds.
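
To illustrate why a cancel that is only honoured at certain stages can take so long on a large document, here is a toy Python simulation. It is not the helper's actual code: the function `ocr_document`, its parameters and the page counts are all invented for the example, and "recognising" a page is reduced to incrementing a counter.

```python
import threading


def ocr_document(page_count, cancel_event, pages_per_checkpoint, cancel_at_page=None):
    """Simulate OCR and return how many pages were processed before stopping.

    The cancel request is only looked at every `pages_per_checkpoint` pages.
    `cancel_at_page` simulates the user clicking the X at that page.
    """
    processed = 0
    for page in range(page_count):
        if cancel_at_page is not None and page == cancel_at_page:
            cancel_event.set()  # the user asks to cancel here
        # The request is only noticed at a checkpoint.
        if page % pages_per_checkpoint == 0 and cancel_event.is_set():
            break
        processed += 1  # stand-in for recognising one page
    return processed


# Checkpoint every 200 pages: work continues long after the cancel at page 10.
coarse = ocr_document(500, threading.Event(), 200, cancel_at_page=10)
# Checkpoint on every page: the cancel takes effect immediately.
fine = ocr_document(500, threading.Event(), 1, cancel_at_page=10)
```

In this toy model, `coarse` ends up at 200 pages while `fine` stops at 10, which mirrors the reported behaviour where the helper kept working for roughly as long as the full OCR would have taken.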

Thanks for the update, much appreciated.