Issue when batch OCRing large numbers of mixed PDF and image files with a rule

Hi there, I have been trying to migrate a load of my Evernote files into DT3. I've managed to get them in and clean them up using Bluefrog’s helpful rule. All good, thanks. Next I wanted to ensure any image files that come across are fully OCR'd. So, using the rule from another user, I successfully batched things up and it was all working fine. Halfway through, however, the system hung and I had to abort the rule. I tried looking at the file on which it was hanging. I applied OCR to it individually and it worked fine. I then started the rule again to see if I could get the remaining 70 docs processed, but it refuses to start. It is showing 70 left, but nothing happens. I have the activity monitor up all the time… plenty of activity when processing successfully, but nothing now it's stopped (unsurprisingly!). I've tried to get this to process (350ish docs) a few times, but it always seems to bomb out at some point. Does anyone have any suggestions why this may be? Here is the rule:

Screenshot 2023-01-21 at 00.28.02

Does restarting DEVONthink or rebooting the computer make a difference? Anything logged to Window > Log?

Thanks. Nothing in the log, but yes, a restart of DT3 cured it and the rule runs again. My concern is that I would like to batch process thousands of notes, and if I can’t work out why this is happening, I may end up coming back the next day to find that only some of the batch was processed before it bombed out. I have repeated the import of the same Evernote notebook a few times now and followed the same procedure, and the same thing has happened every time I've tried so far. Is there anything else I could try to get to the bottom of it? Without log details it’s difficult to know what is going wrong.

On the Evernote import point, I'm also noticing that some of my Evernote notes are coming across as formatted notes. Some, but not all, of the images contained within these notes are losing their aspect ratios upon import and thus appear distorted. I can manually resize each one when viewing it to approximately the correct ratio, but they still appear somewhat blurred when compared to the original in Evernote. Strange! Any ideas what may be going on here?

Many thanks again.

Actually, a bit more testing and resizing indicates that it's only the aspect ratio of the original image note that is being lost - the blurring is the same in the original and the imported copy, whether in EN or DT3… So are other people also experiencing loss of ratio info sometimes upon .enex import? Will loss of aspect ratio info affect OCR performance? Thanks!

A screenshot of the rule would be useful. In addition, do you use any additional rules?

Thanks, I assume you mean the Evernote cleanup rule (OCR one is at top of thread). In which case here it is:

Screenshot 2023-01-21 at 16.18.53

Screenshot 2023-01-21 at 16.20.51

Only other rule I use (on demand only) is:

Screenshot 2023-01-21 at 16.22.42

Thanks again.

The script most likely does not always work as expected, as it’s using parent 1. This might be either a group or a tag; the property location group is recommended instead.

Did you try importing ENEX files without performing OCR? Are the image quality & aspect ratio as expected in this case?

Thanks again. The script seems to work fine every time I've tried it - I’m not sure that would affect the aspect ratio issue. I've just tried creating a new Evernote notebook, transferring one of the offending files to it and then simply importing it manually into DT3 with no OCR (using Import > Files and Folders). It results in the same issue - the aspect ratio is lost!

To be honest, it is a low proportion of the formatted notes imported from the Evernote notebook that this loss of aspect ratio seems to affect; the rest are fine. I just can’t fathom why that would be the case. Maybe it’s at the Evernote end, and certain images have been stored in such a way that they lose the ratio when imported to DT3 (they are fine in EN). Is there any way to check this by looking at some metadata - either in DT3 or Evernote? Many thanks.
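On the metadata question, one possible check is to look inside the exported .enex file itself, since it is plain XML. The sketch below is only a rough illustration and rests on assumptions about the ENEX layout: that each note has a content element holding en-media tags with hash, width and height attributes, that each resource element carries base64 data plus its own width and height, and that the hash is the MD5 of the decoded data. The function name aspect_report and the 2% tolerance are made up for this example. It flags images whose display size in the note implies a different aspect ratio from the stored image size, which may or may not be what is causing the distortion seen here.

```python
import base64
import hashlib
import re
import xml.etree.ElementTree as ET


def _num(value):
    """Parse a pixel count such as '640' or '640px'; return None if unusable."""
    try:
        number = float(str(value).strip().replace("px", ""))
    except ValueError:
        return None
    return number if number > 0 else None


def _ratio(width, height):
    """Return width / height, or None when either side is missing."""
    w, h = _num(width), _num(height)
    return w / h if w and h else None


def aspect_report(enex_data, tolerance=0.02):
    """Compare stored vs. displayed aspect ratios for images in an ENEX export.

    enex_data is the file content (str or bytes). The element and attribute
    names used below are assumptions about the ENEX layout, not a verified spec.
    """
    root = ET.fromstring(enex_data)
    report = []
    for note in root.iter("note"):
        title = note.findtext("title", default="(untitled)")
        content = note.findtext("content", default="")

        # Display ratio requested by each <en-media> tag, keyed by its hash.
        shown = {}
        for tag in re.findall(r"<en-media\b[^>]*>", content):
            attrs = dict(re.findall(r'([\w-]+)="([^"]*)"', tag))
            if "hash" in attrs:
                shown[attrs["hash"].lower()] = _ratio(
                    attrs.get("width"), attrs.get("height")
                )

        # Ratio recorded for each embedded resource, matched by MD5 of its data.
        for res in note.iter("resource"):
            raw = base64.b64decode(res.findtext("data", default=""))
            digest = hashlib.md5(raw).hexdigest()
            stored = _ratio(res.findtext("width"), res.findtext("height"))
            displayed = shown.get(digest)
            if stored is None or displayed is None:
                continue  # nothing to compare for this resource
            report.append({
                "note": title,
                "hash": digest,
                "stored_ratio": stored,
                "displayed_ratio": displayed,
                "mismatch": abs(stored - displayed) / stored > tolerance,
            })
    return report
```

To try it, read the exported file as bytes, pass the result to aspect_report, and look at the entries where mismatch is true. If the affected notes show up there and the unaffected ones do not, that would suggest the ratio information differs inside the export itself rather than being changed on import.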

Files are actually imported without any modification. Any chance that you could send an example ENEX file to cgrunenberg - at - devon-technologies.com? Thank you!

Sorry for delay - busy week! Will do asap, thanks v much