Problem importing images with OCR

I have a folder with 24 scanned pages of a document (scanned a few years ago with a non-ScanSnap scanner). These are JPG images, and when I open them in Preview, they look fine.

However, when I import them into DTPO (File > Import > Images (with OCR)), they show up rotated 90° to the left. They are labeled as Kind = PDF+text. When I convert them to plain text (Data > Convert > to plain text), the “plain text” is complete garbage.

To fix this, I opened the original JPG in Preview, rotated it through 360°, and saved it. Importing it into DTPO then resulted in a properly aligned image, and the conversion went as expected.

What do I have to do - go rotate all my images (unnecessarily) through 360° before I can import them? This is absurd!

Thanks for any helpful suggestions, folks!

When it comes to using computer software to determine graphical aspects of images, nothing is absurd. The next release has AppleScript support to force the OCR engine to accept certain rotations. This will not make it into the UI. There, I said it…

yeah but…

If my original image file is in portrait orientation when I view it with Preview, why on earth would it get stuck in landscape mode after being imported into DTPO? If it’s not absurd, then at least it’s not rational. If an image is longer than it is wide, I don’t think it’s a stretch of credibility to expect it to remain longer than wide after being imported.

Well, that’s “artificial intelligence” for you. For some reason the Abbyy software thinks you’ve got it wrong.

As always in the case of oddities like this, sending an example to support@devon-technologies.com is the best way to go.

Apple may be in part to blame. It is possible to rotate an image in iPhoto, view it in Preview, and then have it un-rotate itself when uploaded to a web page. (Note the lack of DevonTech involvement in the chain.) Under some circumstances, Apple will apparently store the “rotate” command in non-standard metadata, causing it to be lost when exported to a non-Apple environment.

Katherine

Ditto. I’ve seen this behavior before myself.

I think that Apple stores it in the proper EXIF parameter and other programs don’t bother to read that at all.
However, in this case that has nothing to do with it. The Abbyy software automatically tries to determine the “proper” orientation based on whatever magic they use to determine this. I’ve only seen one other report where this failed, but I couldn’t reproduce it.
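
For anyone who wants to rule the EXIF side in or out before importing, here is a minimal Python sketch (standard library only) that reads the Orientation tag (0x0112) from a JPEG. The names exif_orientation and ORIENTATIONS are made up for this example, and it has nothing to do with how DTPO or the Abbyy engine decide on rotation. It only handles the common case of an Exif APP1 segment that sits ahead of the image data.

```python
import struct

# Common values of the standard EXIF Orientation tag (0x0112).
# Values 2, 4, 5 and 7 are mirrored variants and are omitted here.
ORIENTATIONS = {
    1: "upright, no rotation needed",
    3: "rotate 180 degrees to display",
    6: "rotate 90 degrees clockwise to display",
    8: "rotate 90 degrees counterclockwise to display",
}


def _orientation_from_tiff(tiff):
    """Scan the first IFD of a TIFF block for the Orientation tag."""
    if len(tiff) < 8:
        return None
    if tiff[:2] == b"II":
        endian = "<"          # little-endian ("Intel") byte order
    elif tiff[:2] == b"MM":
        endian = ">"          # big-endian ("Motorola") byte order
    else:
        return None
    (ifd_offset,) = struct.unpack(endian + "I", tiff[4:8])
    if ifd_offset + 2 > len(tiff):
        return None
    (count,) = struct.unpack(endian + "H", tiff[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i      # each IFD entry is 12 bytes
        entry = tiff[start:start + 12]
        if len(entry) < 12:
            return None
        (tag,) = struct.unpack(endian + "H", entry[:2])
        if tag == 0x0112:
            # A SHORT value is stored in the first two bytes of the value field.
            (value,) = struct.unpack(endian + "H", entry[8:10])
            return value
    return None


def exif_orientation(data):
    """Return the EXIF Orientation value from JPEG bytes, or None if absent."""
    if data[:2] != b"\xff\xd8":              # no JPEG start-of-image marker
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):           # image data or end of file reached
            return None
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segment = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and segment[:6] == b"Exif\x00\x00":
            return _orientation_from_tiff(segment[6:])
        pos += 2 + length                    # skip to the next segment
    return None


# Example use (the path is a placeholder):
#   with open("page01.jpg", "rb") as f:
#       value = exif_orientation(f.read())
#   print(value, ORIENTATIONS.get(value, "no tag or unlisted value"))
```

A result of 6 means the stored pixels are landscape and the viewer is expected to turn them 90° clockwise, so software that ignores the tag would show the page turned 90° to the left. A result of None or 1 means the file carries no rotation hint, and any rotation on import comes from somewhere else.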

@ Annard: I’ll shoot off an example image later today. I apologize for writing as if it’s all DT’s fault.

@ Katherine: Good points and observations - and thanks for pointing out the lack of DT involvement. I don’t use iPhoto (or much of any “photo” software) so I’ve never noticed this behavior outside of DTPO.

At any rate, I’ve done my workaround and finished that part of my project. Thanks to all for comments and insights.

[still mumbling about fumbling 49ers…]

Well shoot, Red Ryder…

The same thing happened about an hour ago with another 24 pages of JPG scans that were “portrait” when opened in Preview but turned 90° to the left when imported (with OCR) into my DTPO database.

Not wanting to compress and send my 2GB database, I created a new one with the idea that I’d import three of the offending pages into it. To do this I selected three pages and moved them to the Desktop for easier access. I also created a zip file of them to attach to the Support email.

I nearly fell off my sturdy chair when the process of importing these three pages succeeded - they didn’t get imported turned around! So I created a new folder on the Desktop and copied the 24 pages into it, then imported these copies. Again I was amazed to see that they were correctly imported. So much for sending supporting evidence to Devon Tech.

I can only suppose that, since the original scans were done over two years ago and have been lying moribund in a folder deep in my Finder’s hierarchy, odd bits of cosmic particles flipped some bits here and there.

So who knows, things are at least workable again. No email going to Support at this time.