Japanese OCR

ABBYY has announced that FineReader, from 9.0 on, can handle Chinese and Japanese. Devon uses 8.x. Will 9.0 become part of Devon too, and will Devon then be able to handle Japanese? I can imagine that this would attract a lot of new customers in Japan if Devon is localized (is it already?). As I am into East Asian Studies, I would very much appreciate it too :wink:

While on Windows Asian languages are part of the engine, ABBYY charges extra for Asian languages as well as Arabic on the Mac. So for the moment we have no plans to add them, as it would mean raising the price of our application. But we will continue to incorporate all updates from ABBYY, and as soon as they add e.g. Asian languages to the package, DEVONthink Pro Office will automatically inherit this ability.

I need the KOREAN language here too… :smiley:

Hi

I was about to post the exact same topic and found that there are already requests for OCR for Asian languages.

Specifically, I am after OCR for Japanese. If Devon Technologies wants to take the lead, OCR for Japanese is a must. After switching from Windows to the Mac, I was amazed that there are no full versions of OCR for Japanese for the Mac. Since Devon Technologies is already selling software in Japan (at a much higher price, as you know), I bought DEVONthink Pro Office expecting Japanese OCR to become available sometime during the betas. This is because ABBYY offers Japanese and other Asian language OCR support for the updated DevonThink, as stated in the very first post.

Could you please, please, please, please x 1000 pleases, include Japanese OCR? It would make DevonThink a leader even in OCR for the Mac in Japan. DevonTechnologies, want to be a leader? Now is your chance.

Hoping to hear some good news! :smiley:

Darren McDonald
Tokyo

Hello

Any further word about adding Japanese OCR?

If it costs more to add Japanese OCR, why not provide a version of DEVONthink that offers Japanese (and other Asian languages) OCR?

While you are still in the beta testing stage, it would be nice to try it out and have this included. Otherwise, once it is released, my fear is that development will be put on hold and thoughts about Japanese OCR pushed far away.

Since DEVONtechnologies has such a global reach, I would hope you would not discriminate against non-English/European languages. Non-inclusion would not send a good message.

Looking forward to some response rather than the silence that has ensued about this matter.

Darren
Tokyo

Hmmm? Twiddling thumbs … :unamused:

The status is still as Eric noted earlier in this thread. We would be delighted if ABBYY were to extend language coverage under the Mac license agreement.

Me too: I’d be delighted if DEVONthink incorporated Asian (Chinese, in my case) OCR. I’d also be more than willing to pay a higher price.

This is, in my view, the only major feature missing so far. With Chinese OCR, DEVONthink would definitely become part of my daily workflow.

Here’s another user looking for Japanese OCR functionality. Just a heads up.

Although, I’m not sure how valuable such functionality would be, given that Devonthink’s AI isn’t all that useful for Japanese. Or at least, it hasn’t seemed to be in the examples I’ve thrown at it. Japanese has no spaces to separate words, and its verbs and adjectives are conjugated, so different forms of the same word are not classified as the same word. Both of these really trip up the software. It’s not impossible to make it work if you have a large source dictionary (like edict manages to, for instance), but right now I’m not sure if a Japanese OCR would be of much use to those of us looking to examine Japanese documents under the Devonthink looking glass.

Note: You should be able to use wildcards to search Japanese documents that don’t have spaces between words. Example: *word* will find the desired term.
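
To illustrate why the wildcards matter for text without spaces, here is a minimal Python sketch. It uses the standard library's `fnmatch` module as a stand-in for wildcard matching in general; it is not DEVONthink's search engine, and the sample sentence and the helper name `wildcard_match` are made up for the example.

```python
from fnmatch import fnmatchcase


def wildcard_match(text: str, pattern: str) -> bool:
    """Return True if the WHOLE text matches the shell-style wildcard pattern."""
    return fnmatchcase(text, pattern)


# A sentence with no spaces, so it behaves like one long "word".
sentence = "私は東京大学で日本語を勉強しています"

# Without wildcards the pattern has to equal the entire unspaced run.
print(wildcard_match(sentence, "日本語"))    # False

# With a wildcard on each side the term is found inside the run.
print(wildcard_match(sentence, "*日本語*"))  # True
```

The point is only that a term buried in an unspaced run of characters needs a wildcard on both sides to be treated as a match.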

Wildcard search seems to work sometimes… But the real problem is that Devonthink doesn’t classify Japanese words as words with any regularity. I have paychecks that contain OCR’d text that Adobe Reader recognizes as separate words, but Devonthink doesn’t, even when there are spaces in-between certain elements.

It can’t be that hard to distinguish between words, or at least make an effort, as I know a solitary individual who went and wrote a script that allowed him to make a frequency database from the contents of Japanese Wikipedia. But right now Devonthink just doesn’t have the ability to do much at all with Japanese text, which is sort of unfortunate, given that that was one of the primary reasons I invested in 2.0. (Don’t get me wrong -- it’s not the only reason, and Devon is great at a lot of other stuff I use it for, but my hope -- the ability to compare Japanese texts using the AI -- fell short.)
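
For anyone curious what the dictionary approach looks like, here is a rough Python sketch of greedy longest-match segmentation followed by a frequency count. It is not the script mentioned above and not how DEVONthink works; the tiny dictionary is invented for the example (a real one would come from something like edict), and it does nothing about conjugated forms.

```python
from collections import Counter


def segment(text, dictionary):
    """Greedy longest-match segmentation.

    At each position, take the longest dictionary word that starts there;
    characters not covered by the dictionary become one-character tokens.
    """
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += length
                break
    return tokens


# Hypothetical mini dictionary, for illustration only.
words = {"東京", "大学", "東京大学", "日本語", "勉強"}

print(segment("東京大学で日本語を勉強", words))
# ['東京大学', 'で', '日本語', 'を', '勉強']

# A frequency table is then just a count over the tokens.
freq = Counter(segment("日本語と日本語", words))
print(freq.most_common(1))
# [('日本語', 2)]
```

Greedy matching makes mistakes on ambiguous runs, which is why serious tools use a morphological analyzer, but even this much gives the software word-like units to count and compare.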

Any updates on OCR for CJK languages? I’d also be willing to pay extra for this - it’s the only thing missing!

thanks
/tt

If I buy the software on my own, will Devonthink take advantage of the Japanese OCR support? Is there some way to do this by updating the library you use?

any updates on CJK OCR? please?

I’m still waiting for Japanese OCR. Devonthink unfortunately doesn’t really get any use currently due to lack of a good OCR package.

Fortunately, I’ve been in contact with ABBYY. They were planning on releasing a new version of their Mac product in the second quarter of 2013, but that’s been pushed back to the last half of this year. I’m planning on using it for OCR and automating the import into Devonthink with something like Hazel.
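
As a rough idea of what the automation step could look like, here is a small Python sketch that a folder-watching tool could run. It is only an assumption about one possible setup, not a tested Hazel rule: both folder paths are hypothetical, and it assumes Devonthink is set up separately to pick up files from the destination folder.

```python
import shutil
from pathlib import Path


def move_ocr_output(ocr_dir, import_dir):
    """Move every PDF in ocr_dir into import_dir; return the names that were moved."""
    ocr_dir, import_dir = Path(ocr_dir), Path(import_dir)
    import_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for pdf in sorted(ocr_dir.glob("*.pdf")):
        target = import_dir / pdf.name
        if target.exists():
            continue  # skip rather than overwrite an existing file
        shutil.move(str(pdf), str(target))
        moved.append(pdf.name)
    return moved


# Example call with hypothetical folders:
# move_ocr_output(Path.home() / "OCR-Done", Path.home() / "Import-To-Devonthink")
```

Files that already exist in the destination are left alone, so running it repeatedly should be harmless.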

Hello Charleslaacz and DevonTechnologies,

Thanks for the information about ABBYY releasing a new version of their Mac product.

Did the response you got from ABBYY say that Japanese OCR support would be included in the new Mac version?

If this is the case, could you describe your workflow for importing from this package into DevonThink using Hazel?

I have also just sent an email enquiring about both the ABBYY customer software and the OCR engine provided to Mac developers.

If DEVONtechnologies is reading this post, they may be interested to know that even major developers of Windows software are about to release native Mac versions of their software.

A case in point is QSR International, THE leader in Qualitative Data Analysis software with their NVivo package. (NVivo is somewhat comparable to DevonThink in some features.) Last week, QSR showcased their Mac version at a prerelease event in Tokyo. QSR International have been rather astonished by the massive growth of the Mac in Japan. :open_mouth: The demand has been so great that they are just about to open an office in Tokyo! :slight_smile:

Here is a link to a preview of the native Mac version of NVivo.

nvivoformac.com

Anyway, looking forward to hearing from you Charleslaacz. :smiley:

Darren

Darren McDonald
Professor of Human Resource Management
Daito Bunka University
Japan

Hello again,

Just received the following reply from ABBYY. Now DEVONtechnologies have it straight from the horse’s mouth. The new OCR engine will include Asian languages, including Japanese, from November! :smiley:

Please DEVONtechnologies, will you now get to work instead of your users having to do your job for you?! :unamused:

Kindly advise when an update including the new OCR engine will be made available.


MESSAGE FROM ABBYY:

Thank you for having an interest in ABBYY OCR products.
We understand expansion of influence to Japanese market of Mac well, and have some plan of strengthened our application to OSX.
Unfortunately, application of Asian languages in FineReader Express Edition for Mac is not planned yet. However, we have a plan to release new version of FineReader11 Professional Edition for Mac that include Asian languages. That applied 185 languages including Japanese, and we schedule release in Q1 of next year.
And also, we have a plan to release FineReader Engine 11 for Mac on this November. That can recognize 198 languages applied to many Asian languages including Japanese.

Darren McDonald
Professor of Human Resource Management
Daito Bunka University
Japan

Just reviving this a bit - looks like Chinese/Japanese should have been included recently? Can anybody confirm that DTPO now OCRs these languages correctly, and allows indexing and search? Critical for my workflow…

I see the full version of FineReader supports Japanese. If it’s critical, you might consider using it:

http://finereader.abbyy.com/pro_for_mac/tech_specs/

I use third-party OCR all the time with DT and it’s not a problem.

Frederiko

Just jumping in to add another vote for CJK support, if it’s now covered by the ABBYY license for Mac. I’ve got a ton of scans in traditional and simplified Chinese (as well as scans of English documents containing Chinese characters), and it would be great to have a place to put them where I could count on them being more or less automatically processed. Using third-party apps works too, but adds a few steps to the process.