Want the OCR of DT3 & DTTG3 to come back

Summary

| View by → | DT3 | DT4 & Preview | DTTG4 | PDF Expert |
|---|---|---|---|---|
| OCR by DT3 | Good | Good | Bad | Good |
| OCR by DT4 | Good | Bad (incl. PDFGear, PDF Reader Pro) | Bad | Good |
| OCR by DTTG3[1] | Good | Good | Bad | Good |
| OCR by DTTG4[1] | Good | Good | Bad | Good |
| OCR by PDFpen | Good | Good | Good | Good |

I compared how well different software performs OCR on Chinese text. Overall, the OCR results of the version 3 apps are very good, while both the OCR results and the viewing experience of the version 4 apps are very poor, especially in DTTG4.

The OCR itself (OCR Layer) might be good, but the viewing experience is not satisfactory.

It feels like ABBYY has not been adapted to the new Apple operating systems.

By the way, DT 3 is installed on macOS 11.

The specific situation is as follows:

OCR by PDFpen

PDFpenPro 12.2.3 Copyright © 2003-2021 (the version from five years ago)
OCR Layer✅

OCR by PDFpen, Review by DTTG 4✅

OCR by DT4

OCR Layer✅

OCR by DT4, Review by Preview & DT4❌

OCR by DT4, Review by PDF Expert✅

OCR by DT4, Review by DTTG 4❌

OCR by DT4, Review by DT3✅

OCR by DTTG4

OCR Layer✅

OCR by DT3

OCR Layer✅

OCR by DT3, Review by DT4✅

OCR by DT3, Review by DTTG4❌

OCR by DTTG3

OCR Layer✅ (OCR Setting: English + Chinese)

OCR Layer❌ (OCR Setting: Chinese) (same in DTTG4)

DTTG 3 and 4 share a problem: after OCR completes, the whole page becomes larger (extra white area).


  1. With only Chinese enabled in the DTTG OCR settings, the results are very poor. With English + Chinese enabled, the results improve greatly.

The OCR engine of DEVONthink 3 & 4 is the same. Did you use the same settings and the same macOS version for testing?

Can I install DT4 & 3 on macOS 26 at the same time now? Will it affect the database or cause issues?
I’m currently using 4.2.2.

I would recommend installing DEVONthink 3 in the Applications folder of a dedicated user account.

I created a new user and was about to install DT3, but then stopped :sweat_smile:. Let’s not take this risk.
At the very least, it is certain that Chinese PDFs OCRed by DT4 cannot be properly annotated in either DTTG4 or DT4.

Did this really happen after upgrading to DEVONthink 4 or after upgrading to macOS 26? Tahoe is not known for being rock solid.

Just to clarify: The previous comment wasn’t suggesting a methodology for your testing. It was giving you a recommendation on appropriately separating version 3 and 4.

I don’t quite remember, because DT3, DT4, DTTG4 and macOS 26 came out at different times, and I didn’t use DT’s OCR function much before.

At first, I found that Chinese PDFs OCRed with DT3 couldn’t be annotated by DT3 itself unless ForceEditablePDFs was enabled (I reported this issue a few years ago). Therefore, in the past, I mainly used PDFpen 12 for OCR.

After buying the M4 Mac last year, I found that PDFpen 12 couldn’t be installed normally on the new machine, so I had to consider alternatives. The new version of Nitro PDF is too expensive.

Later, I spent $70+ on PDF Reader Pro and used its 3.x version for OCR. DT could annotate the resulting files, but this stopped working after upgrading to 4.x.

At this point, I no longer wanted to spend money on this hassle. Therefore, I enabled ForceEditablePDFs, but found that its compatibility with Chinese in version 4 (DT & DTTG) was too poor. DTTG 4 initially couldn’t recognize Chinese properly. After my feedback, you made significant improvements and it can now perform OCR normally (thank you very much), but annotation still doesn’t work properly at present.

Thank you for the clarification.
Yes, I understand that there are too many variables when testing on different macOS versions. It would be best to test on the latest system to control the variables.

I have simply never used multiple user accounts, and I’m worried that there will be unexpected problems, so I stopped the test. If there is a new system in the future, I will try to install DT3 first and uninstall it after testing.

Could you share a document (before/after OCR) and a screenshot of your OCR settings? Thanks in advance!

The conclusion is the same as the Summary above.
Special notes:

  1. DT4 performs better on documents with uniform fonts (such as the standard files provided). If there are significant differences in font, boldness, italics, etc. on the same page, the results are very poor (see the parts highlighted in yellow).
  2. DTTG4 can only annotate English text (marked with the note “HL by DTTG4”).
  3. The file size becomes very large after DTTG4 OCR (2 times and 10 times larger).

The files are quite large in total (34 MB) and cannot be attached here. Is there another way I can send them to you?