Migrating from Evernote - Re-OCR?

I actually did re-OCR mine, but I think you may be answering your own question here:

  • DT will often produce a smaller file, which means reduced quality (which might or might not be visible to you). The DT guys feel they have optimally balanced size and quality. YMMV.
  • I found the text recognition quality in the files I re-OCR'd to be better than in those I had simply imported. I can't remember whether I used ScanSnap Manager or Evernote to perform the OCR originally, but in any case it was done years ago, so a newer engine might well perform better.

You can set DT to move the original to the trash following OCR (Preferences/OCR -> Original Document: ☑ Move to Trash), so DT won't (visibly) leave an extra document behind for each OCR run. I recently used DT3b2 to OCR approx. 1800 PDFs, which it quietly did in the background with no apparent problems. One idea: if you need the extra space or feel the current OCR quality is not ideal, use DT to re-OCR. Otherwise, save the carbon dioxide ;)

A word of warning: there is a known bug in the ABBYY engine and/or DT3b1/2 which can lead to misinterpretation of rotation data. The page is then re-oriented, but some text is lost in the process (see the thread "Information after an OCR scan is lost - half of the document is missing"). Depending on the documents involved, you might want to wait for that particular bug to be fixed before re-OCR-ing.
