All these years I’ve been using DTPO’s built-in OCR technology… and wasting tons of hard drive space.
I scanned the same three-page document twice, once using DTPO’s built-in OCR and once using ScanSnap’s OCR. The difference?
DTPO: 4.7 MB
ScanSnap: 431.2 KB
ScanSnap seems to have recognized words better, too: if I search for a word that shows up under a stamp, it finds the ScanSnap document but not the DTPO one.
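For anyone who wants to put a number on that gap, here is a quick sketch of the arithmetic. It assumes decimal units (1 MB = 1000 KB), and the function and variable names are just made up for illustration.

```python
def size_reduction(larger: float, smaller: float) -> float:
    """Return the fraction by which `smaller` undercuts `larger` (same units)."""
    return 1.0 - smaller / larger


# Sizes quoted in the post, converted to KB.
# Assumption: decimal units, i.e. 1 MB = 1000 KB.
dtpo_kb = 4.7 * 1000
scansnap_kb = 431.2

reduction = size_reduction(dtpo_kb, scansnap_kb)
ratio = dtpo_kb / scansnap_kb

print(f"ScanSnap file is {reduction:.1%} smaller (DTPO file is {ratio:.1f}x larger)")
```

If the sizes were reported in binary units instead (1 MB = 1024 KB), the result shifts only slightly, so the conclusion is the same either way.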
Without commenting on the OP’s settings, I have to say that the standalone ABBYY FineReader program does a MUCH better job than the built-in DT OCR engine (in terms of file size, not quality). The file size difference is sometimes as great as 80%. I’ve always been pleased with the quality of the DT OCR engine, but if hard drive space is a concern, you may want to look into the standalone program.
Sorry, I meant that I already have a non-searchable PDF and want to use ScanSnap to OCR it, because it yields much smaller files than DTPO (as mentioned in the initial post).
I don’t see a way to load a PDF through ScanSnap… it only does its thing when scanning a paper document.
Unfortunately, the only way to get the ABBYY that comes with the ScanSnap to OCR a document would be to print it and scan it again. The engine that comes with the ScanSnap is limited to OCRing documents that come from the scanner, to induce you to purchase the full FineReader product if you want to OCR general documents.