Configuring DEVONthink Pro 3 with ScanSnap Home / iX1500

  • The results should be comparable between ScanSnap and DEVONthink’s OCR output.
  • ScanSnap uses the ABBYY FineReader OCR engine as well. The separate application isn’t required but it’s still ABBYY’s OCR.

Thank you very much for your quick answers!

You’re welcome :slight_smile:

Hi Bluefrog,

I have now had the opportunity to do some comparative testing, and DEVONthink’s OCR performed better than ScanSnap’s. Smaller fonts were recognised and there were noticeably fewer errors. For example, ScanSnap’s OCR consistently misrecognised certain names that were easily recognisable in other parts of the document. The reason for this is not clear to me. So I would like to use DEVONthink’s OCR.
However, there is a problem with the DT settings: ScanSnap scans at different resolutions up to 1200dpi, depending on the document. DEVONthink, on the other hand, has a resolution setting under OCR which is 150dpi by default and can be increased to a maximum of 300dpi. So I will lose image information if I use DEVONthink’s OCR, because the resolution is capped at 300dpi. It looks like this DPI limit cannot be turned off, which is a bit unfortunate. Is there any known way to bypass this limit, or is this just how it is when I convert to searchable PDF in DT? Also, does this 300dpi limit apply in DEVONthink generally, i.e. would it also come into play if I use ScanSnap’s OCR (no conversion to searchable PDF in DT)?

There is very rarely a need to scan higher than 300dpi, and even then it’s usually only for things like blueprints. The higher resolution also leads to much larger file sizes without providing a proportionate increase in accuracy.

And the cap in DEVONthink’s OCR is related to the image layer in the resulting PDF.
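To put rough numbers on the file size point, here is a minimal Python sketch that estimates the uncompressed raster size of a single page at several resolutions. The A4 page size and the bit depths are assumptions chosen for illustration; real ScanSnap and DEVONthink output is compressed, so actual files will be much smaller, but the scaling with resolution is the same.

```python
# Rough estimate of uncompressed raster size for one scanned page.
# Assumptions (not from the thread): A4 paper, no compression.
A4_INCHES = (8.27, 11.69)  # width, height in inches


def pixels(dpi, size_in=A4_INCHES):
    """Total pixel count of a page scanned at the given resolution."""
    width_in, height_in = size_in
    return round(width_in * dpi) * round(height_in * dpi)


def raw_megabytes(dpi, bits_per_pixel, size_in=A4_INCHES):
    """Uncompressed size in megabytes (1 MB = 1,000,000 bytes)."""
    return pixels(dpi, size_in) * bits_per_pixel / 8 / 1_000_000


if __name__ == "__main__":
    for dpi in (150, 300, 600, 1200):
        bw = raw_megabytes(dpi, 1)      # black and white, 1 bit per pixel
        color = raw_megabytes(dpi, 24)  # colour, 24 bits per pixel
        print(f"{dpi:>5} dpi: {bw:8.1f} MB b/w, {color:8.1f} MB colour")
```

Because the pixel count grows with the square of the resolution, going from 300dpi to 1200dpi multiplies the raw data by 16, while the text on the page does not become 16 times more legible to the OCR engine.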

Thanks for the info. I have now set the image quality level in the ScanSnap profile to “Best”, which results in 300dpi for black and white scans. That should be decent for an archive.

What I have found with a bit of tinkering is that when the file format in the ScanSnap profile is set to JPEG instead of PDF, the quality of the documents in DEVONthink improves significantly: the scans look finer and the resolution is better. File sizes are slightly larger, but most noticeably, the quality of the OCR also improves significantly, and it recognises even the smallest fonts. When I save as PDF, certain words are not recognised, but with JPEG they are. This seems to me to be the best setting for a paper document archive: iX1600 scan quality “Best”, save as JPEG, OCR in DEVONthink at 300dpi.

Yes, I’d agree with your findings. However, for multi-page documents you can only use PDF if you want to scan all pages into one file efficiently. For single-page scanning, I also think a raster format is a better option.

I see, that would be a big practical limitation; it looks like there is no other option than PDF then.

It would depend on the specific situation and need. :slight_smile:

Why did you have to upgrade the ScanSnap S1500? Mine is working fine with Big Sur, just as it did with Catalina.

As usual, this thread saved the day. I decided to do a total fresh install of everything on my new Mac and had forgotten how to get DEVONthink and ScanSnap Home set up. Thanks to all once again.

And yes, I do realize this is an old thread!

Glad to hear it!
There’s also the Help > Tutorials > Scan with a ScanSnap tutorial :slight_smile:

Hi, I am new to both DEVONthink and ScanSnap scanners. I am running DT 3.9.4 and have it set up according to the instructions above (thanks, pretty straightforward and precise) with my new iX1600. Everything works fine, but somehow the (old) scans are being saved in my OneDrive folder. Since I have set DT to delete the original files after OCR, I assume this is caused by ScanSnap. In the ScanSnap profile for DT, I have chosen a directory I created within my OneDrive folder to save the scans, so basically what lionbreath did back in 2020 above. But the scans always end up in the OneDrive root folder. Anybody with a similar issue? Any help would be greatly appreciated. And yes, I have also contacted ScanSnap support and am waiting for their reply.

I’m sorry not to be able to offer anything substantive on the new question being asked, but I thought I would offer some crumbs of comfort to those with an older ScanSnap who think they’ll have to buy a new one to keep up with modern macOS versions.

Of course, Fujitsu would like you to think that, but it’s not actually true.

If you ignore the fancy new ScanSnap Home and just download ScanSnap Manager instead, you’ll find there’s a good chance it still works. At least it does with my ScanSnap S510M, which I bought in about 2007, and that’s as good a test as any :slight_smile: Obviously, I don’t know how long it will continue to work, but for the time being it’s working on Sonoma 14.1 and a Mac Studio M2.

@Kilgore: Martin: I have this same issue. I have DEVONthink set to delete the original file after completing its OCR, but it does not seem to work the way I expect. The original file created by ScanSnap continues to hang around after DEVONthink has completed its OCR and saved the new file to my Inbox. @BLUEFROG: Jim: Any suggestions of what to try?

It continues to hang around… where? In DEVONthink? In the Finder? …?

Well, hmmm… I made a change in the ScanSnap settings, and that may have fixed the problem. In the profile I had created in ScanSnap Home for sending my scanned files to DEVONthink, the Save To location had previously been set to the top level of my Documents folder. I re-read DEVONthink’s tutorial and noted that it cautions against changing this location, so I reset it to the original default, which was the “ScanSnap Home” folder in my Documents folder. After making this change, a test scan worked as I expected: the scanned file was received by DEVONthink, OCR was performed, the file was saved to the DT inbox, and the original file created by ScanSnap appeared in my Trash. So, perhaps this will help someone else who might have run into the same issue. I am running DEVONthink 3.9.4 on macOS Sonoma 14.2.1, and ScanSnap Home version 2.20.0 (9).

Glad you reread it! And indeed, there is no compelling reason I can think of to change the location if you’re just going to trash the unOCR’d original afterwards.
