Automatic OCR with sorter

I am a relative newbie. I have used DTPO for just over a year, testing to see if it would work for us. The more I use it and become comfortable with what it can do, the more I like it. I do have a couple of questions.

When I drag and drop PDF files into the Sorter, they import into the global inbox, and that’s fine. However, OCR does not run on those files unless I go back, select them, and manually convert them to searchable PDF. Is there any way for a computer novice like me to set it so that anything I drag to the Sorter automatically goes through OCR?

The files I scan with my ScanSnap are automatically run through OCR and that works great.

I apologize if this has been covered somewhere else. I have searched high and low through the forum and tried, I think, every possible option in the program itself.

Thank you,
Ray Brunker

There is no such setting, as it would not be desirable (generally, most files sent to the Sorter don’t require OCR, including PDFs that are already searchable).

However, if you have captured PDFs that require OCR, they can easily be identified by their filetype: in DEVONthink, image-only PDFs have the filetype “PDF” and searchable PDFs have the filetype “PDF+Text”.

A smart group that lists all documents with the filetype “PDF” can be used to identify candidates for OCR. Image-only PDFs can be OCRed by selecting them and invoking the menu command Data > Convert > to searchable PDF.

You may discover some image-only PDFs that would not benefit from OCR, such as photos or PDFs that contain only handwriting.
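For anyone who finds the selection rule easier to read as code, here is a minimal sketch of what such a smart group does. It is not DEVONthink's API: the function name `ocr_candidates` and the sample records are made up for illustration, and the only thing taken from the description above is the pair of filetype labels “PDF” and “PDF+Text”.

```python
from typing import Iterable, List, Tuple

# Hypothetical record: (document name, filetype label).
# The labels "PDF" and "PDF+Text" follow the description above;
# everything else here is an assumption made for illustration.
Record = Tuple[str, str]


def ocr_candidates(records: Iterable[Record]) -> List[str]:
    """Return the names of documents whose filetype is exactly "PDF".

    An exact match is used so that "PDF+Text" (already searchable)
    is not selected along with the image-only PDFs.
    """
    return [name for name, filetype in records if filetype == "PDF"]


# Made-up sample data, not taken from any real database.
sample = [
    ("lease-agreement.pdf", "PDF"),
    ("grain-ticket.pdf", "PDF+Text"),
    ("recipe.rtf", "RTF"),
]

candidates = ocr_candidates(sample)
print(candidates)
```

The rule only narrows the list. It cannot tell a scanned invoice from a photo or a handwritten page, so the results still need a quick look before running OCR on them.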

Thank you for the reply. I was just hoping to make things easier.

Have a good day,
Ray

I’m curious about why you would want to have everything you drag to the Sorter be OCRed.

Do you work exclusively with PDFs? That would be rather unusual. Although PDFs are great for sharing data across multiple operating systems, they are not a very space-efficient way of storing textual information, and they are difficult to edit.

Most PDFs I download from the Web are already searchable, and running them through OCR would only result in an increase in file size and some degradation of their view/print quality. That’s especially true of PDFs that result from scanning and OCR, as a second generation of OCR would likely also result in more text conversion errors.

If you have a collection of PDFs in the Finder that do require OCR, there’s a menu command to capture them to DEVONthink Pro Office with OCR: File > Import > Images (with OCR)…

We are probably using DT a little differently than most. As I said, we farm for a living. Every year we generate two big boxes of receipts and invoices for our accounting.

We use QuickBooks for our actual accounting and have started scanning everything instead of filing the paper copy. I have a database set up for each calendar year, as well as one for permanent records, such as land leases, that cross several years. Last year I started out naming each file and tagging things (as if I were putting them in a file drawer), but I finally realized just how good the search function was. Now I just save the PDF files with the default file names and rely on the search to find things again.

Now I scan every single purchase order, invoice, grain ticket, truck log, etc. The more I use it, the more I like it, and the more confident I have become in scanning a document and shredding the original. With the ScanSnap it is so easy.

The things I normally add through the Sorter are PDF files that I have gotten from our attorney, or something similar, that I want to file with this year’s accounting or in our permanent file database. It’s not a big item, but I was just looking for a way to make it easier. We are 1/3 of the way through our year and our database is about 600 MB, so file size is not as important to us as it might be for someone with a huge database.

I don’t try to save many websites or things like that, as some folks might. I am just using DT as an electronic filing cabinet. I am sure I have only scratched the surface of what DT can do, but I am not a computer guru.

My wife has recently started another database with recipes that she has saved. Some she scans in, and some she prints to a PDF file from the internet and then saves the PDF to DT instead of printing it on paper. She is very pleased with it. My daughter is getting married this summer, and I am going to get her a copy of DT for her own use. She just graduated with her elementary teaching degree, and I think she can use DT to save a lot of information and be able to find it again easily. My wife is going to give her a copy of all of her recipes, as well as our old family recipes, to give her a start on her own recipe book. My first wife passed away when my daughter was just 13, so she grew up without a mom showing her how to cook, and she taught herself how to cook in college. Her future mother-in-law is a very good cook, and I think DT will be a great way for my daughter to have her own family cookbook.

I realize that DT can do so much more than what we are using it for, but it is working very well for us. I would hate to do without it.

Have a good evening,
Ray

Thanks, Ray. I had super good fried catfish at Brownie’s Family Restaurant in Bean Blossom, Indiana. As I’m feeling well fed, I will have a good evening. :)

We’re glad you are enjoying the application. Feel free to send us a message at Support if you run into any problems.