Adjusting OCR Workflow

I am disappointed because my scanner (only 3 months old) apparently will not work directly with DevonThink, and right now I simply don’t have the funds to replace it. I will explain the workaround as I understand it, in the hope that someone can give me a tip or two to shorten the process.

I currently have an HP Officejet J6450. To get a document into DevonThink, I scan it as a TIFF file to a location on my hard drive, then go to that location and import the file using “Import (Images with OCR).” That isn’t all that bad, except that unless I go to the file in its location and rename it right after scanning, I have to open and look at the document at import time to give it an appropriate name. That really slows things down when I have a significant number of documents to scan.
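One way the renaming step might be shortened is to batch-rename the scans before importing them. The following is only a minimal sketch, assuming the scanner saves `.tif`/`.tiff` files into a single folder; the function name `rename_scans`, the `label` parameter, and the date-plus-counter naming scheme are all made up for illustration and have nothing to do with DevonThink or the HP software.

```python
from datetime import datetime
from pathlib import Path


def rename_scans(folder, label="scan"):
    """Rename each TIFF in `folder` to <YYYY-MM-DD>_<label>_<nnn><ext>.

    Files are numbered in order of modification time (oldest first).
    Hypothetical helper: the naming scheme is an assumption, not a convention
    of DevonThink or the scanner software.
    """
    folder = Path(folder)
    scans = sorted(
        (p for p in folder.iterdir()
         if p.is_file() and p.suffix.lower() in {".tif", ".tiff"}),
        key=lambda p: (p.stat().st_mtime, p.name),
    )
    renamed = []
    for index, path in enumerate(scans, start=1):
        stamp = datetime.fromtimestamp(path.stat().st_mtime).strftime("%Y-%m-%d")
        target = path.with_name(f"{stamp}_{label}_{index:03d}{path.suffix.lower()}")
        # Never overwrite an existing file; skip instead.
        if target != path and not target.exists():
            path.rename(target)
            renamed.append(target)
    return renamed
```

Calling `rename_scans("/path/to/scans", label="invoice")` (the path is a placeholder) would leave files named along the lines of `<date>_invoice_001.tif`, which at least sort sensibly and can be told apart at import time.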

Also, this creates a problem when I have a multi-page document. How do I get multiple TIFFs into one document?
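For the multi-page question, one possible approach outside DevonThink is to stitch the single-page TIFFs together before importing. This is a rough sketch using the third-party Pillow library (an assumption; it is not part of the scanner software or DevonThink), and `merge_tiffs` is a hypothetical helper name. It is untested against DevonThink's OCR import, so how the importer treats a multi-page TIFF would still need to be checked.

```python
from pathlib import Path

from PIL import Image  # third-party: pip install Pillow


def merge_tiffs(page_paths, output_path):
    """Combine single-page TIFFs, in the order given, into one multi-page TIFF.

    Hypothetical helper; the output format is inferred by Pillow from the
    extension of `output_path` (for example ".tif").
    """
    pages = [Image.open(p) for p in page_paths]
    if not pages:
        raise ValueError("no pages to merge")
    try:
        # save_all=True writes every frame; append_images supplies pages 2..n.
        pages[0].save(output_path, save_all=True, append_images=pages[1:])
    finally:
        for page in pages:
            page.close()
    return Path(output_path)
```

The pages end up in the order they are passed in, so sorting the file names first (for example with `sorted(...)`) keeps the document in page order.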

I’m open to any suggestions that would help me streamline the procedure a little.


Doesn’t your HP scanner software have an option to save a multi-page PDF?

About document renaming: Many documents have a title in their text. If so, just view the document, select the title or appropriate text string, Control-click on the selection and choose the contextual menu option, “Set Title as”.

After hours of trial and error and adjusting and testing, I have reached the following conclusions.

  1. The ONLY way I can scan from the Mac using my HP J6450 is to use “Tiff to Preview” (single-page TIFF) and import the resulting TIFF via “File --> Import --> Images (with OCR…)”. The resulting file is a PDF+Text and is 4.6 MB for a one-page file (in color, with no graphics). Any other method results in an OCR error during the scanning stage.

  2. The alternative is to scan in Windows using the full-blown feature set of the scanner. I can scan to a searchable PDF (it uses the I.R.I.S engine built into the HP software), which I then send to the Mac, where I import it into DT Pro and end up with a 1.1 MB file. That’s a more than 75% reduction in file size.

  3. I can scan in grayscale from either computer, but the reduction in file size is negligible, not nearly enough to be worth giving up the color.
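As a quick sanity check on the size comparison above, the arithmetic can be written out. The two sizes are the ones quoted in the list; the function and variable names are just illustrative.

```python
def percent_reduction(before_mb, after_mb):
    """Return how much smaller `after_mb` is than `before_mb`, in percent."""
    return (before_mb - after_mb) / before_mb * 100


mac_tiff_import_mb = 4.6   # one-page color scan, TIFF imported with OCR on the Mac
windows_iris_pdf_mb = 1.1  # same kind of page, searchable PDF made in Windows

reduction = percent_reduction(mac_tiff_import_mb, windows_iris_pdf_mb)
print(f"{reduction:.1f}% smaller")  # prints "76.1% smaller"
```

So 3.5 MB saved out of 4.6 MB is about 76%, which matches the "more than 75%" figure.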

I would dearly love to be able to scan directly into DT, or even just keep the entire process on the Mac, but it just doesn’t seem to be possible. Obviously, this is not a DT problem but a Mac vs. HP issue, which will probably not be resolved any time soon. I’ll probably have to continue doing my scans in Windows Vista and shipping them to the Mac until I can come up with the funds for a new scanner.

Bottom line is I still get searchable PDFs in a fabulous database program, and I couldn’t be happier with that side of the issue. It certainly isn’t the first time I’ve been very disappointed in HP… but if I’m smart it will probably be the last.