One of my longer-term projects has been to digitize a bunch of out-of-print history and genealogy books on my shelves. I’d been putting this off, even after doing a lot of research on HOW to do it, because I don’t own a scanner and didn’t want to mutilate the books themselves. I also had my fingers crossed that a decades-old open-source desktop app for the PC, called Scan Tailor, would become available for Mac users (that did happen, very recently).
I just completed my first couple of books, using a table clamp mount for an iPhone (I bought this one).
My issue, as expected, was de-warping the pages at the beginning and end of the book. I’m a bit of a perfectionist and tried, first, to de-warp manually in Photoshop. This works, but it is painfully time-consuming.
I then tried a free app, vFlat, much raved about in a few online forums devoted to book scanning. HOLY SMOKES! For a free app, this one takes the cake. The app not only cropped each page, but also deskewed it and, above all, de-warped the pages that needed it, with mind-blowing results. Better than I was able to do manually in Photoshop.
My workflow is to:
- Take a picture of each page using the iPhone’s three-second timer mode. This gives me time to use my hands to flatten each page.
- Launch vFlat for the initial cropping and de-skewing. I set the output to full color.
- In the Photos app on my laptop, export all the images without modification (export as original).
- In Lightroom, I do a batch conversion to B&W, change contrast, shadow etc.
- Export from Lightroom as 300 DPI TIFF images, to a folder on the Desktop.
- Import these images into Scan Tailor Advanced, which runs on Mac and has many uses for everyday OCRing, too.
- Begin Scan Tailor at the content discovery phase, then do the remaining phases (set margins, and finally, the output iteration).
Scan Tailor outputs the lot, magically, as a tiny file for each page. It converts the background to pure white, makes my fingers disappear, blackens and smooths the text for optimal OCR, etc.
Finally, I bring all those tiny TIFF files into DT to (a) convert them to PDF, (b) merge them into a single PDF, and (c) OCR at 300 DPI, the resolution I set in DT preferences.
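A note on the 300 DPI figure used in the export and OCR steps: with a phone photo, the DPI setting is only a label, and the real detail depends on how many pixels cover each inch of paper. Below is a rough Python sketch of that arithmetic. The function names and the example numbers (a 3600-pixel-tall crop of a 9-inch-tall page) are made up for illustration and are not measurements from this setup.

```python
def effective_dpi(pixels_long_edge: int, page_long_edge_in: float) -> float:
    """Pixels captured per inch of paper along the page's long edge."""
    if page_long_edge_in <= 0:
        raise ValueError("page size must be a positive number of inches")
    return pixels_long_edge / page_long_edge_in


def meets_target(pixels_long_edge: int, page_long_edge_in: float,
                 target_dpi: float = 300.0) -> bool:
    """True if the cropped photo has at least target_dpi of real detail."""
    return effective_dpi(pixels_long_edge, page_long_edge_in) >= target_dpi


# Hypothetical example: the cropped page image is 3600 pixels tall
# and the physical page is 9 inches tall.
print(effective_dpi(3600, 9.0))   # 400.0
print(meets_target(3600, 9.0))    # True
```

If the result comes out well under 300, moving the phone closer to the page (or photographing one page at a time instead of a full spread) should help more than changing the DPI setting at export.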
I’m very happy with the result. The worst part of this process was standing at the side of my table. Consider setting this contraption up on a coffee table instead, so you can sit while doing all the picture taking. Otherwise, be ready for neck pain!