OCR Google Books clippings

Thanks to Google Scholar and Google Books, the tedious job of getting your sources together has become so much easier. But importing material from Google Books is still a problem.

After some adjusting, I now use this workflow:

  • In Google Books, I take quick screen grabs of anything interesting. I paste the pictures into an open RTF document (in TextEdit) and add comments.

  • Then I print the text document as a PDF to DTPO with the script.

  • In DTPO, I convert the PDF to a searchable PDF. The pictures from Google Books should now be searchable and appear in See Also results.

But there is a problem: a text clipping from Google Books is not converted into readable text when the resolution is low, so it only works at a high zoom level in Google Books.

Is there a certain resolution threshold for the OCR to work?

Anyway, with a high zoom level it works as desired, although I have to take more screenshots to get a good OCR result. Still, it is convenient to have clippings with comments.

A screen grab will have a resolution of between 72 and 100 dpi, depending on your monitor’s resolution.

But the recommended resolution of an image for OCR is 300 dpi.

That’s why taking a screenshot of a zoomed Google Books image can give better OCR accuracy.
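To put rough numbers on that, here is a small sketch of the arithmetic. It assumes that a zoom of 1.0 shows the page at about its printed size, so a screen grab at that zoom carries roughly the monitor's dpi, and that zooming in scales the effective resolution linearly. The function names are made up for illustration, and the 300 dpi target is just the recommendation mentioned above, not a hard threshold.

```python
def effective_dpi(screen_dpi: float, zoom: float) -> float:
    """Approximate resolution of the captured text, relative to the printed page.

    Assumption: zoom 1.0 displays the page at about its physical size.
    """
    return screen_dpi * zoom


def required_zoom(screen_dpi: float, target_dpi: float = 300.0) -> float:
    """Zoom factor needed for a screen grab to reach the target resolution."""
    if screen_dpi <= 0:
        raise ValueError("screen_dpi must be positive")
    return target_dpi / screen_dpi


if __name__ == "__main__":
    # The two ends of the typical monitor range quoted above.
    for dpi in (72, 100):
        zoom = required_zoom(dpi)
        print(f"{dpi} dpi screen: zoom about {zoom:.1f}x to reach 300 dpi")
```

Under these assumptions you would need to zoom in roughly three to four times before a screen grab approaches the recommended resolution, which matches the experience that only high zoom levels give usable OCR.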