Document size dpi properties

Hi

Could anyone explain to me why, in the Information Panel, the ‘Size’ properties for all of my documents display “72 dpi”? This appears also in the title bar.

I’m scanning and OCR’ing them at a higher resolution than that.

Thanks
Nick

What Image DPI is displayed under the General Info tab of Preview’s Tools > Inspector window? Example:

General Info Inspector.png

Hi sjk

Thanks for your response. I should have pointed out that this is with PDF+Text documents. An example is attached.

Regards
Nick
General Info.png

Can you open that PDF in Preview and check the Image DPI in the Inspector window?

Inspector doesn’t appear to show any of the image properties in the PDF. Any idea how to view them?
General Info.png

Seems to be excluded from scans to PDF. I’m not sure how DEVONthink is determining the dpi value it displays or what to suggest as another method of finding it.

The dpi is a bit useless here: you can have a 300 dpi scan re-rendered at 72 dpi, but with an image 300/72 times larger along each side, and achieve the same result. After OCR, the effective dpi of your OCRed images is what you specified in the OCR preferences pane.

Hi Annard

The scans I have are being scanned as “Best” by a Fujitsu ScanSnap S510M and the OCR setting in DTPO is 300 dpi. Just zooming in and looking at the scan shows it’s well above 300 dpi. So why is 72 dpi being shown on every scan? Also, I’ve noticed that 72 dpi is displayed on PDFs that I have downloaded and saved from the internet.

To be clear, if I scan at 300 dpi and OCR in DT at 150 dpi, is the resulting document re-saved at 150 dpi and therefore reduced in size? I’m just wondering if this is a way for me to reduce the size of documents I don’t need at such a high resolution.

Regards
Nick

Yes, the reason we added the image properties for Searchable PDFs is that people were telling us they wanted to reduce their document sizes. That’s exactly what it does. The factory defaults are good enough for print and online reading yet are not as large as the originals (except when the original is BW).
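As a rough illustration of why a lower OCR resolution shrinks documents: the pixel count of a page scales with the square of the dpi, so going from 300 to 150 dpi leaves about a quarter of the pixels. This is only a sketch. It assumes a US Letter page (8.5 x 11 inches), the function name is made up for illustration, and it ignores compression, so actual file sizes will differ.

```python
def pixel_count(width_in, height_in, dpi):
    """Number of pixels needed to cover a page at a given resolution."""
    return round(width_in * dpi) * round(height_in * dpi)

# Assumed page size: US Letter, 8.5 x 11 inches.
at_300 = pixel_count(8.5, 11, 300)  # 2550 x 3300 pixels
at_150 = pixel_count(8.5, 11, 150)  # 1275 x 1650 pixels

print(at_300, at_150, at_300 / at_150)
```

Halving the resolution halves both the width and the height in pixels, which is where the factor of four comes from.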

Thanks, that’s great. I was scanning receipts at 300dpi because some were too small to be readable. However, I’ve accidentally scanned letters at this resolution, which is too high.

Is this “72 dpi” showing in the Information Panel therefore a bug?

Regards
Nick

No, it’s not a bug, nor does it really have any important relationship to scan resolution.

Think of it as tied to the convention of the resolution of images that are displayed onscreen, nominally 72 dpi although the actual resolution displayed by many Macs is greater than 72 dpi.

Suppose you have a 10 megapixel photo displayed in a photo editing application. By convention, if you choose a ‘show actual pixels’ view the image will be huge, nominally 72 pixels to the inch horizontally and vertically. But the quality of that photo at 72 dpi isn’t great and of course it wouldn’t fit on an 8 x 10 inch printout. So for printing one wants to increase the displayed (printed) image to 300 dpi or greater so that it looks much sharper and fits into the desired size. The image still has the same number of total pixels, but the higher the display (or print) resolution, the more closely packed together are the pixels.
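To put numbers on this, here is a small sketch. The 3872 x 2592 pixel dimensions are an assumed example of a roughly 10 megapixel photo, and the function names are made up for illustration.

```python
def size_in_inches(width_px, height_px, dpi):
    """Physical size of an image when its pixels are laid out at `dpi`."""
    return width_px / dpi, height_px / dpi

def dpi_to_fit(width_px, height_px, max_width_in, max_height_in):
    """Lowest resolution at which the image fits inside the given area."""
    return max(width_px / max_width_in, height_px / max_height_in)

# Assumed example: about 10 megapixels (3872 * 2592 = 10,036,224 pixels).
width_px, height_px = 3872, 2592

on_screen = size_in_inches(width_px, height_px, 72)    # nominal screen dpi
printed = size_in_inches(width_px, height_px, 300)     # typical print dpi
fit_10x8 = dpi_to_fit(width_px, height_px, 10, 8)      # landscape 8 x 10 print

print(on_screen, printed, fit_10x8)
```

The pixel count never changes in any of these cases; only the spacing between pixels does.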

I do most scans at 300 dpi or better so that OCR has a good image to work with, for good accuracy of conversion of imaged text to converted text. A scan made at 300 dpi will have a great many more pixels than one made at 72 dpi and will look much sharper to the image analysis/recognition algorithms that examine the images of characters and produce computer-readable text. Generally, a scan made at 72 dpi wouldn’t have enough information content to allow the OCR software to make fine distinctions among the shapes of the characters in the image; OCR accuracy would suffer.
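A quick calculation shows how much detail is at stake. Assuming 10-point body text (a point is 1/72 inch), the sketch below estimates how many pixel rows span the type size at each scan resolution. The 10-point figure and the function name are assumptions for illustration only.

```python
POINTS_PER_INCH = 72  # typographic points per inch

def type_height_px(point_size, dpi):
    """Approximate height in pixels of type of a given point size."""
    return dpi * point_size / POINTS_PER_INCH

for dpi in (72, 150, 300):
    print(f"{dpi} dpi: about {type_height_px(10, dpi):.0f} pixels tall")
```

With only about ten pixel rows per line of type, small marks such as accents occupy one or two pixels, which gives the recognition software very little to work with.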

Hi Bill

Thanks for your comprehensive response. In that case, what is the point of displaying “72 dpi”? This is confusing, especially as it appears in the window’s title.

Could you not get it to display the real DPI, e.g. the OCR resolution set in DTP’s preferences? This would be so much more useful. If not, then removing it would be preferable.

Kind regards
Nick

Nick, it took me a really long time and many explanations to sort out the DPI issues early in my Mac life, so I know kind of what you’re thinking.

I think the best way is to try to separate the nominal screen resolution of 72 dpi out of the entire question, since the only time it’s useful is when you, the user, are interacting with the image as it’s being displayed on the screen.

If you now remove yourself from the scene, other applications can and do use much higher resolutions in their internal operations, operations designed to produce a certain output. In the case of OCR, we humans don’t really care if the display on the screen is at 72 or 300 or whatever, as long as it’s readable to us. An OCR program, though, doesn’t have our analytic brain power to generally tell the difference between a c and a ç at 72 dpi. The OCR program might think the cedilla is a speck of dust or background noise and inaccurately render the ç as a c. The OCR application generally gets more accurate as the dpi increases to 300 or above. Of course, the more dpi, the more accurately dust or other artifacts on the scanned image are reproduced, and that can affect the quality of the OCR.

Or put another way: Our poor old eyes (mine at least! :laughing: ) have a hard time distinguishing small resolution differences between an image of 72 and one of 300; yet our eyes and brain can “fill in” and corect wrds that are mispeled or nt qite wat they sould be on teh print’d paage, whereas even Abby might have trouble with the foregoing.

Hi twicks

Thanks for your response.

However, I think that it would be useful for a document management system to display the resolution at which documents were scanned into PDFs, especially as a scanned and OCR’d PDF is likely to be the main use of DT.

I’d really like to know whether a particular document has been scanned at 150, 300 or 600 dpi and I have no interest in being reminded that “72 dpi” is the standard monitor display resolution, especially since that’s not the case for my monitor…

Regards
Nick

Yeah, the “72 dpi” is confusing/misleading/useless to me. When I open JPEG scans in Preview (done with VueScan) the Image DPI is more meaningful:

General Info.png
If I add that image to DTP it retains “300 dpi” in the Size field of the Information panel; thankfully consistent.

Interpreting certain monitor/printer/camera/scanner dpi values can still make my head spin. And the way dots per inch and pixels per inch are sometimes used interchangeably doesn’t help.

Nick, the searchable PDF stored in your database does not ‘know’ the resolution at which it was scanned prior to OCR. That’s because the image was created anew during OCR, at the resolution you specified in Preferences > OCR. By default, as Annard noted, that resolution is 150 dpi.

The important information provided by the Size data for an image is the number of pixels. That’s why digital camera specifications usually list the number of pixels in an image. The lower the resolution at which an image is displayed, the larger it will appear.

Bottom line: the database cannot provide direct information about the resolution at which a searchable PDF had been scanned.
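That said, an effective resolution can be estimated when both the pixel dimensions of the page image inside the PDF and the page size are known, since PDF page sizes are expressed by default in points of 1/72 inch. The sketch below only does the arithmetic. How the pixel dimensions are obtained is left open, the function name is hypothetical, and the sample numbers are assumptions. The result describes the image created during OCR, not the original scan.

```python
POINTS_PER_INCH = 72  # default PDF unit: 1 point = 1/72 inch

def effective_dpi(image_width_px, page_width_pt):
    """Estimate the resolution of an image that fills a page of known width."""
    page_width_in = page_width_pt / POINTS_PER_INCH
    return image_width_px / page_width_in

# Assumed example: a 1275-pixel-wide image on a 612-point (8.5 inch) page.
print(effective_dpi(1275, 612))
```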

Thanks. That’s that then!
Nick

Works for me, too. Thanks for the explanation, Bill.