No, your conclusion that DTPO is converting or storing PDFs at 72 dpi isn’t correct, as noted by Christian. That reference to 72 dpi has nothing to do with capture or storage resolution or image quality; as it is easily misinterpreted, Christian is thinking about removing the reference.
If I wish to capture a PDF file that’s presented by a Web site, I will use my browser’s File > ‘Save As’ command, and choose ‘Inbox’ as the destination. Now I’ve got the actual PDF file in my Global Inbox. It will have the same resolution and image quality as the PDF put up by the Web site. If the PDF is searchable, it will not require OCR.
When you Import a PDF into DTPO it isn’t changed in any way, with the exception of PDFs that are subjected to OCR. Since you need to include PDFs in professional reports for printing, that exception matters when you set up DTPO Preferences > OCR.
When the OCR module converts an image received from your scanner, or an image-only PDF downloaded from the Web, it rasterizes the original image. The tools built into OS X for that purpose are not very efficient with respect to the size of the new image layer they produce for the searchable PDF, so the file can balloon substantially in storage size.
That’s why Preferences > OCR allows the user to configure the resolution (dpi) and the quality of images in the PDF (quality %) as a compromise between the view/print quality of the stored searchable PDF, and the file size. The lower the settings for dpi and quality, the smaller the file size, but with increasing degradation of the view/print display of the PDF.
The default settings of Preferences > OCR are 150 dpi and 50% image quality. When I scan paper copy with my ScanSnap scanner, I’m scanning at a higher resolution and image quality, as OCR requires a scan image of 300 dpi and high quality (uncompressed) images for good OCR accuracy. After OCR has been done, the Preferences > OCR settings then save the searchable PDF at lower resolution and image quality to save disk space on my computer.
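To put rough numbers on the dpi side of that tradeoff, here is a minimal Python sketch of how the pixel count of one page changes between a 300 dpi scan and the 150 dpi default. The page size (US Letter), 24-bit colour and the absence of any compression are my own assumptions for illustration; this is plain arithmetic, not a description of how DTPO or its OCR module actually stores images.

```python
# Rough sketch: how pixel count scales with resolution for one scanned page.
# Assumptions (not taken from DTPO): US Letter page, 24-bit colour,
# and no compression at all.

PAGE_WIDTH_IN = 8.5
PAGE_HEIGHT_IN = 11.0
BYTES_PER_PIXEL = 3  # 24-bit RGB


def page_pixels(dpi: int) -> int:
    """Number of pixels in one page rasterized at the given dpi."""
    width_px = round(PAGE_WIDTH_IN * dpi)
    height_px = round(PAGE_HEIGHT_IN * dpi)
    return width_px * height_px


def raw_megabytes(dpi: int) -> float:
    """Uncompressed size of one page in megabytes (10**6 bytes)."""
    return page_pixels(dpi) * BYTES_PER_PIXEL / 1_000_000


for dpi in (300, 150):
    print(f"{dpi} dpi: {page_pixels(dpi):,} pixels, "
          f"about {raw_megabytes(dpi):.1f} MB uncompressed")
```

Halving the resolution cuts the pixel count to a quarter, before the image quality (%) setting compresses the result any further. Real file sizes will be much smaller than these uncompressed figures and will vary with the content of the page.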
I’m satisfied with the default Preferences > OCR settings for most of the documents that I routinely scan to a database, such as receipts, invoices, contracts, letters and so forth that I want to keep in a database for personal use. The resulting searchable PDFs approximate FAX quality for viewing and printing, and that’s good enough for my purposes for those documents. I have a lot of such paperwork coming into my home, which I want to keep for various purposes, including tax records, etc. The advantage of storing them into my databases is that I can actually find information when I need it, much more easily than I could find the paper copy — and I don’t end up with hundreds of pounds of paper in file boxes and file cabinets. Although I’ve got terabytes of disk storage space, I see no need to save all searchable PDFs with high view/print quality.
When you need higher quality OCR results: In a case such as yours, you don’t want significant degradation of the view/print quality of a PDF that you intend to include in a print publication.
Beginning with DTPO public beta 8, there’s a simple alternative to ‘tweaking’ the Preferences > OCR dpi and image quality settings upwards. There’s now a check box in the Images section, ‘Same as scan’. When that box is checked, the quality of the searchable PDF resulting from OCR will approximate the quality of the original scanner output. That should work for your print publication needs, assuming your scanner was producing acceptable results.
Another alternative that you had thought of would also work. You could Index-capture a PDF that you planned to use in a print publication. If you then OCR the resulting PDF document within your database, using Data > Convert > to Searchable PDF, the original PDF (external to the database) would be untouched (remain image-only) and retain its full view/print quality, while the searchable PDF would be Imported into the database with the resolution/quality settings in Preferences > OCR. This approach would allow you to keep a copy of the PDF for use in a print publication and also place a searchable copy of it into a DTPO database.
Comment: Many PDFs from Web and other sources are searchable; there’s no need to run a PDF through OCR if it is already searchable.