Questions about adding newspaper scans to Devonthink Pro Office

I recently purchased Devonthink, and I subscribe to a historical newspaper service that lets me download a copy of each page of a newspaper. I would like to download every issue of my town’s local newspaper from the 1880s to about 1980. I have several questions and would appreciate your advice:

  • As I download the .jpg images, I name each page in this format: Newspapername_YYYY_MM_DD_pg1. Then I put the pages in a folder named NewspaperName_YYYY_MM_DD. Does this method have any drawback as far as Devonthink is concerned? Would it be better to combine the pages into a single .pdf document named Newspapername_YYYY_MM_DD without bothering to name each individual .jpg page?

  • The downloaded .jpg of each page is enormous: 4819 x 7549 pixels, 6.6 MB, at 72 pixels per inch. Can you give any guidance as to how far I can reduce the image size?

  • Currently, my Devonthink OCR doesn’t recognize the newspaper columns; instead, it reads left to right across all columns. Is there a setting that tells Devonthink OCR that the document is a newspaper, so that it recognizes the columns?

  • Do you have any other tips for using Devonthink with back issues of a newspaper?

Yes, those images will be large. But if you make them smaller, you will lose resolution. Newspaper pages are big. 🙂
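On how far the size could come down: the 72 pixels per inch stored in the file is only a metadata tag. What matters is how many pixels cover each inch of the physical page. Here is a rough sketch of that arithmetic in Python. The 15-inch page width and the 300 ppi target (a figure often quoted for OCR) are assumptions of mine, not values from the newspaper service, so measure your own paper and test on the smallest print before resizing anything in bulk.

```python
def effective_ppi(width_px, page_width_in):
    """Pixels per inch of the physical page, ignoring the file's ppi tag."""
    return width_px / page_width_in


def scaled_size(width_px, height_px, current_ppi, target_ppi):
    """Pixel dimensions after downscaling to target_ppi (never upscales)."""
    scale = min(1.0, target_ppi / current_ppi)
    return round(width_px * scale), round(height_px * scale)


if __name__ == "__main__":
    ASSUMED_PAGE_WIDTH_IN = 15   # hypothetical; measure a real page
    ASSUMED_TARGET_PPI = 300     # assumption; small newsprint may need more

    ppi = effective_ppi(4819, ASSUMED_PAGE_WIDTH_IN)
    new_w, new_h = scaled_size(4819, 7549, ppi, ASSUMED_TARGET_PPI)
    print(f"effective resolution: about {ppi:.0f} ppi")
    print(f"size at {ASSUMED_TARGET_PPI} ppi: {new_w} x {new_h} pixels")
```

Under those assumptions the scans are already close to the target, so there would be little to gain from shrinking them.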

If you want to select the readable text of a document that has columns, hold down the Option key while dragging a ‘box’ over the portion of the column that you wish to capture, then press Command-C to copy it to the clipboard. Paste the clipboard into a text document, and you will find that the capture was restricted to a single column, which shows that the OCR worked.
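On the file-naming question: one thing to watch with names ending in pg1, pg2, ... pg10 is that a plain alphabetical sort puts pg10 before pg2. Zero-padding the page number (pg01, pg02) avoids that. The Python sketch below builds and parses names in the Newspapername_YYYY_MM_DD_pgN pattern from the question. The helper names and the two-digit padding are my own choices for illustration, and none of this is part of Devonthink.

```python
import re
from datetime import date

# Matches e.g. "Gazette_1885_03_07_pg1" or "Gazette_1885_03_07_pg01".
PAGE_RE = re.compile(
    r"^(?P<paper>.+)_(?P<y>\d{4})_(?P<m>\d{2})_(?P<d>\d{2})_pg(?P<page>\d+)$"
)


def page_name(paper, issue_date, page, pad=2):
    """Build a page name with a zero-padded page number."""
    return f"{paper}_{issue_date:%Y_%m_%d}_pg{page:0{pad}d}"


def parse_page_name(stem):
    """Split a page name (without extension) into (paper, date, page)."""
    m = PAGE_RE.match(stem)
    if m is None:
        raise ValueError(f"unexpected page name: {stem!r}")
    issue = date(int(m["y"]), int(m["m"]), int(m["d"]))
    return m["paper"], issue, int(m["page"])


def sort_pages(stems):
    """Sort by paper, issue date, then numeric page, padded or not."""
    return sorted(stems, key=parse_page_name)
```

For example, `sort_pages` orders existing unpadded names by page number rather than alphabetically, and `page_name("Gazette", date(1885, 3, 7), 1)` gives a padded name that also sorts correctly in any file browser.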