Underlining newspaper clippings

Athirne · October 4, 2025, 9:39pm

I discovered an interesting way to find out whether a clipping from newspapers.com has a text layer: try to underline text in the clipping. If you can’t, convert it to plain text, and I’ll bet you won’t see anything except the header of the PDF file. Now run OCR on it and try to underline again. I think you will be successful.

I’m aware that there probably aren’t many people in my situation, clipping articles from newspapers using newspapers.com. For those who are, though, be aware that if you select the PDF download option, DT may not create a text layer on import unless you have a Smart Rule that runs OCR when a file hits the inbox. I have thousands of files, and I’m creating annotation files for each one because OCRing these files is so hit-or-miss. This is not the fault of DT; it is an issue specific to clippings downloaded in PDF format from newspapers.com. My discovery does not apply to any other difficulty in underlining text in documents.
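
For anyone with a large pile of clippings, the same "is there a text layer?" check could probably be scripted outside DT instead of trying to underline each file by hand. What follows is only a rough sketch, not something DT provides: it assumes the third-party `pypdf` package is installed, the function names are made up for illustration, and the 20-character threshold is a guess meant to ignore stray header text.

```python
from pathlib import Path


def has_text_layer(page_texts, min_chars=20):
    """Return True if the extracted text holds at least min_chars
    non-whitespace characters. min_chars is an arbitrary assumption."""
    total = sum(len("".join(text.split())) for text in page_texts if text)
    return total >= min_chars


def extract_page_texts(pdf_path):
    """Return the extractable text of each page, using pypdf (third-party)."""
    # Imported here so has_text_layer() can be used without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(str(pdf_path))
    return [page.extract_text() or "" for page in reader.pages]


def find_clippings_needing_ocr(folder):
    """List the PDFs under folder that appear to have no usable text layer."""
    return [
        pdf
        for pdf in sorted(Path(folder).rglob("*.pdf"))
        if not has_text_layer(extract_page_texts(pdf))
    ]
```

Calling `find_clippings_needing_ocr()` on the folder that holds the downloaded clippings should return the files that still need OCR. A file that passes the check only has some extractable text; it says nothing about how accurate the OCR was.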