Hi,
Most PDF files I have no problem highlighting. But some “photo” PDFs I can’t. (The entire page goes blue, as if I’ve selected all of it.) I’ve tried Preview, Adobe Reader, calibre, ReadIris (OCR), and DTP. All I can do is enter a note. This isn’t good enough. How can I convert a “photo” PDF to a normal PDF so I can highlight it? Thanks in advance.
What is the content of your “‘photo’ PDF”? An image without text, or an image with text? Do you run OCR on these PDFs in DEVONthink or elsewhere?
Highlighting is a PDF tool that selects text on the page’s text layer, which for a scanned PDF is created by the OCR process. The quality of the highlighting and of the text layer depends on the quality of the original PDF and on the software you’re using. Without a specific example of your work it is difficult to suggest a solution, but you can always use the PDF tool that selects and fills a region of the page (i.e., the figure tool). However, that tool doesn’t select text.
Hi, please see the attached screenshot of the first page of the downloaded article. I’m trying to highlight a line of text.
I located this document, downloaded it, OCR’d it with DEVONthink (it was not PDF+Text to begin with) and opened it in Preview. I was able to select and highlight text. It’s a little tricky on some pages, but I was successful on each page. For example:
Hi,
You wrote, “OCR’d it with DEVONthink.” How did you do this? Could you post directions? I can’t find it in the View/PDF Display menu. Is there another area I should look? Thanks in advance.
You mentioned, above, that you use “DTP” – that usually means DEVONthink Personal or DEVONthink Pro. The third product, DEVONthink Pro Office has the additional capability of OCRing PDFs. So, if you do not have DEVONthink Pro Office, you would need to use a third-party app such as PDF Pen Pro or Adobe, etc., for OCR. If you do have DEVONthink Pro Office you would use Data > Convert > To Searchable PDF to tell DEVONthink to OCR the PDF. In any case, when DEVONthink shows “PDF + Text” as the “Kind” of a document, then it means that PDF has been OCR’d. If it merely shows “PDF” then it has not been OCR’d.
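If you want a rough check outside DEVONthink of whether a file is image-only or already has a text layer, here is a minimal sketch using only the Python standard library. It rests on an assumption, not a guarantee: a PDF that contains selectable text will reference a /Font resource somewhere, either in plain sight or inside a Flate-compressed stream. The function name is made up for this example, and the check is only a heuristic (an encrypted file or an unusual encoding can fool it), so treat DEVONthink’s “Kind” column as the authority.

```python
import re
import zlib

# Matches the raw bytes between "stream" and "endstream" in a PDF file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def probably_has_text_layer(pdf_bytes: bytes) -> bool:
    """Heuristic: True if a /Font reference is found anywhere in the file.

    Assumption (not from the PDF tools discussed in this thread): selectable
    text needs a font resource, so a file with no /Font is likely image-only.
    """
    # Case 1: the font reference is visible in the uncompressed file body.
    if b"/Font" in pdf_bytes:
        return True

    # Case 2: newer PDFs may keep their dictionaries inside compressed
    # streams, so try to inflate each stream and look again.
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates trailing bytes such as a newline.
            data = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (e.g. a JPEG image), skip it
        if b"/Font" in data:
            return True
    return False
```

To try it on a real file, read the file in binary mode and pass the bytes in, e.g. `probably_has_text_layer(open("article.pdf", "rb").read())`, where `article.pdf` is a placeholder name. A result of False suggests the file still needs OCR before highlighting will work.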
OK. Now I get it. I have DT Pro without the OCR option. You wrote, “a third-party app such as PDF Pen Pro or Adobe, etc., for OCR.” Which Adobe app are you referring to? I already have the free versions of Adobe Reader DC, Digital Editions, and Reader. If there are other apps that convert PDFs to “PDF + Text”, please advise.
Acrobat Pro.
For the most part, if you hear the term “OCR”, there’s a dollar sign (euro, yen, etc.) attached to it.
I was thinking of Adobe Acrobat DC, the professional version of Acrobat. It is also probably the most expensive alternative. I’d suggest getting a trial of PDFpen or PDFpenPro from SmileSoftware.com and seeing if that meets your needs. If you just need OCR, then Acrobat is overkill.
You said you had ReadIris (OCR), which sounds like it should be able to convert an image-only PDF into a PDF with a text layer.
Frederiko
Hi, I checked out PDFpen and PDFpenPro, which retail for $75 and $125. Thanks for the information. I’ll do the trial to see what happens. But on first impression, I’d more likely spend that sort of money on DT Pro Office to get a lot more functionality for my money. Just saying. Thanks again.
Hi, I looked at ReadIris v11, which is too old, and compared its scan of the article with PDFpen’s. The latter is superior by far, and PDFpen appears to me to be easier to use. But will the DTP Office upgrade with OCR be a better value? I currently use Preview all the time.
I’d say, yes. You are not buying just OCR capabilities when you upgrade to DEVONthink Pro Office, so I’d say the upgrade would be money better spent.
Just my opinion (and I’m obviously not in Sales).
Thanks for all the feedback.
I use DEVONthink Pro Office but rarely for OCR. I don’t like the results from the OCR engine that DEVONthink uses (ABBYY) for the work I need OCR’d. I prefer Acrobat for that. Not all OCR tools are the same: the size of the resulting file, the resolution, and the quality of text recognition all vary depending on the material you are OCRing, the tool you use, and the settings you choose.
I suggest testing a good sample of your files with several OCR apps, including DEVONthink, and deciding which software fits your specific needs. On the other hand, if you only need OCR occasionally then get the cheapest you can find. It’s only worth investing in OCR if you need it very frequently.
Hi, I listened to the Mac Power Users group and this thread and went ahead and paid for the upgrade to DTP Office. Nice to get the summer fest discount. I don’t need to digitally sign PDFs or edit them much beyond highlighting and annotation. Is there a way to improve the resolution of an OCR copy? E.g., after OCR, the MacIntyre scan from above is at about 75% quality. Thanks in advance.
DEVONthink > Preferences > OCR
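You cannot improve an OCR’d copy once it has been created; instead, adjust the settings in that preference pane and run OCR again on the original. DEVONthink will leave the original in place unless you have chosen the preference to “Move [original] to Trash”.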
OK. Thanks for answering my questions. Since I live in Taiwan, I’m not able to go to a local shop for information.
Hi, 12/6/16
Since this connects with the above topic, I thought I would post here.
I have a PDF “photo” copy of a book that is blurry, or out of focus. Please see the attached file.
Is there a way to use DTPO/OCR to sharpen the text, or to rescan it at a higher quality for reading, w/o losing my annotations? The PDF has the same problem in Preview. It’s straining my eyes.
TIA
The image cannot be fixed by the OCR process; OCR only adds a text layer and does not sharpen the scanned page image. You would need to get a copy of the book, plus perhaps a better quality scanner, and make a new scanned copy.