Cannot highlight certain pdf documents

I find that there are certain pdf documents (academic journal publications) that I cannot highlight. DT3 just beeps and the highlight option is greyed out. If I re-download the pdfs from the journal website, I usually can highlight them, but not always. I have enough of these that I don’t want to have to redownload them all the time (they are all named and tagged already). I have not been able to figure out what it is about the pdfs that is causing them to not work. I can send a copy of a pdf that doesn’t highlight to Christian. Is there a property that might be getting changed that might affect highlighting? The problem persists when I restart DT3 and restart my computer. Only re-downloading a fresh copy seems to have an effect.

Thanks in advance.

I have the same issue, but I don’t know what the PDFs that cannot be highlighted have in common.

Some PDF documents can’t be modified due to proprietary Skim annotations. Such annotations are only viewable and searchable. Others are marked as read-only (see navigation bar) as PDFKit might corrupt the text layer after editing.

Do the PDFs contain non-Latin characters, like Asian languages?

No, they don’t
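For anyone who wants to check this across many files rather than by eye, here is a minimal Python sketch. It assumes the text has already been extracted from the PDF with some other tool (that step is not shown), and the function name `non_latin_letters` is made up for illustration. It flags any alphabetic character whose Unicode name does not start with "LATIN", so Greek symbols in formulas will be reported too.

```python
import unicodedata


def non_latin_letters(text: str) -> set[str]:
    """Return the alphabetic characters in `text` whose Unicode name is not LATIN."""
    found = set()
    for ch in text:
        if ch.isalpha():
            # Unnamed code points fall back to "" and are therefore reported.
            name = unicodedata.name(ch, "")
            if not name.startswith("LATIN"):
                found.add(ch)
    return found
```

An empty result means only Latin-script letters (including accented ones) were found in the extracted text. It says nothing about text that the extraction tool could not read.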

I don’t think they are read-only, as they can be highlighted in PDF Expert. They just cannot be highlighted using DEVONthink’s built-in viewer.

PDF Expert doesn’t use Apple’s PDFKit.

I have recently moved to highlighting in DT3 after using Skim for many years, so I have many hundreds of PDFs with Skim markings. Is there any solution or workaround for this that you know of, or do I just need to re-download all of those files if I want to truly make the switch to working only in DT3? Would it be possible to write some kind of script to make the PDFs compatible? What if I were willing to lose the Skim markings (or store a copy of them somewhere else), but didn’t want to re-download all of the files? Could I just delete the .skim files?

Thanks

DEVONthink supports Skim annotations stored in .pdfd files and stored in .pdf files via extended attributes. In this case the documents are opened read-only though. Converting them to a new PDF (see Data > Convert) should create editable documents.
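To find out which files in a folder carry Skim notes in extended attributes, something along these lines might help. It is only a sketch: it assumes macOS with the `xattr` command-line tool, and it assumes that Skim's attribute names start with `net_sourceforge_skim-app` (check this on one of your own files with `xattr -l file.pdf`). The function names are made up for illustration.

```python
import subprocess
from pathlib import Path

# Assumption: Skim stores its notes in extended attributes whose names
# start with this prefix. Verify on your own files before relying on it.
SKIM_PREFIX = "net_sourceforge_skim-app"


def skim_attr_names(xattr_listing: str) -> list[str]:
    """Pick Skim-related attribute names out of the output of `xattr <file>`."""
    names = [line.strip() for line in xattr_listing.splitlines()]
    return [name for name in names if name.startswith(SKIM_PREFIX)]


def list_skim_attrs(pdf: Path) -> list[str]:
    """Run macOS `xattr` on a single file and return its Skim attribute names."""
    result = subprocess.run(
        ["xattr", str(pdf)], capture_output=True, text=True, check=True
    )
    return skim_attr_names(result.stdout)


def find_pdfs_with_skim_notes(folder: Path) -> dict[Path, list[str]]:
    """Map each PDF under `folder` that has Skim attributes to those attribute names."""
    hits = {}
    for pdf in sorted(folder.rglob("*.pdf")):
        attrs = list_skim_attrs(pdf)
        if attrs:
            hits[pdf] = attrs
    return hits
```

Calling `find_pdfs_with_skim_notes(Path("~/Papers").expanduser())` (the folder is a placeholder) only reports files and changes nothing. Whether removing those attributes, for example with `xattr -d`, makes a document editable in DEVONthink is not confirmed in this thread, so try it on a copy first.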

AFAIK this has nothing to do with Skim. I do know that DEVONthink uses PDFKit in a way that can scramble the text layer of some PDFs (in my case, notably ones that have been created by conversion in Calibre). I guess DT’s answer to the problem has been to turn off PDF annotation within DT for those PDFs. Better than ruining the PDFs, I suppose, but hardly satisfying, given that even Preview can handle these documents properly. It may be fair to blame PDFKit, but that doesn’t explain why DEVONthink seems to be the only reader that causes this scrambling.
I for one would like a fuller explanation.
Thanks,
Eiron

I am also seeing similar behavior, where some PDF documents can be highlighted in Apple Preview, but the corresponding highlight controls are disabled in DEVONthink 3 for the same file.

It seems to depend on the specific PDF file, as there are files which I can highlight in DT3. Not sure what is going on (no Asian characters in the impacted files either).

I am not using any 3rd party apps for the annotations (e.g., Skim or PDFExpert).

This is on DEVONthink 3.8 (latest) and Big Sur 11.6 (M1 MacBook Pro, if it matters).

This might be the issue (for me, at least): the impacted files are encrypted (PDF eBooks).

However, Preview still allows me to annotate and highlight the content on these. See the screenshot from Preview with the Inspector open, and a sample highlighted phrase:

The highlighted entries do show up in the DT3 annotations pane:

[Screenshot: DT3 annotations pane, 2021-10-17]

For me, at any rate, the source files are definitely not encrypted, so no explanation there, I’m afraid.
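A quick way to check a whole folder for encrypted PDFs is to look for an `/Encrypt` entry, which the PDF trailer references when a file is encrypted. The sketch below is a rough heuristic and not a real PDF parser: it simply searches the raw bytes, so it can give a false positive if that string happens to appear elsewhere in the file. The function names are made up for illustration.

```python
from pathlib import Path


def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic: an encrypted PDF references an /Encrypt dictionary in its trailer."""
    return b"/Encrypt" in pdf_bytes


def encrypted_pdfs(folder: Path) -> list[Path]:
    """Return the PDFs under `folder` whose raw bytes contain an /Encrypt entry."""
    return [
        pdf
        for pdf in sorted(folder.rglob("*.pdf"))
        if looks_encrypted(pdf.read_bytes())
    ]
```

If a file that refuses highlighting is not in the returned list, encryption is probably not the cause for that file, which would match the report just above.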

PDF files which might be corrupted are automatically opened by DEVONthink as read-only. The next release will add a hidden preference to disable this.

I have had similar problems and can often resolve them by using OCR to make a searchable pdf. Works most of the time.
Don

@dspady I just want to second your suggestion to use the menu item Data/OCR/to searchable PDF. I run into this “non-highlightable” issue fairly often, and the OCR almost always works for me too.

I have seen a similar problem, and almost every time the files were “protected” PDFs that cannot be altered or highlighted in any way. Some legal depositions are PDF/A files.

Try to highlight and/or print with any other PDF app. If the highlighting feature doesn’t work anywhere, then you have a protected file.

Larry

I have a similar problem highlighting in DT3 from time to time, which Bluefrog kindly found a workaround for - convert to PDF (Paginated): go to “Data” → “Convert” → “to PDF (Paginated)”.

It creates a new PDF (so I have to delete the old one) but the new PDF can be highlighted OK.

I don’t know if your issue is the same, but maybe worth a try if OCRing doesn’t solve it.

Same problem here: PDFs imported directly via “Print to Inbox” cannot be highlighted. I have to “print” them to the inbox again in order to highlight them. @BLUEFROG, what can we do, please?

Which application and which macOS/iOS version did you use to print to the inbox?