Document not captured correctly

I am curious why my captured PDF is showing ASCII for the Translate and lookup options in the menu.

A PDF isn’t a text file, so what you see on the page and the underlying text layer aren’t necessarily the same.

Select the PDF and choose Data > Convert > to Plain Text and inspect the text file.
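
If you would rather not eyeball the converted file, here is a rough sketch in Python that flags one kind of broken text layer: characters from the Unicode private use area, unassigned code points, control characters, and the U+FFFD replacement character. The function name `suspicious_ratio` is made up for this example, and the check is only a heuristic. It will not notice a text layer that contains valid letters mapped to the wrong glyphs, so a manual look is still worthwhile.

```python
import unicodedata


def suspicious_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters that look like debris.

    Hypothetical helper for illustration; not part of any application or plug-in.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    # Co = private use, Cn = unassigned, Cc = control character.
    # U+FFFD is the replacement character emitted by failed decoding.
    bad = sum(
        1
        for c in chars
        if unicodedata.category(c) in {"Co", "Cn", "Cc"} or c == "\ufffd"
    )
    return bad / len(chars)
```

To try it, read the exported plain text file into a string and pass it in. A value near 0 suggests the text layer is probably usable for search, while a high value suggests it is damaged.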

How did you capture this PDF and from what URL?

I captured it using your plug-in for Firefox. The URL is here (The Historical Unity of Russians and Ukrainians - Modern Diplomacy). The result of converting it to plain text can be seen in the screen grab.

This seems to be an issue with the Safari/WebKit engine and/or PDFKit. A PDF exported or printed from Safari has the same broken text layer.

OK. Is there a known workaround? Do you guys report it to the Apple/WebKit folks?

We did in the past, but none of the reported issues that break the text layer were ever fixed.

So does this mean that PDFs captured from the web on a Mac are not searchable? I guess I am trying to understand the impact of this.

Usually they are, but not in the case of websites that use certain languages and/or fonts.