PDF conversion bug in 3.8.6: Text layer corrupted

Yes, the PDF looks fine after conversion, but when I copy marked text and then paste it e.g. into Notepad, I only get the gibberish you also saw earlier.

Try printing the page to PDF and see if you get the same results.

And do you see the same behavior in a new macOS user account?

Printing to PDF and Exporting as PDF from Safari gets me the same results. The behavior is the same in a new user account.

Thanks for checking.

Did some more testing. I get the same results when I capture a website straight to PDF on my iPad with DTTG3.

Yet more testing. Some websites are captured just fine, while others are not. With some toying around with Little Snitch, I’ve managed to narrow the issue down to what looks to be a server problem on the Google side of things. The websites that are affected by this connect to Google servers (fonts.gstatic.com and fonts.googleapis.com), and when I block access to these servers, the websites are captured just fine (albeit with a slightly different font).


That might indicate a problem with one or more Google fonts. Out of curiosity: When you open your preferred browser’s developer tools on one of the websites where conversion fails, do you see any errors in the console? And which fonts are actually loaded?

Blocking Google Fonts might be a workaround, but it’s probably not a good idea in the long run.

Edit: I found this question (without an answer) regarding a similar problem:

And this


I paid closer attention to some of the web pages I read daily and to where they link to googleapis.com fonts, e.g.

  <link href='https://fonts.googleapis.com/css?family=Raleway:200,300,400,500,700,800' rel='stylesheet' type='text/css'>

I noticed that while the saved PDF looked OK, copying text to the clipboard and then pasting it out again produced gobbledygook.

Not an exhaustive test, but indicative of something going on with Google fonts? I’m not picking up other reports via searching the “interweb”, though. It’s time-consuming to do more detailed checking. I hope this is not a red herring, but it does look to be not of DEVONthink’s doing.

I don’t have Little Snitch or another tool to block that Google font site.
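
For anyone who wants to check a page without a network filter, here is a rough sketch that scans saved page source for links to the two hosts mentioned in this thread (fonts.googleapis.com and fonts.gstatic.com). It uses only the Python standard library. The names `FontLinkFinder` and `find_google_font_links` are made up for this example, and the check is only a heuristic: it will not catch fonts pulled in via CSS `@import` or loaded by JavaScript.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hosts named in this thread as the ones the affected pages connect to.
GOOGLE_FONT_HOSTS = {"fonts.googleapis.com", "fonts.gstatic.com"}


class FontLinkFinder(HTMLParser):
    """Collects href/src attribute values that point at Google Fonts hosts."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name in ("href", "src") and value:
                if urlparse(value).hostname in GOOGLE_FONT_HOSTS:
                    self.hits.append(value)


def find_google_font_links(html_text):
    """Return all Google Fonts URLs referenced in the given HTML source."""
    finder = FontLinkFinder()
    finder.feed(html_text)
    return finder.hits


if __name__ == "__main__":
    sample = (
        "<link href='https://fonts.googleapis.com/css?"
        "family=Raleway:200,300,400,500,700,800' "
        "rel='stylesheet' type='text/css'>"
    )
    for url in find_google_font_links(sample):
        print(url)
```

To try it on a real page, save the page source to a file, read it into a string, and pass that string to `find_google_font_links`. An empty list suggests the page does not reference Google Fonts directly in its HTML, which might help separate affected from unaffected sites.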


Yes, there are a few errors in the console log, all of which are for ads that are not loading properly, because I blocked access to these servers in Little Snitch. The conversion bug still happens when I turn off Little Snitch’s Network Filter though, so it’s not related to that.

Yes, I saw that as well when I opened the webarchives in BBEdit. You can’t remove that link though, because it breaks the file and BBEdit won’t let you save.