Missing characters upon import from web page

Hi all,

Occasionally, random characters are missing from text imported from a web page, regardless of whether I choose plain or rich text. The source of the web page shows the affected words intact.

Screen capture, missing characters marked (56KB):
http://24fps.net/stuff//charsawol.pdf

Screen capture of the source code (44KB):
http://24fps.net/stuff//charsource.pdf

Original page:
http://www.newyorker.com/printable/?critics/040216crbo_books

Help greatly appreciated,

Lambert

Lambert:

I did a rich text capture of the New Yorker page, without any missing characters.

The capture was done using DT Pro 1.9 alpha31.

Sorry, I’ve not encountered your problem and can’t offer suggestions.

Strange. How exactly do you capture the text? Via our Services menu commands? Or by loading the page within DEVONthink? And which version of DEVONthink are you using?

I tried here and it worked perfectly – but I’m using DEVONthink Pro 1.9 ;-)

Best,

Eric.

Yes.

1.8.1b PE. However, I no longer think DEVONthink is at fault here. I’ve copied and pasted the same text from Safari to TextEdit, and random characters were still missing: different ones than in the example, but in the same paragraph. When I copied the source and stripped the markup, though, everything was okay. That seems to narrow it down to Safari’s WebKit and what gets copied when the contents of a rendered page are selected. On second thought, I do have software installed that hooks into Safari (PithHelmet), so I’ll see if removing that solves the problem.
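For anyone who wants to repeat that check without eyeballing two windows, here is a minimal Python sketch of the same comparison: strip the markup from the saved page source, then diff the result against the text that was pasted from the rendered page. It uses only the standard library; the names `TextExtractor`, `strip_markup` and `missing_characters` are made up for this illustration and have nothing to do with DEVONthink or Safari. It also assumes the pasted text differs from the source only by dropped characters and whitespace.

```python
import difflib
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text content of a page, skipping script and style bodies."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_markup(html):
    """Return the page text with tags removed and whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def missing_characters(reference, pasted):
    """List (offset, text) runs that are in reference but absent from pasted."""
    pasted = " ".join(pasted.split())  # normalise whitespace like strip_markup
    matcher = difflib.SequenceMatcher(None, reference, pasted, autojunk=False)
    return [
        (i1, reference[i1:i2])
        for op, i1, i2, _j1, _j2 in matcher.get_opcodes()
        if op == "delete"
    ]
```

Feed `strip_markup` the page source, pass its result and the pasted text to `missing_characters`, and each returned pair gives the position in the stripped source and the characters that went missing at that spot. Only pure deletions are reported; characters that were altered rather than dropped would need the "replace" opcodes as well.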

Sorry for not having checked all angles before posting. But then again thanks for confirming that the problem isn’t rooted in the web site.

Best,

Lambert