One can capture a Web page into DEVONthink as PDF, WebArchive, HTML, rich text or plain text.
PDF and WebArchive captures most closely approximate the layout of a Web page. HTML captures the layout, but not images. RTF reproduces the layout only approximately (sometimes well, sometimes badly, depending on the page layout), and plain text captures only text content.
Almost always, my interest is a specific article, perhaps with associated images, on a Web page. I want to “freeze” that information in my database so that it stays available whether or not the source page disappears later, and whether or not I’m online. WebArchive (if the contained images are inline), PDF and RTF captures can do that, but HTML and plain text captures cannot.
Another consideration is that once in my database, I may want to extract text from the document, or add links or notes to it. If I made a PDF capture, extracted (copied) text has hard returns at line endings, and I hate that. I cannot link from a PDF document, and notes added are only in plain text and may not be searchable. So PDF won’t be my favored format. Although I can copy text from a WebArchive without problems, I can’t add links or write on it (I’m not about to do source code editing).
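For text that has already been copied out of a PDF, the hard returns can be stripped after the fact. Here is a minimal sketch in Python, assuming that paragraphs are separated by blank lines and that every other line break is just a wrap. The function name `unwrap_hard_returns` is made up for illustration and is not part of DEVONthink.

```python
import re


def unwrap_hard_returns(text: str) -> str:
    """Join hard-wrapped lines into paragraphs.

    Assumption: a blank line marks a real paragraph break, and any
    other line break is just a line ending copied from the PDF.
    """
    # Normalize Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # Split on blank lines (possibly containing stray spaces).
    paragraphs = re.split(r"\n\s*\n", text.strip())

    joined = []
    for para in paragraphs:
        # Replace each remaining line break, plus surrounding
        # whitespace, with a single space.
        joined.append(re.sub(r"\s*\n\s*", " ", para).strip())

    # Put the paragraphs back together with one blank line between them.
    return "\n\n".join(joined)


if __name__ == "__main__":
    copied = "This line was\nwrapped by the PDF.\n\nSecond paragraph\nhere."
    print(unwrap_hard_returns(copied))
```

This does not rejoin words that were hyphenated across a line ending, because it cannot tell those apart from genuinely hyphenated words, so that cleanup would still have to be done by hand.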
Still another consideration is that a PDF or WebArchive capture is likely to contain content that I don’t want or need, such as ads or unrelated text and images.
Only the RTF(D) format survives those considerations. A rich text capture permanently holds what I’m interested in, allows easy extraction of text that doesn’t have to be edited to remove hard returns at line endings, is easily editable so that I can add my own hyperlinks and notes, and doesn’t contain extraneous content.
So, about 99% of the time, I capture information from Web pages as rich text.
Now that I’ve got rich text, I can do anything I wish with it. I can drop it into various word processors (with or without images), turn it into PDF, Word, HTML, WebArchive or whatever. So this is as “universal” as I need it to be.