We clearly have quite different needs. Bill is absolutely right that if you want to reuse the captured materials, e.g. for copy and paste operations, RTF is much better than pdf; in the latter format, copying from multi-column layouts can be a nightmare. I am overwhelmingly capturing subatomic-physics-related materials, and in general not for further processing: I want to read and understand them, and have them available for future searches. In other fields the needs are different, e.g. someone else might have to quote heavily from the collected materials; then pdf is not the best choice.
Originally, I was also very much against any post-processing; things had to “just work” or what’s the point? I strove for nearly automatic collection of materials. After a while I noticed that this just led me to capture stuff by the truckload, without ever looking at it again. I now enter items sparingly. At that level, the post-processing is trivial, and it also helps me review the actual material. My take is that if I don’t have the time to do at least minimal editing of the materials, they are probably not important enough for me to store.
What is important to me is that the actual capture step is quick and painless, because often something comes up in the middle of another task. The capture has to go into the inbox quickly and reliably. Then, ideally every day, in a review session, these captured items are sorted into groups (keywords) and further refined (cropping of web clippings etc.). This separation is critical: intensive processing on the spot is no good, as it interrupts the workflow. But if the inbox is not cleaned up daily, I get a huge backlog, and it is a nightmare to catch up on 200 or more inbox items.
I am curious how successful Evernote is with “figuring out what is useful to you in a website and what isn’t”. A lot of script-driven sites are hard to handle, and I figure that you still have to work quite a bit to ensure that you captured what you want. That’s why I like the “capture to pdf” method. Generally (not always), everything is captured correctly. Removing junk by cropping in a review session is far better than realizing (or not) that something is missing. The pdf cropping is only about cosmetics, not information. It can always be done later, or omitted if time is tight.
And ultimately, the captured pdf serves only as my safety net in case the website vanishes, and as a way to include that page in my local DT search. Many webpages get updated incrementally, and in the end I often don’t look at the captured version, but at the up-to-date live webpage.