When you Print to DEVONthink the contents of a Web page, you are creating a PDF version of the Web page – with all its text and images – in DEVONthink. Because HTML Web pages are not broken into pages until they are printed, there’s no way to predict how many pages the PDF representation of the Web page will require (unless, of course, the Web page is very short). So it’s not surprising that one Web page can become a 17-page PDF file.
DEVONthink is not a PDF editor. To remove pages from your PDF file, you would need to edit it using the full version of Adobe Acrobat, re-save it, delete the version already imported into DEVONthink, and re-import the edited file into DEVONthink. That takes time and effort.
Like you, I don’t want to capture unwanted, extraneous material (such as ads) into my database. Also, many Web pages contain hyperlinks that may be useful, and I want to retain those.
From Safari, if I want to capture only text information from a “messy” Web page that’s full of ads and other unwanted images (and that doesn’t have useful hyperlinks), I can select the whole page or any portion of it, then use Services > DEVONthink > Take plain note. If I do want to retain some images and/or hyperlinks, I would use Services > DEVONthink > Capture rich note.
From DEVONthink’s internal Web browser, I would select all or a portion of the Web page, then use the contextual menu option to Capture Note.
DEVONthink is very good at editing plain text or rich text documents. I can open documents created as above and delete unwanted text and images. Then my database contains only the information I want to keep.
Hint: Many Web pages, such as news sites, offer an option to view a Print version of the page. That version usually doesn’t contain lots of ad images.
I got it. I’m using the Services > DEVONthink > Take plain note and Services > DEVONthink > Capture rich note commands with abandon now, and it’s great.
Haven’t used the DT internal browser yet, but that’s next. Sounds like it’s easier than the rigamarole of going to Print on a web page and PDFing out of that. Fewer strokes.
BTW, I know I’m referring to a response you gave me in the usage section, but I use Jeter because I have a phenomenal “date memory.” I can remember when I did something, or read something, but not where I stored/parked information about it.
Again, I am indebted for your thoughtful response.