clipping; suggest revision to Instapaper option

I suggest that the Instapaper option in the Clip to DEVONthink extension be revised in two ways for PDF output:
a) add a configuration preference to exclude the link to the Instapaper website (in the top line of the PDF)
b) improve recognition of which parts of the webpage should be included
If the bookmarklet consists of just a JavaScript file and a CSS file, is this something a mere user might be able to do?

===== background

To get a static and nearly complete record of a webpage, I have chosen PDF as my preferred format for import into DTPo (DEVONthink Pro Office 2.3.5). The content of the output varies with the procedure, but seems essentially independent of the browser: Safari (5.1.7), DEVONagent Pro (3.2), Chrome (20.0.1132.57), Firefox (13.0.1).

To illustrate the variety of outputs, I will summarize some experiments involving “Desirable Difficulties in the Classroom” by Jeff Bye (psychologyinaction.org/2011/ … classroom/). This online article has a bulleted list, two links, and a bibliography; the webpage includes four comments. [Note: the following summary is a bit tedious, but Bye’s article is not; I recommend it.]

  1. article only + Instapaper link at top + 3 icons at end 98 KB PDF+Text
    (any browser) : Clip extension : PDF (1 page) : Instapaper

  2. article + bibliography 78 KB PDF+Text
    Safari : Reader : Print : PDF (1 page) : Save PDF to DEVONthink Pro

  3. article + bibliography + comments + remainder of webpage 378 KB PDF+Text
    (any browser) : Clip extension : PDF (1 page) [Instapaper not selected]

  4. webpage with minimal formatting (disable background so that text is not obscured)
    Safari : Print : PDF : Save PDF to DEVONthink Pro 225 KB PDF+Text
    DEVONagent : Print : PDF : Save PDF to DEVONthink Pro 200 KB PDF+Text
    Firefox : Print : PDF : Save PDF to DEVONthink Pro 183 KB PDF+Text
    Chrome : Print : Destination = Save as PDF : Print using system dialog … : Save PDF to DEVONthink Pro 538 KB PDF (without “+Text”)

  5. [selecting text of page, then invoking a Service can yield Text, RTF, or RTFD]
    select title and first several lines, scroll to bottom, shift-click at end to select all
    DTPo : Take plain note 16 KB Text
    DTPo : Take rich note 38 KB RTFD

    if only article and bibliography are selected (omitting icons & images is probably key)
    DTPo : Take rich note 13 KB RTF
    [Note: 3 comments are significant and I would not omit them.]

===== my preference(s)

The RTF and RTFD versions preserve the bulleted list and the two links, but paragraphs end with an unjustified line and no other visual cue. My first editing effort, to demarcate paragraphs with an extra blank line and remove icons, took only 2 minutes, but I overlooked 3 paragraph transitions. Storage is cheap and my time is not, so my current choice among these alternatives is (3): whole webpage, near replica of the display in the browser window.

For the page found by the second link in Bye’s article, procedure (1) suffices [even though I have no intention of ever using the Instapaper website]. Had there been no meaningful comments, then I would choose procedure (2) [if Safari:Reader can recognize the bibliography, why can’t Instapaper?]. For me, the only unambiguous decision is that procedure (4) is always worse than something simpler. [If only a fragment is to be clipped, then procedure (5) seems good.]

The Instapaper clipping option relies on the JavaScript in the “Instapaper Text” bookmarklet provided by Instapaper. (See Instapaper’s “Extras” page.) Since the program logic that enables the layout that DEVONthink produces is executed from Instapaper’s site, I’d suggest writing to Marco (Instapaper’s developer) to see if the customizations mentioned in @mtRBL’s post are possible by modifying the JavaScript or otherwise.

An alternative (workaround) is to use the “Instapaper Text” bookmarklet in your browser, then use Printliminator to modify the look and feel of that output, and save the result as a PDF to DEVONthink. This is of course more steps than desired, but at least it is a way to get the job done that’s already available.
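
For anyone comfortable with a little scripting, the "remove the Instapaper link" part of that cleanup could in principle also be done on a saved HTML copy of the page before it is turned into a PDF. What follows is only a rough sketch in Python (standard library only), not anything DEVONthink or Instapaper provides. It assumes the unwanted link is an ordinary <a> element whose href contains "instapaper.com"; the actual markup has not been checked here, and the names LinkStripper and strip_links are made up for illustration.

```python
from html.parser import HTMLParser


class LinkStripper(HTMLParser):
    """Re-emit HTML, dropping <a> elements whose href contains blocked_host.

    Hypothetical helper for illustration; the real Instapaper markup may differ.
    """

    def __init__(self, blocked_host):
        # Keep entities such as &amp; unconverted so they are re-emitted verbatim.
        super().__init__(convert_charrefs=False)
        self.blocked_host = blocked_host
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped <a> element

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag == "a":
                self.skip_depth += 1  # track nesting so we stop at the right </a>
            return
        href = dict(attrs).get("href") or ""
        if tag == "a" and self.blocked_host in href:
            self.skip_depth = 1  # start dropping this link and its contents
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        if not self.skip_depth:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag == "a":
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        if not self.skip_depth:
            self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_links(html, blocked_host="instapaper.com"):
    """Return html with every link to blocked_host removed."""
    parser = LinkStripper(blocked_host)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling strip_links(text) on the contents of a saved .html file returns the same markup minus any links to instapaper.com; other links and the article text pass through untouched. This does nothing for point (b), recognizing which parts of the page belong to the article, which still depends on Instapaper's own logic.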