I’m fairly new to DEVONthink, so forgive me if this has an obvious answer.
I’d like to import a webpage into the View/Edit pane, but I’m not sure how to do it. I drag the icon for the webpage into the View/Edit pane, but all that shows up is the icon itself, not the page. How do I make the actual page show up there?
When you drag a URL into DT, you create a “bookmark” of the web page, which contains zero bytes of text. Selecting that bookmark will open the web page in the DT browser window.
That can be a very useful feature. I’ve created a Bookmarks group, with a number of subgroups such as Scientific Journals, EPA, News, etc. That allows me to build a bookmarks collection of the web sites that I routinely visit.
Now, how to download information from the Internet using these bookmarks?
With a web page open, DT provides contextual menu options to perform captures as Note Captures of selected text, as HTML source (without images) or as a Web Archive (including images viewable offline). Alternatively, you may use (in Safari, DEVONagent or DT/DT Pro) Services > DEVONthink or DEVONthink Pro > Take Rich Note for capturing selected text/images – the keyboard shortcut is “Command-)”.
Note: I do 99+% of information captures as Note Captures, selecting only the text and images that I want to include in my database. Many – if not most – web pages contain extraneous material such as advertisements that I don’t want to download. Many sites, such as the New York Times online site, offer a printable version that eliminates unwanted material.
Note: DEVONthink Pro also provides scripts for various modes of capture of pages from Safari. There’s also a script for ‘printing’ any printable document – including web pages from any browser – as a PDF import to the database. Unfortunately, PDF captures don’t retain hyperlinks in web pages.
If by “icon” you mean the URL, you have created a bookmark in the database. Display the Info panel for that selected item (Shift-Command-I). You will see that a bookmark contains no text, and is not searchable for content. To import searchable text, you would have to capture it using one of the procedures in my first post above.
If, however, you mean by “icon” a local HTML file in the Finder (an HTML page saved on your HD), DT Pro will import the text content of that file.
Tip: If you want to drag web page URLs into your database (creating bookmarks), you can use DT Pro’s floating Groups panel. Here’s how:
[1] In DT Pro’s Preferences > General, UNCHECK “Hide ‘Groups’ panel when inactive.”
[2] Move the Groups panel to the right side of your screen, then minimize it to the Dock (click on the yellow button at top left).
[3] While viewing a web page in any browser, click on the icon of the Groups panel to maximize it, then drag the URL (address field) into the desired location on the Groups panel. When finished, minimize the Groups panel back to the Dock so that it will be ready for future use.
Note: The Groups panel is also convenient for copying selected text/images from any application into your database as a text clipping. However, this doesn’t capture metadata (such as the URL of a web page) along with the clipping. You can also import files from the Finder (e.g., a PDF file or Word file) into a desired location in your database in this way. The file(s) will be imported according to your preferences for that file type.
I like to keep my Dock hidden. Instead, I expand the Groups panel to the full depth of my screen and use the green button to do whatever the green button does, moving that smaller window so that only the title bar is visible at the bottom of my screen. Clicking the green button again pops it up.
I tested this by selecting the above paragraph in OmniWeb and dragging it to the Groups folder, drilling down at least two levels to my devonTHINK Tips folder. In this case, at least, the URL was captured.
I have found, though, that on occasion using Take Rich Note from the Services menu will not capture the URL. Not sure if there’s a pattern there or not; maybe I’m just closing the web window too quickly after hitting Command-).
I’ve got Dock preferences set so that the Dock is hidden until I flick the cursor to the bottom of the screen.
In OmniWeb 5.1.3 the Services options for DT Pro only allow plain text captures. So OmniWeb doesn’t appear to be fully compliant with Cocoa/Services standards.
Is a puzzlement!
I use OW 5.1.3 with DT Pro 1.0.2 on OS X (10.3.9) all day. I can access all the Service options for DT Pro, including both Take Rich Note and Append Rich Note using key commands where it’s offered.
Is there a way to “Capture Web Archive” for a whole folder of bookmarks, and have the resulting archives remain in the same folder?
I can only figure out how to do the capture on a page-by-page (not batch) basis… and the resulting archive winds up somewhere else entirely (in my Incoming folder).
Seems like an obvious thing to script but (1) the existing scripts don’t seem to cover it, and (2) my AppleScript talents are clearly not up to the task…
Here’s a script converting all selected bookmarks to web archives (storing them in the same parent group):
-- Convert links to archives
tell application "DEVONthink Pro"
    set theSelection to the selection
    if theSelection is not {} then
        try
            activate
            show progress indicator "Downloading..." steps (count of theSelection)
            repeat with theRecord in theSelection
                -- Only bookmarks (links) are converted; other records are skipped
                if type of theRecord is link then
                    set theName to name of theRecord
                    set theURL to URL of theRecord
                    step progress indicator theName
                    set theData to download web archive from theURL
                    -- Store the archive in the bookmark's own parent group, if it has one
                    if exists parent 1 of theRecord then
                        set theGroup to parent 1 of theRecord
                    else
                        set theGroup to missing value
                    end if
                    set theArchive to create record with {name:theName, URL:theURL, type:html} in theGroup
                    set data of theArchive to theData
                else
                    step progress indicator
                end if
            end repeat
        end try
        hide progress indicator
    end if
end tell
Um… I hate to look gift horses in the mouth, etc, but it promptly crashed DTPro when I ran it. Let me know if you’d like the crash report… by email or here. Thanks!
What I think would be even better is a preference to index the text content of any URL you drag into DTP in the first place. Although I suppose if you wanted that, you’d just import the web page as a rich note, so perhaps it isn’t necessary after all.
What I find myself really wanting is some sort of “meta-” web archive function, for cases where several pages are linked from a single page and you want them all in one indexed archive or local cache.
Up at the top there are 6 pages of a bio entry on this guy. It would be heaven if somehow DTP could know to archive / cache all 6 of those pages from the bookmark. No idea how that might be done but…
Right now it’s rather laborious. I create a group called “Surya Paloh” and web archive each page and then drag each of 6 web archive pages to that group.
If for example you could draw a drag box around a group of links in DTP!
For that matter I notice it is not possible to drag the URL icon of a bookmark in DTP to a group, as you could from Safari.
Actually it’s not that difficult, just have a look at this script:
tell application "DEVONthink Pro"
    set theSelection to the selection
    if theSelection is not {} then
        try
            activate
            show progress indicator "Downloading..." steps -1
            repeat with theRecord in theSelection
                -- Only bookmarks (links) are processed
                if type of theRecord is link then
                    set theName to name of theRecord
                    set theURL to URL of theRecord
                    step progress indicator theName
                    -- Capture the bookmarked page itself into the bookmark's parent group
                    set theData to download web archive from theURL
                    if exists parent 1 of theRecord then
                        set theGroup to parent 1 of theRecord
                    else
                        set theGroup to missing value
                    end if
                    set theArchive to create record with {name:theName, URL:theURL, type:html} in theGroup
                    set data of theArchive to theData
                    -- Collect links containing 1, 2, 3, ...; stop at the first number with no match
                    set theLinks to {}
                    set theHTML to source of theArchive
                    repeat with i from 1 to 99
                        set theFoundLinks to get links of theHTML base URL theURL containing (i as string)
                        if theFoundLinks is {} then exit repeat
                        set theLinks to theLinks & theFoundLinks
                    end repeat
                    -- Capture each linked page that is not already in the database
                    repeat with theLink in theLinks
                        if not (exists record with URL theLink) then
                            step progress indicator theLink
                            set theData to download web archive from theLink
                            set theArchive to create record with {name:"", URL:theLink, type:html} in theGroup
                            set data of theArchive to theData
                            -- Name the archive after the page title, falling back to its URL
                            try
                                set theHTML to source of theArchive
                                set theName to get title of theHTML
                            on error
                                set theName to theLink
                            end try
                            set name of theArchive to theName
                        end if
                    end repeat
                end if
            end repeat
        end try
        hide progress indicator
    end if
end tell
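This script captures the selected bookmark and all additional numbered pages: it collects the links on the captured page that contain 1, 2, 3 and so on, stopping at the first number with no matching link. Unfortunately, because of a bug in DT Pro, it requires the upcoming DT Pro 1.1.2beta5; with earlier versions it will capture wrong pages too.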
If I try to convert a bunch of selected links (all the links are live), DT Pro crashes when trying to capture this particular link: reidreviews.com/reidreviews/
If I try to capture them one by one (using the script, not the built-in function) the capture works with no crash, including the link above.