Today I was working with my database after moving it from my iMac to my iBook, and I realised that none of my web captures keep the images inside the HTML document. When I have an internet connection the images load without problems, but no copy of them is kept inside my database. How could I achieve this?
That’s the nature of HTML page captures. Only the text and the hyperlinks to images are captured.
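To see which images in a captured page are only hyperlinks to a web server, one can scan the saved HTML for image tags. The following is a minimal sketch in Python using only the standard library; the names ImageLinkCollector and remote_images are made up for this illustration and are not part of DEVONthink. It only checks for http and https links, so other forms of remote reference would be missed.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageLinkCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def remote_images(html_text):
    """Return image links that use http or https, i.e. that need a connection."""
    collector = ImageLinkCollector()
    collector.feed(html_text)
    return [src for src in collector.sources
            if urlparse(src).scheme in ("http", "https")]
```

Any link this returns is an image that lives on the remote server, not in the captured document, which is why it disappears when you are offline.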
The most efficient way to capture text and the images you want would be to do rich text captures (also known as note captures) of selected material. I say most efficient because one can choose not to capture unwanted images, or can easily edit them out after capture. Generally, I do at least 90% of my data captures from the Web as rich text captures.
Another way, which captures all images, would be to save a page as a Web Archive and import it into DT Pro or, if using DEVONagent or the browser in DT Pro, use the contextual menu option to import the page as a Web Archive. Note that Web Archives are Macintosh only; they cannot be read by Windows users.
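If you want to confirm that a Web Archive really contains the images, the file can be inspected outside DT Pro. The sketch below is only an illustration and rests on an assumption: that the archive is a property list with a "WebMainResource" entry and a "WebSubresources" list, each resource carrying "WebResourceURL", "WebResourceMIMEType" and "WebResourceData" keys, as is commonly seen in Safari-style archives. The function name list_webarchive_resources is hypothetical.

```python
import plistlib


def list_webarchive_resources(path):
    """Return (url, mime_type, size_in_bytes) for each resource in a .webarchive.

    Assumes the property-list layout described above; frames nested inside
    the page are not examined by this sketch.
    """
    with open(path, "rb") as handle:
        archive = plistlib.load(handle)  # handles both binary and XML plists

    resources = [archive["WebMainResource"]] + archive.get("WebSubresources", [])
    return [
        (res.get("WebResourceURL"),
         res.get("WebResourceMIMEType"),
         len(res.get("WebResourceData", b"")))
        for res in resources
    ]
```

Entries with an image MIME type and a non-zero size would indicate that the image data is stored in the archive itself and does not depend on a connection.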
Still another way is to capture a Web page as a PDF document. DT Pro installs a script at YourBootVolumeName/Library/PDF Services/ named Save to DEVONthink Pro.scpt. To use it while viewing a Web page, select File > Print. When the Print panel appears, press the PDF button, then select Save to DEVONthink Pro.scpt and choose the location for the import. Note: Hyperlinks on the Web page will not work in the PDF capture.
I saw the current thread just as I was about to write and ask about importing web sites (File-Import Site…). There are a couple of sites I want to import as a web archive. The problems I have are several: sometimes all that’s imported is the index.html file; sometimes I get lots of files but things still seem to require being connected to the web to view them (I’m not talking about going to other sites which are linked because I certainly don’t want to suck all those, too). I really want the sites in an archive in my DTPro database and not need to connect to the web.
I read the following in the DT Help file:
Import Site…: Opens the “Download Manager” and downloads a complete web page/site for archiving and offline viewing. Make sure the download options are set correctly, especially the options that define which links DEVONthink Professional should follow (if any). All links within the site are modified so that they point to the downloaded images or other embedded objects. This ensures that the page/site can be displayed at any time.
I don’t understand what “options that define which links DTP should follow” refers to. I don’t see anything about this in Preferences->Import, and I don’t see anything in the dialogue box that pops up when you choose File->Import Site…
There’s a lot of data I need to get into DT for a big research project I’m working on, and if I could suck these sites successfully so that I can work the site while not online, it would help me tremendously.
Any help understanding Import Site… would be greatly appreciated.
So, in Safari, I select what I want and then use the script “Add selection …” or “Add text to DevonThink”? Is this pretty much the equivalent of Add page to DT, except that in your method, the images themselves can actually be captured?
This looks like a wonderful idea. I had not seen that script, and I immediately ran a test, first with Safari, then with OmniWeb.
However, in both cases it didn’t work: it seems to do the printing job as usual (without asking me for the location, by the way), but then suddenly I get a “Printing error” message, and my browser window remains frozen. Nothing can be moved, the sidebar stays greyed out instead of turning blue, and the only way out is to quit and relaunch the browser.
Do you have any idea how I could solve that problem and make the script work without the printing error? It would really be nice to be able to convert webpages into PDF like that!
Yes. Locate the “Save to DEVONthink Pro.scpt” file in /Users/YourUserName/Library/PDF Services/. Double-click on the script to open it in Script Editor. Press Command-S to save it back to the same location. Now it will work.
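Since this thread mentions both a boot-volume and a per-user PDF Services folder, it can help to check where the script actually sits before opening it. Here is a small sketch in Python; the function name find_pdf_service and the list PDF_SERVICE_FOLDERS are invented for this example, and the two folder paths are simply the ones named in this thread.

```python
from pathlib import Path

# The two folder locations mentioned in this thread: boot volume and per-user.
PDF_SERVICE_FOLDERS = ["/Library/PDF Services", "~/Library/PDF Services"]


def find_pdf_service(script_name, folders):
    """Return the folders (as Path objects) that contain script_name."""
    found = []
    for folder in folders:
        folder_path = Path(folder).expanduser()  # expand "~" to the home folder
        if (folder_path / script_name).exists():
            found.append(folder_path)
    return found
```

Calling find_pdf_service("Save to DEVONthink Pro.scpt", PDF_SERVICE_FOLDERS) returns the folders where the script was found; an empty list means it is in neither place.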
Note: I often do rich text captures of selected text/images to retain working hyperlinks and also to avoid obtrusive advertising. The browsers in DEVONagent and in DT Pro offer convenient contextual menu options for choosing that option or to capture as HTML or as a Web Archive.
Thank you, Bill! I have done as instructed, and it works! So easy to do… but I had not figured out how to do it. I must say that I am a relatively recent “convert” to Mac: I was on PC until 2004 and then switched to Mac, so scripts remain a somewhat mysterious world for me.
Thank you also for your advice about Rich Text Capture. Obviously, there may be different capturing strategies depending upon the context.