Webarchives and icons

I’ve just started using DEVONthink more regularly. I have used it a bit before so I have a few items in my database from that earlier trial usage and now I have more items from my ‘new’ usage. I’m still trying to decide which view I find most comfortable so you can see I’m very new at this.

Perhaps I’m missing a preference somewhere or just not ‘getting it.’ But my old web archives (perhaps created with OS X 10.3.x vs my currently installed OS 10.4.x) have this nice icon showing a preview of the archive. All my new archives have this awful ugly default type icon showing a picture the same as the Safari icon and the word “ARCHIVE” under it. I know when I started using DEVONthink again recently it downloaded and installed a newer version than what I had on my computer.

To summarize:

  1. Older (perhaps 5 months old) version of DEVONthink, perhaps OS 10.3.x (but I think most likely OS 10.4.x), webarchives have preview type icons. :slight_smile:
  2. Newer (current) version of DEVONthink, OS 10.4.x, webarchives have default type “ARCHIVE” icon :frowning:

What am I doing wrong? I even tried to capture the archives with Safari vs with Firefox 3 and they are still the same. I even just tried using DEVONthink by capturing the bookmark, opening it in the DEVONthink browser and then right mouse (I guess that’s control click) and selecting “Capture Web Archive.”

TIA
Brian

In Preferences > QuickTime check the option to create thumbnails for images.

As to view options, most users prefer the Three Panes or the Vertical Split view. These provide maximum information about organization, groups and documents, especially for large databases. Note that additional sortable columns can be added by choosing View > Columns.

Thumbnails for web archives are only created if…

  • you capture a web archive within DEVONthink
  • add a web archive from DEVONagent

They’re not created by bookmarklets or scripts for example.

OK, thanks for the help but…

I already have the QT preferences turned on to make thumbnails, both checkboxes (movies and images).

And I tried to make a web archive from within DEVONthink. And I still don’t have an icon :frowning:

If for some reason it matters, here is the one URL I’m using as my testing for this “why doesn’t it work” thread. Figured I’d be better with just 1 instead of trying many different ones.

http://www.jasonmadigan.com/2007/11/building-installing-your-own-osx86-leopard-installation/

And here is a URL for a page that was already in DEVONthink from before and has an icon. If I make a web archive within DEVONthink with this page I do get an icon, but none of the new pages I tried make them :frowning:

http://eeepc-osx.wikispaces.com/Tiger_Install

I’ve captured both pages successfully inside DEVONthink. However, the icon of the first page is almost invisible as the page is very long and more or less white.
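To show why a very long page ends up with an icon that is hard to see, here is a rough sketch in Python. It assumes the thumbnail is scaled to fit a square box while keeping the page's proportions; how DEVONthink actually builds its thumbnails is not documented in this thread, and the function name, the 128-pixel box and the page sizes are all made up for illustration.

```python
def fit_thumbnail(page_width, page_height, box=128):
    """Scale a page to fit inside a box x box square, keeping its aspect ratio.

    Hypothetical helper: the box size and the scaling rule are assumptions,
    not DEVONthink's documented behaviour.
    """
    scale = box / max(page_width, page_height)
    # Never let a side collapse to zero pixels.
    return max(1, round(page_width * scale)), max(1, round(page_height * scale))


# Made-up page sizes in pixels, for illustration only.
print(fit_thumbnail(1000, 1300))   # a short page  -> (98, 128)
print(fit_thumbnail(1000, 20000))  # a very long page -> (6, 128)
```

Under these assumptions a page twenty times taller than it is wide gives a strip only a few pixels across, and if the page is mostly white that strip is easy to miss.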

OK, I have another old URL that does work also:

http://newteevee.com/2007/12/15/six-steps-to-get-more-hd-from-your-scientific-atlanta-set-top-box/

And it is just like what you describe: an icon that’s extremely tall relative to its width, so it’s hard to tell there is even an icon there.

But I still can’t get an icon for the Leopard on OSx86 link.

So in general it seems I’ve got things set up properly, but what am I doing wrong in the case of that one particular URL? I tried a few of the others that didn’t work for me and they still don’t work either.

So there must be something I have in my setup that isn’t in yours, since you say they both work properly for your DEVONthink.

The icon would still have black text though, correct? I’m just making sure that I’m not making a seemingly invisible icon and just not noticing it.

Let me give exactly how I’m attempting to create the archive:

  1. Open URL in Firefox 3.0.1
  2. Use javascript bookmark (from that applet directory on the install disc that I’ve installed in both Safari and Firefox) to create DEVONthink bookmark
  3. Using DEVONthink 1.9.15 I see an icon, “@” and underneath in blue I see the title from the web page (Building & installing your own OSx86 Leopard | Jason Madigan).
  4. Double click that icon to get a browser window within DEVONthink showing the page
  5. Either use the icon/button “Capture the current page” or control click and pick contextual menu item “Capture Web Archive”
  6. Get same result, icon with Safari icon pic and the word “ARCHIVE” and blue text underneath “Building & installing your own OSx86 Leopard | Jason Madigan”

So, on the page(s) that work, I get an icon that is a preview of the page and black text of the title of the page. On the ones that don’t work I get the Safari icon and the text “ARCHIVE” and blue text of the title of the page.

I retried the experiment with Safari instead of Firefox and get the exact same results.

It just worries me that for the ‘broken’ ones I’m not really getting a local archive of the webpage, and if it disappears I’ll be without the page. The reason for my worry is that 2 of the pages I’d archived before, when I was just playing around with the app, are gone. Perhaps they really weren’t archives but links, and the pages no longer exist, so I have no way to get to that data anymore. If they were just links I’d put that down to user error: I was just starting to use the app and I don’t think I quite understood the difference between a link, an archive and a rich text archive, so I could easily have made just a link within DEVONthink instead of an archive. But if they were these ‘broken’ type of archives like I’m getting now then I’m worried.

Could it be that I’m using an older PPC Mac instead of an Intel one? Or OS 10.4.x vs 10.5.x? I have a dual 1.67 GHz G4 here. And unfortunately OS 10.5 isn’t an option: it breaks the one print driver I absolutely have to keep working, the one for my ALPS printer.

The icon of a WebArchive may be of two types, one generic and the other an image of the page, as Christian noted. Both are valid. Select a WebArchive and open its Info panel. If Kind = WebArchive, it’s a WebArchive.
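If you want a check that goes beyond the Info panel, a web archive that has been exported to a .webarchive file can be inspected directly. Safari-style web archives are property lists with a WebMainResource entry and, usually, a WebSubresources list holding the images and stylesheets. The Python sketch below reads those keys with the standard plistlib module. It assumes the file exported from DEVONthink follows the same layout, which I have not verified here, and the function name is just for illustration.

```python
import plistlib
import sys


def summarize_webarchive(path):
    """Return the main URL, main HTML size in bytes, and subresource count.

    Assumes a Safari-style web archive: a property list with a
    'WebMainResource' dict and an optional 'WebSubresources' list.
    """
    with open(path, "rb") as f:
        archive = plistlib.load(f)  # handles both binary and XML plists
    main = archive.get("WebMainResource", {})
    subresources = archive.get("WebSubresources", [])
    return {
        "url": main.get("WebResourceURL"),
        "html_bytes": len(main.get("WebResourceData", b"")),
        "subresources": len(subresources),
    }


if __name__ == "__main__":
    # Usage: python summarize_webarchive.py page1.webarchive page2.webarchive
    for archive_path in sys.argv[1:]:
        print(archive_path, summarize_webarchive(archive_path))
```

A non-zero html_bytes value means the page's HTML is stored inside the file itself, so the content does not depend on the original site staying online, whichever icon the document shows.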

My own preferred method of capturing information from Web pages is selecting the text, images and tables of an article and choosing the rich note Service available in Safari (and other Cocoa applications) – Command-) – which results in capture of a rich text document in the database. I’ve got more than 15,000 such captures in my main database. The advantage is that capture of selected (or printer-friendly) content avoids all sorts of extraneous material that may be on the page, so makes Search and See Also operations more focussed. Tip: select from the bottom of an article upwards, as that is much faster than from the top down.

Firefox cannot capture rich text from Web pages into DEVONthink, because it’s not OS X-Services compliant. I use Firefox once in a while, but never when I want to efficiently capture data into my DT Pro Office database. The exception is that I use Firefox for access to one of my bank sites. I capture records of transactions by “printing” as PDF to my database.

WebArchives and database memory requirements: One of the reasons I rarely capture data in the WebArchive file format is that this increases the amount of memory required to load a DEVONthink 1.x database. In DEVONthink 1.x, text documents (including RTF and RTFD), HTML and WebArchive are stored into the “monolithic” database and are loaded into memory when the database is opened. In those file types, text and images are all taking up memory space. Although my RTF(D) rich note captures also are loaded into memory, they are typically significantly smaller than a WebArchive capture would be. By contrast, PDF, Postscript, image, QuickTime and “unknown” file types are stored in the internal Files folder. So, for a PDF, only the text is loaded into memory, not the images and layout information, when a database is opened. The significance is that the choice of capture mode does influence how much RAM may be needed to open a database, in DEVONthink 1.x.
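As a back-of-the-envelope illustration of that rule, here is a small Python sketch. It is a deliberately simplified model and not a description of DEVONthink's internals: it assumes documents kept in the monolithic database count at their full size, and documents kept in the Files folder count only their extracted text. The type names, function name and byte sizes are all made up.

```python
# File types described above as stored inside the monolithic database (1.x).
IN_DATABASE = {"txt", "rtf", "rtfd", "html", "webarchive"}


def estimated_load_bytes(documents):
    """Estimate bytes loaded when a database opens (simplified model).

    documents: iterable of (kind, file_bytes, text_bytes) tuples.
    In-database kinds count their full size; other kinds (PDF, images,
    QuickTime, ...) count only their extracted text.
    """
    total = 0
    for kind, file_bytes, text_bytes in documents:
        total += file_bytes if kind in IN_DATABASE else text_bytes
    return total


# Hypothetical sizes in bytes for one article captured three different ways.
as_webarchive = [("webarchive", 900_000, 40_000)]
as_rich_note = [("rtfd", 200_000, 40_000)]
as_pdf = [("pdf", 1_500_000, 40_000)]

for label, docs in [("webarchive", as_webarchive),
                    ("rich note", as_rich_note),
                    ("pdf", as_pdf)]:
    print(label, estimated_load_bytes(docs))
```

With these invented numbers the PDF is the largest file on disk but contributes the least to the load, which is the point of the paragraph above.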

DEVONthink 2 will have a different database structure. As all documents will be stored in the Finder, this will generally reduce the memory required to load a database.

OK, I can accept there are 2 types of icons if you say there are, but that kind of begs the question “why?”

But I did go back to all the archives I made and opened them within the DEVONthink browser and chose “Capture Note” which I assume is the same as the contextual menu item “Capture Note.” It took a bit of head scratching until I realized I could also get to the Services menu from the contextual menu and that led me to the “Take Rich Note” option.

Are all 3 of these things the same, producing an RTFD file, whatever that is, that’s smaller than the webarchive file of the same page? Sometimes the formatting isn’t very close to the web page and doesn’t look nearly as good, but all the data including the pictures and links are there, so I think I can live with that.

One thing you say that I find confusing, you talk about “printing” a PDF of your bank transaction page in Firefox “into DEVONthink.” Umm, you mean there is a way to use the print command to make a PDF file that’s only in the DEVONthink database and not located somewhere (else) on my hard drive? (Well actually from what you also said in your post the PDF file is stored in a folder somewhere by DEVONthink and just the text is in the DEVONthink database but I don’t mean this file, I mean having to print a file to say the Desktop and then import it into DEVONthink.)

In DEVONthink Pro and DEVONthink Pro Office (but not DEVONthink Personal) there’s a script, Save to DEVONthink Pro.scpt, that allows one to “print” the viewed page as PDF to the database from any application (except a PDF app like Acrobat). While viewing a document – a Web page in a browser, a Microsoft Word, Excel or PowerPoint document, a Mellel document, etc. – press Command-P to invoke the Print dialog. When the Print panel appears, click on the PDF button and select Save to DEVONthink Pro. Then navigate to the group location into which the new content is to be saved. (DEVONthink Personal users don’t have that convenient script, but can print as PDF to the Finder and import the PDF into their database.)

Re WebArchives: I didn’t mean that you had to convert your existing WebArchives to rich text notes, but of course you can do that if you wish. Depending on the application you are using, the option to capture as a rich text note may be via Services > DEVONthink or a contextual menu option.

Ah OK, that explains why I couldn’t understand what you were talking about: I just have the plain version, not Pro or Pro Office. It’s not that big a deal to print the PDF to the desktop, for example, then import it and delete the original on the desktop.

I didn’t take your post to mean I should convert them. But when I saw a filesize difference between things like 1023 M vs. 220 M it made good sense to me to change from the one style to the other.

After all, these were pages with info, so the exact formatting didn’t really need to be kept, just the info, and I was willing to accept the difference in appearance for the huge size difference. I still like the idea that, for pages I’d want to keep as an archive, I’d see a mini-preview of the web page in DEVONthink, but from what you say it’s hit or miss whether I’ll get a preview of the page or a generic icon for the webarchive.

Thanks for answering all my questions so quickly.