Importing bookmarks from an HTML page

Hi, I want to import a few bookmarks into a DTPO database (they are exported by the Pins app as links on an HTML page) and I have a few questions about the procedure:

  1. The important question: what is the most efficient way to get the job done? The number of links is not huge, so my plan was to import the HTML page into DTPO, open each link in a separate tab in the “three panes” preview (⌘-clicking them) and then capture all tabs as PDF in a single step. The problem is that I seem to have to capture each tab one after another. Is it possible to capture multiple tabs at once, and if so, how?
    An even better solution would be to select some links in the original HTML page (one link per row) and then run a script that creates a PDF for each link. Has anybody developed something along these lines?

  2. I want to view the destination group(s) in “as Icon” mode to get a quick visual cue about the site content, but it seems that Bookmarks don’t have a thumbnail in “as Icon” view (see screenshot). Is that correct? It is not really a showstopper, since I can save them as PDF (as in the screenshot), but saving some space would be a plus.

  3. Related to the previous point, but valid as a general remark on thumbnails in “as Icon” view: a PDF thumbnail displays a whole page of the PDF, so if the file is an unpaginated web page the thumbnail is a narrow strip that is practically useless (see screenshot). As with point 2 above, I can save as paginated PDF, but in general it would be nice to have thumbnails of these long-page PDFs created in a sort of “scale to page width” mode.
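
To make the “scale to page width” idea in point 3 concrete, here is a minimal Python sketch of the arithmetic only. It is not how DEVONthink builds thumbnails; the function name `top_crop_box` and the `thumb_aspect` parameter are made up for illustration. It keeps the full page width and shows only as much of the top of the page as fits the thumbnail's shape.

```python
def top_crop_box(page_width, page_height, thumb_aspect=1.0):
    """Return (left, top, right, bottom) of the page region to show.

    Coordinates assume a top-left origin on the rendered page image.
    thumb_aspect is the thumbnail's width / height (hypothetical
    parameter; 1.0 means a square thumbnail).
    """
    if page_width <= 0 or page_height <= 0 or thumb_aspect <= 0:
        raise ValueError("dimensions and aspect ratio must be positive")
    # Keep the full width; take only as much height as the thumbnail can show.
    crop_height = min(page_height, page_width / thumb_aspect)
    return (0, 0, page_width, crop_height)
```

For a captured web page 800 units wide and 8000 tall, a square thumbnail would show the top 800 x 800 region instead of a strip one tenth as wide as it is tall.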

Thanks in advance for any help!


1 Like

Are you interested in capturing Bookmarks from an HTML page, or capturing the HTML Pages that those bookmarks open? Your first few sentences can be read either way.

Have you looked at DEVONthink’s contextual menu when an HTML page is previewed in the application? You’ll find several options there, including adding links to the Download queue so their destination pages can all be captured in one process.

1 Like

Hi Korm, thanks for the reply and sorry for the confusion.
I want to capture a Bookmark of each destination page or (preferably) a paginated PDF of the page (the reason for the preference is point 2 of my original post).
I cannot attach a sample file, but here’s how it looks in Three Panes view:

screenshot_69.png
Now, as for your suggestions:

  1. Adding to the Download Manager queue: you have to do it link by link; some of them will not download (I didn’t investigate whether it’s a broken link or something else); and, most importantly, each page is downloaded as HTML nested in a folder structure (the hierarchy on the server), so collecting them is a lot of work.

  2. Capture link: works fine. The only problem is that you have to do it for a single link at a time.

I was hoping for a single step procedure to capture all linked pages.
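
As a possible workaround for the nested folders from the Download Manager, something like the following Python sketch could collect the pages into one flat folder. The function name `flatten_html` and both paths are hypothetical, and the destination folder is assumed to live outside the download folder. The server path is encoded into each file name so that several `index.html` files do not overwrite each other.

```python
import shutil
from pathlib import Path


def flatten_html(download_root, dest_dir):
    """Copy every .html/.htm file under download_root into dest_dir (flat).

    dest_dir is assumed to be outside download_root. Returns the list of
    copied file paths.
    """
    root = Path(download_root)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(root.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in {".html", ".htm"}:
            continue
        # Join the relative path parts so the server hierarchy stays visible
        # in the flat file name, e.g. example.com_a_index.html.
        flat_name = "_".join(src.relative_to(root).parts)
        target = dest / flat_name
        counter = 1
        # If a file with that name already exists, add a numeric suffix.
        while target.exists():
            target = dest / f"{Path(flat_name).stem}_{counter}{src.suffix}"
            counter += 1
        shutil.copy2(src, target)
        copied.append(target)
    return copied
```

The flat folder could then be imported into the database in one go. Images and stylesheets are deliberately left behind, so pages that reference them with relative paths may not render as they did on the server.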

1 Like

So, you could write a script to extract all the links and put them into an RTF. There are examples on MacScripter.net. Or, there are Chrome and Firefox extensions that will do this in an instant.

E.g., Link Klipper on Chrome produced all the links in the page we’re both reading in a second. Open the page from DEVONthink in Chrome and run Link Klipper, saving to txt. Boom. Done.
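
For the scripting route, here is a rough sketch in Python rather than the AppleScript you would find on MacScripter.net, using only the standard library. It assumes the exported page holds ordinary `<a href="...">` links; the names `LinkCollector` and `extract_links` are made up for this example, and it returns plain URLs rather than writing an RTF.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against base_url when one is given.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html_text, base_url=""):
    """Return the links in html_text, in page order, without duplicates."""
    parser = LinkCollector(base_url)
    parser.feed(html_text)
    parser.close()
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(parser.links))
```

Writing the result to a text file with one URL per line, for example with `"\n".join(extract_links(page_text))`, gives roughly the same kind of list as the extension's txt export.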

1 Like

Thanks for the tip on Link Klipper!
Reading your post made me realise that if we have to rely on a browser to extract objects from an HTML page, we already have a powerful tool: DEVONagent Pro!

So, I opened the page in DA, selected all links in the “All Links” tab of the “Inspector Bar” and then ran “Add Link to DEVONthink”. Voilà, all links saved as Bookmarks in a selectable group.
If you want, you can select “Download”, which will download all destination pages as HTML.

One last small detail: as I wrote in point 2 of my original post, Bookmarks (and HTML files as well) don’t show a thumbnail in “as Icon” view.
I can easily batch convert to unpaginated PDF, but this kind of file shows the useless narrow-strip thumbnail (see point 3 in the OP).
I also tried converting to paginated PDF; the thumbnails are big enough to be recognizable, but unfortunately they do not reproduce the website’s appearance accurately enough.

It seems that this thread will end up raising a feature request for the DEVONtech guys: is it possible to have thumbnails for Bookmarks and HTML files, and more readable ones for unpaginated PDFs?

Thanks.

2 Likes

I get great thumbnails from HTML and bookmarks.

These are my settings in Preferences > Media > Create Thumbnails

1 Like

Strange; same settings here.

screenshot_70.png
After a restart the situation was better. I only had a few HTML files without thumbnails (see the last item on the right).
screenshot_71.png
I created them manually and now everything seems fine.

Now I can import all my bookmarks without problems!
Thanks again for your help Korm.

P.S. The request for a more legible thumbnail for long unpaginated PDFs still stands.

1 Like