REQ: sane titles for Capture Web Archive (Services menu)

(originally posted as a reply)

Is there any way v2’s Capture Web Archive service (huge thanks for that, btw!) could use the page title instead of the last component (after the final ‘/’) of the URL? That would make it far more useful to me than renaming most captures by hand, which is tedious and often impractical, since the captured content frequently doesn’t even contain selectable text suitable for Set Title As.

A title like viewtopic.php?p=32354#p32354 for a captured forum post just doesn’t cut it. :slight_smile:

Thanks.

Not sure if it is a similar issue, but whenever I drag a URL to a group it lists the URL as the Name instead of the title. Perhaps OS X doesn’t provide the title to DEVONthink when copying/dragging. Is there a way to get a URL with its title into DEVONthink?

I have experienced the same issue.
Usually, “cmd+)” (RTF archiving) or “cmd+(” (plain text archiving) uses the first line of the selection as the title.
However, “cmd+%” (web archiving) gives a weird title that seems to be part of the web address.

I guess the developer assumed that, for web archiving, the HTML filename would make a meaningful title. But I doubt it does, since most news sites nowadays, and a great many bulletin boards, use JavaScript and show some kind of query on the HTML file in the address, just like this forum.

Hmm, maybe “cmd+%” is designed to take the title from the HTML address, I think. “cmd+)” does the same thing as “cmd+%”; the only difference is that it converts the HTML to RTF and takes the first line as the title.

In the meantime, somewhere there’s a script from the old days to rename the item’s name to the web page’s title, under the ‘Rename’ Script dir. I don’t remember if this is ‘stock’, or if I found it somewhere on the forum. If you want it and can’t find it I guess it would be alright for me to paste it in here.

It’s the last component (after the final ‘/’) of the URL, as described in my original post.
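To make that concrete, here is a minimal Python sketch of how such a title comes about, assuming the name is simply whatever follows the final ‘/’. The function name and the example URL are made up for illustration; this is not DEVONthink’s actual code.

```python
def title_from_url(url: str) -> str:
    """Return everything after the final '/' of the URL (hypothetical helper)."""
    # rsplit with maxsplit=1 splits only at the last '/', so the query
    # string and fragment stay attached to the last path component.
    return url.rsplit("/", 1)[-1]


# Hypothetical forum URL, used only for illustration.
print(title_from_url("http://www.example.com/forum/viewtopic.php?p=32354#p32354"))
# prints: viewtopic.php?p=32354#p32354
```

With a forum URL like that, the query string ends up as the document name, which is exactly the kind of title I’m complaining about.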

Problem is there’s no title in the captured content or service clipboard to use.

Christian informed me that Capture Web Archive in the next beta will use the first line for untitled archives, like the Take Plain/Rich Text services. That’s a much better default title for CWA than what’s currently used, and consistent with the other services, so it satisfies my request.
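For anyone curious what a “first line” default might look like, here is a rough Python sketch. It assumes blank lines are skipped and surrounding whitespace is trimmed, neither of which I know to be true of the actual services; the function name and the “Untitled” fallback are my own inventions.

```python
def title_from_first_line(text: str, fallback: str = "Untitled") -> str:
    """Use the first non-empty line of the captured text as the title.

    Hypothetical helper: skipping blank lines and trimming whitespace
    are assumptions, not documented DEVONthink behaviour.
    """
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            return stripped
    # Nothing but whitespace was captured, so fall back to a placeholder.
    return fallback
```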

The Rename > To Web Page Title script isn’t a workaround for this problem. I’ve suggested a more sophisticated version that could retrieve the title from the original site (when possible) and use it for the document title.
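Roughly what I have in mind, sketched in Python rather than AppleScript and using only the standard library. All the names here (`TitleParser`, `extract_title`, `fetch_title`) are hypothetical, and this is only an illustration of the idea under those assumptions, not a proposal for how the script should actually be implemented.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self.parts.append(data)


def extract_title(html):
    """Return the page title with whitespace collapsed, or None if absent."""
    parser = TitleParser()
    parser.feed(html)
    title = " ".join("".join(parser.parts).split())
    return title or None


def fetch_title(url, timeout=10):
    """Fetch the original page and return its title, or None when not possible."""
    try:
        with urlopen(url, timeout=timeout) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            html = response.read().decode(charset, errors="replace")
    except (OSError, ValueError):
        # Site unreachable, bad URL, and so on: the caller keeps the old name.
        return None
    return extract_title(html)
```

When `fetch_title` returns None (site offline, no title element), the document would simply keep whatever name it already has, which covers the “when possible” part.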