Text in Web Archives isn't indexed

Hi,

I’m using DT Pro 1.0.2 and I’m having a problem with Web Archives: the text in pages captured as Web Archive doesn’t seem to be indexed by DT, so it doesn’t turn up in searches and the Classify button doesn’t offer any options. All this works fine with pages captured as HTML. Is this a bug? How can I make DT index the text in web archives?

Thanks,
Arkaitz

Arkaitz,

How did you create the web archive?

I’m using a script to capture the current page from OmniWeb as a web archive. It’s based on a script that I think you posted in the forums when someone asked for an easier way to create web archives programmatically. I’ve checked: if I do “Capture Web Archive” from a DT browser window, the resulting page is indexed, but not if I import it with the script. I’ll copy the script here. Thanks in advance for any help with this; I’m very new to AppleScript, so I may be doing something stupid.

Arkaitz

tell application "DEVONthink Pro"
	with timeout of 30 seconds
		tell application "OmniWeb"
			set theTab to the active tab of browser 1
			set theURL to the address of theTab
		end tell
		
		set theRecord to create record with {name:"Temporary Link", type:nexus, URL:theURL}
		set theWindow to open window for record theRecord
		
		repeat while loading of theWindow
			delay 1
		end repeat
		
		set theURL to URL of theWindow
		set theSource to source of theWindow
		set theName to get title of theSource
		set theData to web archive of theWindow
		set theGroup to create location "/Incoming"
		set theArchive to create record with {name:theName, type:html, URL:theURL} in theGroup
		set data of theArchive to theData
		
		delete record theRecord -- Closes window
	end timeout
end tell

There’s a bug in DT Pro 1.0.x, so you have to modify the script slightly and pass the page source when creating the record:


set theSource to source of theWindow
set theArchive to create record with {name:theName, type:html, URL:theURL, source:theSource} in theGroup 
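
Put together, the modified script would presumably look something like the sketch below. It only merges the two lines above into the script posted earlier in the thread. It assumes the "set data of theArchive" line stays, so that the record still holds the web archive data, and it has not been checked against any particular DT Pro 1.0.x build.

```applescript
tell application "DEVONthink Pro"
	with timeout of 30 seconds
		-- Get the address of the frontmost OmniWeb tab
		tell application "OmniWeb"
			set theTab to the active tab of browser 1
			set theURL to the address of theTab
		end tell
		
		-- Load the page in a temporary DT browser window
		set theRecord to create record with {name:"Temporary Link", type:nexus, URL:theURL}
		set theWindow to open window for record theRecord
		
		repeat while loading of theWindow
			delay 1
		end repeat
		
		set theURL to URL of theWindow
		set theSource to source of theWindow
		set theName to get title of theSource
		set theData to web archive of theWindow
		set theGroup to create location "/Incoming"
		
		-- Workaround for DT Pro 1.0.x: pass the source when creating the record
		set theArchive to create record with {name:theName, type:html, URL:theURL, source:theSource} in theGroup
		
		-- Assumption: this line is kept unchanged from the original script
		set data of theArchive to theData
		
		delete record theRecord -- Closes window
	end timeout
end tell
```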

That works great, thanks!

Arkaitz