Capture in Smart Rule

I have another feature request: I capture web pages from Safari using the Clip to DEVONthink extension. The source I most commonly use only captures correctly if I capture as HTML; capturing as PDF fails (i.e. it doesn’t capture what I see on screen), probably because of log-in requirements.

In DT I then recapture the site by clicking on the cogwheel and selecting Capture PDF (One Page); that way the site is correctly captured as PDF.

It would be nice to be able to do that using a smart rule (i.e. inbox, kind HTML, capture as PDF one page, delete). Conversion to PDF is possible via smart rule, but it fails on these files in the same way as initially capturing as PDF does, and (re-)capturing is not available in smart rules at all.

Do you use the clutter-free option? Otherwise you could capture these pages as bookmarks and then convert them to PDF (via smart rules or Data > Convert).

I don’t use the clutter-free option, but using “convert” produces an unusable PDF. I have just tried the bookmark method, and that too leads to a PDF which simply displays the log-in page of the site I captured from. The only reliable method I have found is to capture as HTML and then recapture as PDF in DT itself.

Unfortunately a smart rule wouldn’t solve this issue: like AppleScript or Data > Convert, it would use a background task to download & render the page (to avoid crashes due to WebKit bugs). The next release will include some improvements related to protected pages, but it’s hard to tell whether this will improve things in your case too.

cheers - then I’ll just keep on clicking the button myself :grin:

Am I correct that if I save a Bookmark, a later search using DT3 or DEVONsphere can only search on the title of the Bookmark, whereas if I Convert to PDF then a search can cover the entire text that is converted?

Yes, that is correct. There is no content in a bookmark, so logically it can’t be searched by contents. A PDF does have content.

A scheduled smart rule using the conditions Kind is Bookmark and Item does not contain comment could actually index bookmarks by downloading each page and storing its text in the bookmark’s comment, using a script like this one:

on performSmartRule(theRecords)
	tell application id "DNtp"
		if (count of theRecords) > 0 then
			show progress indicator "Indexing Bookmarks" steps (count of theRecords)
			repeat with theRecord in theRecords
				try
					set theURL to URL of theRecord
					step progress indicator theURL
					
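					if type of theRecord is bookmark then
						-- Download the page and store its plain text in the comment,
						-- so that the bookmark can be found by its content
						set theHTML to download markup from theURL
						set theText to get text of theHTML
						set comment of theRecord to theText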
					end if
				end try
			end repeat
			hide progress indicator
		end if
	end tell
end performSmartRule

This works really well - that is extremely useful - thank you

I’ve just modified Cristian’s script to cover the case where you have already added some personal comments to the bookmark before indexing (or plan to do so later). The script downloads and adds the page text only if it wasn’t added earlier, and separates your own comment from this text with a header.

on performSmartRule(theRecords)
	tell application id "DNtp"
		if (count of theRecords) > 0 then
			show progress indicator "Indexing Bookmarks" steps (count of theRecords)
			repeat with theRecord in theRecords
				try
					set theURL to URL of theRecord
					step progress indicator theURL
					
					if type of theRecord is bookmark then
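						-- Marker that separates personal comments from the downloaded page text
						set theHead to "================================" & linefeed & "Here goes the bookmark text for indexing purposes" & linefeed & "================================"
						set theComment to comment of theRecord
						-- Skip bookmarks whose comment already contains the marker (already indexed)
						if theComment does not contain theHead then
							set theHTML to download markup from theURL
							set theText to get text of theHTML
							-- Keep the existing comment on top, followed by the marker and the page text
							set theText to theComment & linefeed & linefeed & theHead & linefeed & linefeed & theText
							set comment of theRecord to theText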
						end if
					end if
				end try
			end repeat
			hide progress indicator
		end if
	end tell
end performSmartRule

Use a scheduled smart rule with the conditions Kind is Bookmark only.


I’m a little late to the party here but thanks for sharing these scripts!
