dump all Safari tabs into devonthink---using Python's Appscr

hello,

Here’s a script that dumps all open tabs in Safari (or WebKit) into DEVONthink. I do a lot of browsing in Safari, and when I want to close all the windows and quit Safari, I like to dump the links into DEVONthink so I can sort things out later. Hopefully some of you will find it useful.

Of note, the script is written in Python. I decided to write this to test out appscript and see how well one can script System Events, DEVONthink and an app like Safari with Python. The appscript syntax for driving applications from Python is a bit gross, as you will see, but the idea of being able to use Python’s regex functions and all the other goodies with DEVONthink’s browser and cookies is very exciting. Oh, if only DEVONthink really supported JavaScript, this would be amazing. But for those of you who are interested in scripting DEVONthink Pro with Python and were stymied for lack of an example, this at least is a start. I plan to write a few more scripts for DEVONthink in Python over the next while, and will post them here. Please tell me if you find this useful, enlightening, or stupid.

cheers,
erico


#!/usr/bin/env python 
# Safari_to_devonthink_dump by Eric Oberle
# Uses Appscript for Python and Devonthink Pro by Devon technologies

# This script dumps all the currently open webkit and safari tabs into devonthink, provided that they are not already in devonthink 
# By default, the script creates a folder on the root of devonthink to contain all the links.
# The script could be easily modified to make webarchives or pdfs instead of links, but links are very fast.
# I use Daniel Jalkut's Fastscripts to run this every time my Safari/Webkit instance becomes sluggish---
# it's an easy way to save your work! 
# 
# It is probably best to install appscript, the python applescript extension, before you try to run this. 
# You can do this from the terminal by typing "easy_install appscript" 


import os    # imported first so the fallback below can call os.system
import time

try:
	from appscript import *
except ImportError:
	# appscript is missing: try to install it, then import again
	os.system("easy_install appscript")
	from appscript import *

se = app("System Events")

def dump_to_devonthink(app_name):
	safari= app(app_name)
	dt=app("Devonthink Pro")
	T = time.asctime()
	print "Dumping all safari links " + T; 
	the_time = "links-"  + T
	destination=dt.create_record_with({k.type:k.group,k.name:the_time},in_=dt.databases[1].root)
	tabs_to_close = []; links_saved = []

	for this_window in reversed(safari.get(safari.windows,timeout=3600)):
		# skip the "download" and other non-document bearing windows
		if not safari.get(this_window.document): continue

		tabs = safari.get(this_window.tabs)
		if len(tabs) > 1:
			for this_tab in tabs:
				tabs_to_close.append(this_tab)
				# one bad tab (e.g. a blank tab with no URL) should not stop the rest
				try:
					the_url = safari.get(this_tab.URL)
					the_name = safari.get(this_tab.name)
					print the_name + ":  " + the_url + "\n"
					if not dt.exists_record_with_URL(the_url):
						dt.create_record_with({k.type:k.link,k.URL:the_url,k.name:the_name},in_=destination)
						links_saved.append(the_url)
						# uncomment these lines to create webarchive instead
						#the_source = safari.get(this_tab.source)
						#dt.create_record_with({k.type:k.webarchive,k.source:the_source,k.name:the_name})
					#print safari.get(this_tab.properties)
				except Exception:
					print "Error  " + str(this_tab)
		else:
			# the URL belongs to the window's document, not to the window itself
			the_url = safari.get(this_window.document.URL)
			the_name = safari.get(this_window.name)
			print the_url + " : " + the_name
			if not dt.exists_record_with_URL(the_url):
				dt.create_record_with({k.type:k.link,k.URL:the_url,k.name:the_name},in_=destination)
				links_saved.append(the_url)
			tabs_to_close.append(this_window)

	print str(tabs_to_close)
	while tabs_to_close:
		this_tab = tabs_to_close.pop()
		print "closing ..." + str(this_tab)
		safari.close(this_tab)
	# nothing new was saved, so remove the empty group again
	if not links_saved: dt.delete(record=destination,in_=dt.databases[1].root)
	safari.quit()

if se.processes[its.name=='Safari'].get() != []: dump_to_devonthink("Safari")

if se.processes[its.name=='Webkit'].get() != []: dump_to_devonthink("WebKit")
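
The decision about which tabs are worth saving can also be made in plain Python before anything is sent to DEVONthink. What follows is only a sketch under assumptions, not part of the script above: `select_links` and its `already_saved` parameter are hypothetical names, and the (name, url) pairs are assumed to have been collected from Safari beforehand. It skips tabs without a usable URL (blank tabs) and drops duplicates while keeping the original order.

```python
try:
    string_types = basestring  # Python 2: covers both str and unicode
except NameError:
    string_types = str         # Python 3


def select_links(tabs, already_saved=()):
    """Return the (name, url) pairs worth saving, in order, without duplicates.

    tabs          -- iterable of (name, url) pairs, e.g. collected from Safari
    already_saved -- URLs assumed to be stored already (hypothetical input)
    """
    seen = set(already_saved)
    selected = []
    for name, url in tabs:
        # Blank tabs have no usable URL: skip anything that is not a
        # non-empty string.
        if not isinstance(url, string_types) or not url.strip():
            continue
        if url in seen:
            continue
        seen.add(url)
        # Fall back to the URL when the tab has no title.
        selected.append((name or url, url))
    return selected
```

Each pair in the returned list could then be handed to `dt.create_record_with(...)` exactly as in the loop above. Filtering first keeps a blank tab from raising an error in the middle of a window.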

Erico,

This is very cool. I haven’t gotten it to work just yet and don’t have time to debug, but I’ve been doing a lot of work with Python and this is motivation enough to start using Appscript. If you have any more Python-based scripts, do share…

Thanks,
Dan

Another option that has builtin regex and the pleasure of writing in JS syntax is the JavaScript OSA component. But like python, the syntax to drive apps and map applescript datatypes can get a little wonky. Hmm, you might be forced to use Script Debugger for coding, however. I just tested Apple’s script editor and it doesn’t seem to recognize the JS OSA component like it used to. I will ask Mark A. about that.

http://www.latenightsw.com/freeware/JavaScriptOSA/index.html

My understanding is that the JS OSA component has no future. If it were Java, that would be another thing, perhaps. But I don’t see JS as being very viable in the future as an IAS tool.

I think Scripting Bridge (from Apple) does have a future, and it gives one a bunch of languages. Appscript (the earlier Python implementation) seems to be well supported and works well for me, and Python has a great shell and data typing structures.

I’ll post a few more Python scripts here if there is an interest. I tend to use it only when I need to do heavier duty stuff. AppleScript works pretty well for lighter weight stuff. Python’s easy_install system of packages includes things like Beautiful Soup. Well, that’s heavy duty goodness, and AS is never going to have that! And though JS libraries exist, they are nowhere near as well supported, easy to use, or durable as the Python ones.

I’ll post some scripts that use Beautiful Soup & Expect over the next while…

-erico

Anyone interested in JSTalk?

Can someone explain to me how they got this to work? This looks like a script I will use 10 times a day, so I am definitely pumped, but I am running into errors on compile and execute. I tried working with it in TextMate and these were my results. Be easy, newbie here! Thanks for the script!

"Error in sys.excepthook:
Traceback (most recent call last):
  File "/Applications/Code/TextMate.app/Contents/SharedSupport/Bundles/Python.tmbundle/Support/sitecustomize.py", line 76, in tm_excepthook
    message += ", %s" % arg
TypeError: not all arguments converted during string formatting

Original exception was:
  File "untitled", line 41
    if safari.get(this_window.tabs) > 1 :
    ^
IndentationError: unexpected indent
"

I suspect the problem you are having (and probably most difficulties running this script) comes from the fact that Python treats indentation as syntactically meaningful. What that means is that, as your error message indicated, you need to have “the correct” level of indentation for each statement. This can drive people nuts, and it is not my favorite aspect of Python. The way to fix it is to go to the line in question and make sure that it either lines up with the line above it or, if the line above it ends in a colon, is indented one tab width from that line. The latter is the case here. So you should see code that looks like

if [condition 1]:
	if [condition 2]:
		stuff
		stuff
		stuff

In the case of the line that you quoted, the second “if” statement should be clearly one tab indented from the one above it.

Try changing that and rerunning it. Sometimes I end up deleting all the spaces, backing up to the previous line, hitting return, and then typing in new tabs. Such is life with Python, but it has many other nice qualities!
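
If hunting for the bad line by eye gets tedious, a few lines of Python can point at it. This is only a sketch: `indentation_report` is a made-up helper name, and it simply classifies the leading whitespace of each indented line as tabs, spaces, or a mix of both. Lines that land in a different bucket from their neighbours are the likely culprits after a copy and paste from a web page.

```python
def indentation_report(source):
    """Classify each indented line as 'tabs', 'spaces', or 'mixed'.

    Returns a dict mapping each kind to a list of 1-based line numbers.
    """
    report = {"tabs": [], "spaces": [], "mixed": []}
    for number, line in enumerate(source.splitlines(), 1):
        stripped = line.lstrip(" \t")
        indent = line[:len(line) - len(stripped)]
        if not indent or not stripped:
            continue  # unindented or blank line: nothing to classify
        if " " in indent and "\t" in indent:
            report["mixed"].append(number)
        elif "\t" in indent:
            report["tabs"].append(number)
        else:
            report["spaces"].append(number)
    return report


if __name__ == "__main__":
    sample = "def f():\n\tif True:\n  \t\tpass\n    return 1\n"
    print(indentation_report(sample))
```

To check a saved script, read the file and pass its text to the function, for example `indentation_report(open("untitled.py").read())` (the file name here is just a placeholder). Python 2 can also complain by itself: running the script with `python -tt` turns inconsistent tab and space usage into an error.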
-erico

Thanks for the tips, I will try to play around with the script when I have a chance and let you know if I have success or not. Thanks again for the explanation.

Cheers,
cbd