I’m trying to use DTPro to capture some web pages, but on one site (Netlibrary.com, which I get through my university) the page won’t display in DTPro because it is not a “supported browser.” That made me wonder how DTPro (using WebKit) is identifying itself. It seems like it should be identifying as “Safari,” or however Safari IDs itself. That way, if a page displays in Safari, it will display in DTPro.
DT Pro’s WebKit browser is a subset of Safari. DEVONagent’s is, too, but with more browser capability than DT Pro’s.
For some sites, you may need to use Safari, as it can handle Java objects, etc. that DT Pro’s browser cannot. (If you have DEVONagent, try it on your university’s site.)
In DEVONagent the User-Agent is:
Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit (KHTML, like Gecko) DEVONtech
Safari’s default User-Agent is:
Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/412 (KHTML, like Gecko) Safari/412
Safari has a Debug menu, which can be activated through defaults, that offers other User-Agent choices. They are:
Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US; rv:1.1) Gecko/20020826
Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0
Mozilla/5.0 (Macintosh; U; PPC; en-US; rv:0.9.4.1) Gecko/20020318 Netscape6/6.2.2
Mozilla/4.79 (Macintosh; U; PPC)
Mozilla/4.0 (compatible; MSIE 5.22; Mac_PowerPC)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2)
Mozilla/5.0 (compatible; Konqueror/3)
Some web servers discriminate between browsers because of how their website was designed. A site that only supports Windows customers, for example, does not want tech support calls from users running software it doesn’t support. The way around these “filters” is to spoof your browser’s identity. Since Safari et al. use W3C standards, the sites will often work regardless; in other cases, YMMV.
Both DT[Pro] and DA need settings to spoof the User Agent to mine some websites. Maybe some default options like the debug menu in Safari – but preferably a text field to be able to define your own user agent.
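For anyone wondering what spoofing amounts to at the HTTP level, here is a minimal Python sketch. It only builds a request that carries Safari's User-Agent string as quoted above; the URL is a placeholder and the helper name build_spoofed_request is made up for illustration. It says nothing about how DT Pro or DA would implement such a setting.

```python
import urllib.request

# Safari's default User-Agent as quoted earlier in this thread.
SAFARI_UA = (
    "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) "
    "AppleWebKit/412 (KHTML, like Gecko) Safari/412"
)


def build_spoofed_request(url, user_agent=SAFARI_UA):
    """Return a Request that identifies itself with the given User-Agent.

    Hypothetical helper for illustration only.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Placeholder URL; nothing is sent until urllib.request.urlopen() is called.
request = build_spoofed_request("http://www.example.com/")

# urllib stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))
```

Passing the resulting object to urllib.request.urlopen() would send the request with that header. Whether the page then works depends on the browser features the site actually needs.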
Good suggestion. I’ll relay your idea to Christian when he gets back from a well-deserved vacation.
But unless the DT Pro browser gets more beef on things like Java, spoofing won’t help if the target site requires features it doesn’t have.
The DA browser is currently more capable, and spoofing might get it accepted.
Note that the university site in question is happy with Safari. They might be persuaded to add DT Pro’s browser if it will work otherwise on their site.
This is an old, old post, but it is worth revisiting six years later…because netlibrary and a few other sites are blocking the user agent string of DEVONagent again. Now that the JavaScript functionality is up to par and capable of responding to fancy websites, it would be nice to be able to change the user agent string so that websites can’t simply block DEVONagent out of the easy bigotry that comes from seeing that it does “deep scans.” It’s far more appropriate, IMHO, to block abusers by IP address, but trying to reason with netlibrary about this is tiring. Is there a way to change the user agent for DEVONthink/DEVONagent, perhaps by a “defaults write” command in the terminal?
I know it’s a specialized thing, but such is the lot of the “information worker.” There’s a lot of silliness in the wild, wild west of the internet, and with some sheriffs it is better just not to wear black in their town, as they shoot on sight. Any way to borrow a white hat?
Okay, I dug a bit further on this, and now I’m not sure that it is the user agent. WebKit nightlies do work fine for signing in to my university system. They have the user agent string:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_7) AppleWebKit/534.36+ (KHTML, like Gecko) Version/5.0.5 Safari/533.21.1
Devonagent sends out a different one (notice the U; and the en;) :
Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/533.21.1 (KHTML, like Gecko) Version 4.0.5 Safari/533.21.1
Obviously, one could detect this, and I suspect they are, because whenever I go to my sign-in page with DEVONagent it redirects me to a resource forbidden page. It works fine with Safari. Or is this some sort of quirk in the way redirects are handled?
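To make the comparison between the two strings above concrete, here is a rough Python sketch that splits each string into its parenthesised items and its bare tokens, then lists what appears only in DEVONagent's. The function names are invented for this post, and the splitting is deliberately naive rather than a full User-Agent parser.

```python
import re


def split_user_agent(ua):
    """Split a User-Agent into parenthesised comment items and bare tokens."""
    comments = []
    for group in re.findall(r"\(([^)]*)\)", ua):
        comments.extend(item.strip() for item in group.split(";"))
    # Drop the parenthesised parts, then split what is left on whitespace.
    tokens = re.sub(r"\([^)]*\)", " ", ua).split()
    return comments, tokens


def only_in_first(first, second):
    """Return the comment items and tokens found in first but not in second."""
    a_comments, a_tokens = split_user_agent(first)
    b_comments, b_tokens = split_user_agent(second)
    return (
        [c for c in a_comments if c not in b_comments],
        [t for t in a_tokens if t not in b_tokens],
    )


# Both strings are copied from the post above.
WEBKIT_NIGHTLY = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_7) AppleWebKit/534.36+ "
    "(KHTML, like Gecko) Version/5.0.5 Safari/533.21.1"
)
DEVONAGENT = (
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/533.21.1 "
    "(KHTML, like Gecko) Version 4.0.5 Safari/533.21.1"
)

extra_comments, extra_tokens = only_in_first(DEVONAGENT, WEBKIT_NIGHTLY)
print(extra_comments)  # ['U', 'Intel Mac OS X', 'en']
print(extra_tokens)    # ['AppleWebKit/533.21.1', 'Version', '4.0.5']
```

Besides the U; and en; items, this also surfaces that the DEVONagent string, at least as quoted here, has “Version 4.0.5” with a space where the nightly has “Version/5.0.5” with a slash.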
The latest version of DT Pro Office (2.5.2) sends an incorrect user agent string, which causes some sites to refuse to render themselves, and the trick mentioned above does not help. Here are steps to reproduce the issue:
Add a new bookmark in DTPO with a URL of trello.com and note that the site complains that the browser is unsupported. Click the “Your browser is unsupported” link and note that the site requires Safari 5.0.5 and above.
It shows that the User Agent String that DTPO is sending is:
Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/536.28.10 (KHTML, like Gecko) Version 4.0.5 Safari/536.28.10
It shows that the User Agent String that Safari is sending is:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10
I’ve searched extensively to see whether there’s a bug in WebKit that might be causing this issue but thus far haven’t found anything.
Interesting. Earlier today I read about some other web client misidentifying itself with “Version 4.0.5” in its User Agent string, but can’t remember/find that reference now (not that it matters).
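As a guess at why the “Version 4.0.5” token could matter, here is a small Python sketch of the kind of check a site might run. This is purely hypothetical: nothing in this thread shows how trello.com actually decides, and the function names and the regular expression are my own. The minimum version is the 5.0.5 that the “unsupported browser” page is reported to ask for.

```python
import re

# Minimum Safari version the "unsupported browser" page is reported to require.
MINIMUM_SAFARI = (5, 0, 5)


def safari_version(ua):
    """Return the Version/x.y.z token as a tuple of ints, or None if absent."""
    match = re.search(r"Version/(\d+(?:\.\d+)*)", ua)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def looks_supported(ua):
    """Hypothetical server-side test: is the advertised Safari new enough?"""
    version = safari_version(ua)
    return version is not None and version >= MINIMUM_SAFARI


# Both strings are copied from the report above.
DTPO_UA = (
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/536.28.10 "
    "(KHTML, like Gecko) Version 4.0.5 Safari/536.28.10"
)
SAFARI_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 "
    "(KHTML, like Gecko) Version/6.0.3 Safari/536.28.10"
)

print(safari_version(DTPO_UA), looks_supported(DTPO_UA))
print(safari_version(SAFARI_UA), looks_supported(SAFARI_UA))
```

Under this assumed check, the DTPO string fails twice over: the space means no Version/ token is found at all, and 4.0.5 would be below the minimum even with the slash in place. A site can of course use other signals besides the User-Agent header.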
The UserAgent string that DTPO sends is hardcoded into the executable, /Applications/DEVONthink Pro Office/DEVONthink Pro.app/Contents/MacOS/DEVONthink Pro in several places, no doubt due to static linking of some frameworks when the application is built, but the version string that matters is at offset 0x01a1eeb. Patching that location to be “6.0.3” instead of “4.0.5” caused the UserAgent string sent by DTPO to change accordingly, but www.trello.com still says the browser is too old. The number of bytes for the UserAgent string just happened to be exactly the same length as what is sent by Safari, so I changed them to be identical, and trello.com still complains, even though whatsmyuseragent.com shows them as being identical. Very strange.
What’s even stranger is that hitting whatsmyuseragent.com with the latest version of Chrome lists AppleWebKit/537.31, which is a more recent version than Safari uses.
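The same-length patch described above can be sketched in Python as follows. This is an illustration on a stand-in byte string, not the real executable; the helper name patch_same_length is invented here, and the offset is looked up in the sample rather than taken from the application.

```python
def patch_same_length(data, offset, old, new):
    """Return a copy of data with old replaced by new at offset.

    Refuses to patch unless new has the same length as old and old is
    really present at offset, so the rest of the file keeps its layout.
    """
    if len(old) != len(new):
        raise ValueError("replacement must be the same length as the original")
    if data[offset:offset + len(old)] != old:
        raise ValueError("expected bytes not found at offset")
    patched = bytearray(data)
    patched[offset:offset + len(new)] = new
    return bytes(patched)


# Stand-in buffer for illustration; not the contents of the real binary.
sample = b"...Version 4.0.5 Safari/536.28.10..."
offset = sample.index(b"4.0.5")

patched = patch_same_length(sample, offset, b"4.0.5", b"6.0.3")
print(patched)
```

Checking the old bytes before writing guards against patching the wrong spot, for example when the offset differs between builds of the application.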