Search inside PDFs?

Much academic material is published in PDFs, but DA doesn’t seem to find these. Am I missing something or will this be added later?

This will be added later (and therefore won’t be available in version 1.0).

I’m a biologist evaluating DEVONagent as a way to download journal articles from certain journal sites when a new issue comes out. These are all in PDF format. Can DEVONagent download PDFs automatically? This topic seems to be geared toward searching inside PDFs; all I want is for it to download the PDF when it comes across a link that says ‘Full Text’. Is this possible?

Also, is there a way to have the crawl performed at regular intervals? I thought I’d ask here as I don’t want to start a new thread just for this.


Kurt: This isn’t a direct response to your query, as I think we’ll have to wait for a future version of DEVONagent to perform as you wish.

However, there’s a free Internet plugin called PDF Browser Plugin that I find very useful when manually searching Web sites that contain PDF files. With it, PDF files are displayed in your WebKit browser window (in the current versions of both DEVONthink and DEVONagent), where they can be inspected and saved to disk (and PDF Browser inserts the file’s URL into its Finder Info comment field). The major advantage: without the plugin, a PDF is saved to your disk as soon as its URL is clicked, so you end up doing housecleaning to get rid of items you don’t want to keep.

Another note: Although I’ve got the full version of Acrobat, I much prefer Preview in OS X 10.3.x for viewing and searching PDF files. Preview searches are faster than Acrobat searches, and they display all occurrences of the search term at once, whereas Acrobat steps through one occurrence at a time. Therefore, I’ve used the “Open with” option in the Finder’s Info window to make Preview the default application for PDF files. (Of course, if I need to edit a PDF, I can still open it with Acrobat.)

A future release of DA will support regular automatic crawls (and things like email notifications afterwards), scanning of PDF/Word documents and a download manager (all this should be available in Q2 or this summer).
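
Until then, the ‘Full Text’ download asked about above can be approximated outside DEVONagent with a small script. What follows is only a rough sketch in Python using the standard library, not anything DEVONagent itself does: the names FullTextLinkParser, find_full_text_links and download_pdf are made up for illustration, and it assumes the journal page is plain HTML with ordinary links whose visible text contains ‘Full Text’ and that no login is required.

```python
import os
import shutil
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class FullTextLinkParser(HTMLParser):
    """Collect the href of every <a> whose visible text contains a label."""

    def __init__(self, label="Full Text"):
        super().__init__()
        self.label = label.lower()
        self.links = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Case-insensitive match on the link's visible text.
            if self.label in "".join(self._text).lower():
                self.links.append(self._href)
            self._href = None


def find_full_text_links(html, base_url, label="Full Text"):
    """Return absolute URLs of links labelled `label` in an HTML page."""
    parser = FullTextLinkParser(label)
    parser.feed(html)
    parser.close()
    return [urljoin(base_url, href) for href in parser.links]


def download_pdf(url, dest_dir="."):
    """Save the file at `url` into `dest_dir` and return the local path."""
    filename = os.path.basename(urlparse(url).path) or "article.pdf"
    dest = os.path.join(dest_dir, filename)
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```

In use, you would fetch an issue’s table-of-contents page, pass its HTML and URL to find_full_text_links, and call download_pdf on each result. Running that at regular intervals is left to whatever scheduler your system provides.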