Searching annotations in PDFs

I can’t seem to figure out how to search for text in annotations of PDFs stored in DTP. Is this supported? If not, is this something Devon plans to add in the future?

Has anyone developed any workarounds for this short of opening each pdf in Acrobat and searching that way?

Note annotations made in Acrobat and other PDF viewers and applications are not part of the searchable text layer.

I haven’t used such notes myself in a very long time, for that reason and also because they are only plain text, meaning no hyperlinks or images can be included. Finally, I work with a variety of document filetypes and don’t want to be limited to notes that work only with a single filetype, so I’ve been using rich text notes, associating them with the referenced document by name and a hyperlink. That has always worked for me. I got a kick out of mbywater’s critique of my approaches. I don’t have a female iguana, although I do like a good bouillabaisse. See viewtopic.php?f=2&t=9280

A number of users have objected that my techniques for creating and associating rich text notes with references are cumbersome. I don’t find that to be the case, as I fold the tasks into the time used for thinking about the topic at hand. I often think all too slowly, so there’s no time penalty for my note procedures. :slight_smile:

This is indeed a red-letter day for those who like automation! Today Eric posted a preview of a new smart template that automates the process of associating a new note with its referenced document. See viewtopic.php?f=2&t=9434

And if that note should refer to page 76 of a 316-page PDF, the Page Link to page 76 can be inserted into the note. Credit Christian for that one.

Now DT Pro/Office 2 will allow anyone to create as many notes (for footnotes) about a reference as would a 19th-century German historian – but much more easily. Saints preserve us.

Thanks for the reply, Bill.

While the new annotation template is very handy in a variety of contexts, I am truly disappointed that Devon is unable to search the text of annotations made in PDFs.

Acrobat is able to search this text. Are there any plans to incorporate this into DTPO?

I haven’t checked Acrobat 9, but Adobe Reader 8 is unable to find or search for the text content of text notes added to a PDF. Adobe Reader 8 can search for text in comments, but not text notes (specifically defined as text notes). I just confirmed that again by adding text notes to a couple of PDFs using Acrobat.

The problem is that ‘add-ons’ to a PDF such as text notes and text comments are not in the normal text layer of a PDF, which can be read by all PDF viewer applications. The problem is further compounded by the fact that text notes in PDFs are not managed in the same way by the various applications that can create them. Apple includes in OS X its own PDF application, Preview, which is much more widely used by Mac users than is Acrobat. DEVONthink uses Apple’s PDFKit framework (which Preview is also built on) to render and edit PDFs within databases.

So the answer is that, unless and until Apple adds the ability to search for PDF text notes in PDFs (including those added by Acrobat), do not expect Acrobat text notes to be searchable in DEVONthink. There’s probably more hope for searchable text notes created by Skim, an Open Source project.
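
For anyone who needs a workaround in the meantime, the note text can in principle be pulled out with a script and searched separately. Here is a minimal sketch in Python, assuming the third-party pypdf library is installed and assuming the notes are stored as standard /Text annotations (what Acrobat calls Sticky Notes) with their text in the /Contents entry. The function names are made up for illustration, and this has not been checked against notes written by Preview or Skim.

```python
def note_texts(pages):
    """Yield (page_number, text) for each sticky-note annotation found.

    `pages` is a sequence of PDF page dictionaries, such as pypdf's
    `PdfReader(path).pages`.
    """
    for number, page in enumerate(pages, start=1):
        annots = page["/Annots"] if "/Annots" in page else []
        for ref in annots:
            # pypdf hands back indirect references here; resolve them if needed.
            annot = ref.get_object() if hasattr(ref, "get_object") else ref
            if "/Subtype" not in annot or "/Contents" not in annot:
                continue
            # Assumption: sticky notes use the /Text annotation subtype.
            if annot["/Subtype"] == "/Text":
                yield number, str(annot["/Contents"])


def search_notes(pages, query):
    """Return (page_number, text) pairs whose note text contains `query`."""
    needle = query.casefold()
    return [(n, text) for n, text in note_texts(pages) if needle in text.casefold()]


def search_pdf(path, query):
    """Case-insensitive search of the sticky notes in one PDF file."""
    from pypdf import PdfReader  # third-party library, assumed to be installed

    return search_notes(PdfReader(path).pages, query)
```

Calling search_pdf on a PDF path with a search term should return a list of (page number, note text) pairs for the matching notes. Other annotation types, such as highlights or free-text boxes, are deliberately skipped in this sketch.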

Acrobat’s Search procedure searches PDFs one by one, and so is very slow for a large collection of PDFs. However, Acrobat Pro also provides an indexing system, so that an indexed collection can be searched much more rapidly.

Some years ago I had a project resulting in a collection of hundreds of PDFs that was constantly changing. It was a developing collection of standard operating procedures for technical procedures in a governmental agency. I was using Acrobat Pro 5 to create indexes of the collection. Unfortunately, each time PDFs were added the index had to be rebuilt, a relatively tricky and time-consuming procedure. (Text notes were not indexed by Acrobat, of course.)

That was the time DEVONthink 1.4 appeared, and I found it awe-inspiring compared to the primitive Adobe indexing and search system. Very importantly, I was no longer limited just to PDF documents. The information content of other filetypes could also be included in searches.

That’s when I completely dumped using text notes in PDFs. I had never liked them, as they were only plain text, and ‘defaced’ the PDFs with that note symbol (I’m a nut about that). I moved over to creating searchable rich text notes that I associated in various ways with referenced documents, whatever their filetype. That gave me far and away more power in making and using my notes than could be provided by text notes in PDFs (even had they been searchable). DTPO2 (especially in the upcoming public beta 8) simplifies some of those note-taking procedures.

But that’s just me. I prefer procedures that reduce limitations and are ‘richer’, which is the main reason I don’t use text notes in PDFs. :slight_smile:

My mistake - I’m talking specifically about text that is entered in what Acrobat calls Sticky Notes. Preview calls them Notes, as does DTP.

I use Adobe Acrobat 8 (full version - not the reader) which does allow text inside of sticky notes to be searched.

This type of annotation is very important in my work. Not only do I have hundreds of pdfs littered with this type of annotation, there are also professional journals in my field that publish articles with sticky notes embedded in the pdf.

To be able to create and view text in notes in a pdf in DTP but then not be able to search for that text seems rather odd given DTP’s search muscle.

If, in fact, DTP cannot search this type of text, can this post serve as a request for that feature? Or, do I need to re-post in the feature request forum?