PDF (ISO 32000 & extensions) comments & annotations searching


I’ve repeatedly searched the archives for this content.

PDF is a published standard (adobe.com/devnet/pdf/pdf_ref … chive.html) that defines ways to describe layers of content, some of which matter a good deal to the academic user: whether a page carries annotations, and the text those annotations contain.

There is also another major convention for PDF commenting: the way Skim handles PDF notes.

For technical reasons I’ve extensively commented files using the PDF internal comments / annotations system.

Sadly, however, Apple’s built-in tools neither search documents for PDF comments nor index them in Spotlight. (This can be replicated on the command line with /usr/bin/mdimport -d2 test.pdf, where test.pdf contains PDF-style comments.)

I’ve just spent a day searching Apple’s open material for source code that would let me write a replacement PDF.mdimporter. But this project seems to be well beyond my capacities: mdimporters are compiled software, and there is no open-source example of PDF.mdimporter to build on.

Do we have an ETA from upstream suppliers (Apple) for when they’ll index the metadata inside PDFs and search their content better? Is there any workaround planned at the DT/DN level? Does anyone have pointers to a thread where someone solves this as a workflow? The relevant metadata seems to be the link between a PDF page and the content of its annotations.

There are plans to add this to a future release so that the text of PDF annotations will be indexed and searchable.
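
Until then, a stopgap outside the application may be possible. What follows is only a rough sketch, not a DT/DN feature and not how Spotlight works: it assumes the third-party pypdf library is installed, and the function names (collect_annotations, search, load_pages) are made up for illustration. It walks each page's /Annots array, keeps the /Contents text of every annotation that has one, and records the page number alongside it, which is the page-to-annotation link asked about above.

```python
def _resolve(obj):
    """Follow an indirect PDF reference if the object offers get_object()."""
    getter = getattr(obj, "get_object", None)
    return getter() if callable(getter) else obj


def collect_annotations(pages):
    """Return a list of (page_number, subtype, text) for annotations with text.

    `pages` is any iterable of dict-like page objects, for example the
    `pages` attribute of a pypdf PdfReader (an assumption, see load_pages).
    """
    results = []
    for page_number, page in enumerate(pages, start=1):
        annots = _resolve(page.get("/Annots")) or []
        for ref in annots:
            annot = _resolve(ref)
            text = annot.get("/Contents")
            if text:  # skip links and other annotations without comment text
                subtype = str(annot.get("/Subtype", ""))
                results.append((page_number, subtype, str(text)))
    return results


def search(results, needle):
    """Case-insensitive substring search over the collected annotation text."""
    needle = needle.lower()
    return [entry for entry in results if needle in entry[2].lower()]


def load_pages(path):
    """Open a PDF with pypdf (third-party, assumed installed)."""
    from pypdf import PdfReader
    return PdfReader(path).pages


if __name__ == "__main__":
    import sys

    # Usage: python annots.py test.pdf "search term"
    for page_number, subtype, text in search(
        collect_annotations(load_pages(sys.argv[1])), sys.argv[2]
    ):
        print(f"p.{page_number} {subtype}: {text}")
```

Note that this reads only annotations stored inside the PDF itself; notes kept elsewhere by other tools would not show up, and encrypted or unusually encoded files may need extra handling.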

Many thanks for the information. I’ll await the update / subsequent version!