PDF to HTML converter?


The problem I’m currently having is that in DT, I can’t view highlighted ‘found’ text in the searched pdf document. It seems I currently have two choices (please correct me if I’m wrong):

  1. To right click the pdf in the DT score window and then choose ‘launch’ to open in Acrobat Reader, then perform the search again (through Acrobat Reader).

  2. To convert the pdf to plain text or rtf.

The first method is time-consuming, and although the second works okay (particularly with Textlightning), I cannot view any of the original images in these formats after the conversion. For this reason, I’m currently working with both rtf and pdf versions of the same document in DT, but this doesn’t seem ideal.

I thought the best way to go would be to convert pdf to html, so hopefully I can have my highlighted search text results and view the images as well.

I’m using Mac OS X 10.2.6, and I was wondering if anyone could recommend a freeware or reasonably priced ‘pdf to html’ converter.


Looks like Panther’s Quartz engine will provide highlighting of PDF documents (and of course we will support this in DEVONthink as soon as Apple provides a release date for Panther).

Thanks Mr. Grunenberg :slight_smile:

I have tried searching for an OS X pdf to html converter, but no luck as yet. There does, however, seem to be a small number of converters for the Windows platform. Maybe I could do the conversion in Virtual PC, or just wait for Panther.


I don’t have the URL handy, but search Adobe’s site. There is an online PDF to HTML converter there (and links to other sites that offer the same). It was free, at last check. You could use a little AppleScript to make the process quicker…


I think the page you are referring to is:

adobe.com/products/acrobat/a … _form.html

Unfortunately, their FAQ states:

“The Access conversion technology was developed to allow blind and visually impaired users to read Adobe PDF documents with speech synthesis software. For this reason, graphic elements are stripped from the file and text is reformatted during conversion.” :’(

I’m hoping that Apple’s Panther and DT will provide the means to do what I would like.