Been testing both DV and DTP and find them to be incredible products. Takes a little while to grasp the power and function of both. The more I dig in, the more I like them. One question: I’ve dragged a bunch of documents from my hard drive into a database. When I search, I notice a few documents (some pdfs) are impervious to a search. I assume they are unable to be indexed by DTP. Is that correct or is there something else I can do to make all my documents searchable?
DT Pro may fail to capture the text of a PDF for one of two reasons:
The file has been encrypted to prevent copying of the text; or
The file is image-only and would have to be OCR’d to produce text.
You can test for either case in Preview. Open the PDF in Preview and try to search it.
If it can’t be searched at all, you have an image-only file.
If it can be searched, select some text, copy it, and try to paste it into TextEdit. If nothing pastes because the clipboard is empty, the file is copy-protected.
You can also see whether the content of a PDF file is searchable by looking at the ‘kind’ of file in DTP:
pdf = image (not searchable)
pdf + text = searchable.
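If you have a lot of files to check, the two cases above (copy-protected vs. image-only) can be roughly approximated with a short script. This is only a sketch, not anything DTP or Preview does: the function names are made up for illustration, and it relies on the assumption that an encrypted PDF contains an /Encrypt entry and a PDF with real text contains a /Font entry in its uncompressed bytes. Newer PDFs can hide those entries inside compressed object streams, and encryption does not always block copying, so treat the result as a hint and confirm in Preview.

```python
from pathlib import Path


def classify_pdf_bytes(data: bytes) -> str:
    """Guess why a PDF might not be searchable (heuristic, not authoritative).

    Returns one of: "not-a-pdf", "encrypted", "has-text", "image-only".
    """
    # A PDF normally starts with a "%PDF-" header near the top of the file.
    if b"%PDF-" not in data[:1024]:
        return "not-a-pdf"
    # Assumption: an encrypted file carries an /Encrypt entry in its trailer.
    if b"/Encrypt" in data:
        return "encrypted"
    # Assumption: any text layer needs at least one font resource.
    if b"/Font" in data:
        return "has-text"
    # No font found in the raw bytes: probably scanned pages that need OCR.
    return "image-only"


def classify_pdf_file(path: str) -> str:
    """Read a file from disk and classify it with the heuristic above."""
    return classify_pdf_bytes(Path(path).read_bytes())
```

Calling classify_pdf_file on each PDF you dragged into the database gives a quick list of the ones worth opening in Preview for the real test.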