About searchable PDFs?


I’m test driving DTP Office.

If the Kind element of the Info Panel says “PDF + Text,” does that mean the document is a searchable PDF?

For a PDF + Text document, would Data > Convert > to Searchable PDF offer any additional advantage?



Yes, PDF+Text indicates that there is a text layer that can be indexed.

In general there is no advantage to converting these. However, it seems some academic papers are distributed with only a searchable cover page, while the rest is a bitmap image. In such cases it can help to convert them to searchable PDF.

Is there any way to tell how big the text layer is?

Would an AppleScript to copy text to clipboard be an adequate indicator?



It displays the word count of the document, but not the size of the text layer by itself. In AppleScript, however, you can ask for the plain text of the record and calculate the size from that.
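
As a rough illustration of the AppleScript approach just described, here is a minimal sketch. It assumes the PDFs are selected in the frontmost DEVONthink window and that the application can be addressed as `application id "DNtp"`; depending on the installed version you may need `application "DEVONthink Pro"` instead. The variable names are only illustrative.

```applescript
-- Sketch: report the length of the text layer for each selected record.
-- Assumption: DEVONthink is running and at least one record is selected.
set theReport to {}
tell application id "DNtp"
	set theRecords to the selection
	repeat with theRecord in theRecords
		-- "plain text" returns the text layer (empty for image-only PDFs)
		set theText to plain text of theRecord
		set end of theReport to (name of theRecord) & ": " & (length of theText) & " characters"
	end repeat
end tell
return theReport
```

Run from Script Editor, this should return one line per selected record. A long PDF that reports only a few hundred characters is probably the "searchable cover page only" case mentioned earlier.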

Here’s a quick test: Scroll downwards a few pages in the PDF and double-click on or attempt to select some text. Is it selected? If so, it’s searchable. If not, it’s image-only.

A smart group that finds all PDFs, sorted by word count, is an easy way to spot documents with little or no text layer…

Best, Charles