There is an uncommon issue with DT where a PDF is marked as having no text when it clearly does. It tends to happen during batch PDF imports or after many changes to indexed PDFs, but not always, and it is not reproducible. It simply happens.
I have this Smart Group in each database and globally:
Then, every now and then, I get something like this:
It really is a PDF with text (I’m completely sure, as I scanned it from my books and OCRed it myself with ABBYY on Windows). But DT says it is a PDF without text.
The normal way to resolve this is:
Copy the PDF out of DT or out of the indexed folder.
Drop the saved PDF back into DT, or move it back into the indexed folder.
Normally, DT should then recognise it as PDF+Text.
However, I’ve found a faster and less dangerous trick:
Highlight something in the PDF.
Save, or wait until DT saves the modified file by itself.
After some time, it is recognised as PDF+Text.
Remove the highlight.
That's it. No need to risk losing the file, or losing external tags/comments.
(I think it could be a synchronization issue when a file is being “touched” by different DT threads and/or macOS. For example, it happens a lot with some files if you index different items/folders in two databases at the same time, or you update more than one folder with indexed PDFs. I try to avoid that, but sometimes it simply happens during normal daily work. All my indexed files are either in Dropbox or in iCloud Drive, always local, never placeholders.)
Perhaps a script that re-reads the PDFs in a selected folder would make the issue less relevant.
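A minimal sketch of that idea in Python, for indexed folders only. It assumes (I have not verified this against DT) that bumping a file's modification time is enough to make DT re-read an indexed PDF, the same way the highlight trick modifies the file. The names `touch_pdfs` and `dry_run` are just mine for illustration. It only changes timestamps, never file contents, but note that Dropbox or iCloud Drive may notice the new timestamps too.

```python
import os
import time
from pathlib import Path


def touch_pdfs(folder, dry_run=True):
    """Bump the modification time of every PDF under `folder` (recursively).

    Returns the list of PDF paths found. With dry_run=True (the default)
    nothing is changed, so you can first check which files would be touched.
    Assumption: a newer modification time prompts DT to re-read the file.
    """
    now = time.time()
    found = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() == ".pdf":
            if not dry_run:
                # Keep the access time, set only the modification time to now.
                os.utime(path, (path.stat().st_atime, now))
            found.append(path)
    return found
```

Call it first as `touch_pdfs("/path/to/indexed/folder")` to see the list (the path is a placeholder), then again with `dry_run=False` to actually update the timestamps.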
It is a random thing. Sometimes they are web-scraped files (2/3 pages printed from Safari), other times my own scanned books/magazines (from 2 to 100 MB, 100 to 600 pages)… It happens completely randomly. Not frequently, but it happens.