I often have to work with pdfs. Yes, I know the advice: just avoid pdf. But I don’t have that choice.
Sometimes scanned pdfs are missing text.
Sometimes born-digital pdfs have corrupt text, or lose their text after processing in Ghostscript.
So far, my only option is to open each pdf in turn and check the text. I’ve tried variations on wc -w on the command line, but they often report a word count for pdfs which have lost all their words. For English-language pdfs, a quick search for pdfs which don’t contain “the” should find any which lack text or have severely corrupted text. So can Devonsphere search for files which don’t contain a given term?
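For reference, the check I have in mind can be roughly sketched outside Devonsphere in Python. This is only a sketch under assumptions: it assumes Poppler's pdftotext command is installed and on the PATH, and the function names (lacks_word, extract_text, find_suspect_pdfs) are made up for illustration. It counts a pdf as suspect when the extracted text never contains “the” as a whole word.

```python
import re
import subprocess
from pathlib import Path


def lacks_word(text: str, word: str = "the") -> bool:
    """Return True if `word` never appears as a whole word in `text` (case-insensitive)."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is None


def extract_text(pdf_path: Path) -> str:
    """Extract the text layer using Poppler's pdftotext.

    Assumption: pdftotext is installed; if it is not, subprocess.run
    raises FileNotFoundError. A pdf that pdftotext cannot read yields
    empty output and is therefore reported as suspect.
    """
    result = subprocess.run(
        ["pdftotext", str(pdf_path), "-"],  # "-" sends the text to stdout
        capture_output=True,
        check=False,
    )
    return result.stdout.decode("utf-8", errors="replace")


def find_suspect_pdfs(folder: Path, word: str = "the") -> list[Path]:
    """List pdfs under `folder` whose extracted text does not contain `word`."""
    return [
        path
        for path in sorted(folder.rglob("*.pdf"))
        if lacks_word(extract_text(path), word)
    ]
```

Calling find_suspect_pdfs(Path("scans")), where "scans" is a placeholder folder name, would return the pdfs worth opening by hand. Unlike wc -w run on the pdf file itself, this looks at the extracted text rather than the raw file contents.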