I’ll leave it to heavy-duty scripters to come up with procedures to trigger automatic OCR of image-only PDFs in Indexed folders.
Here’s an approach to create a smart group that will list all the PDFs that have been Index-captured to your database(s) and that contain fewer than ten words of text.
Open the full Search window (Tools > Search). Set it to search ‘Databases’ (so that all open databases will be searched). Click on the ‘Advanced’ button and enter the criteria as shown in the screenshot.
Hit Return to invoke the search.
Why did I enter a number for the Word count instead of searching for zero words? Because some PDFs, e.g., those from sites that provide newspaper clippings, contain an un-OCRed image of the old article but do include a little searchable text identifying the article’s source. If you work with such PDFs, experiment with a word count high enough to include them in the smart group.
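For anyone who wants to reason about the threshold outside the Search window, here is a minimal Python sketch of the same filter logic. It is not DEVONthink's scripting interface: the `Record` class, its fields, and the sample data are all hypothetical, and the criteria (kind is PDF, item is indexed, word count below a limit) are assumed from the description above rather than copied from the screenshot.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stand-in for a database item; not a DEVONthink type."""
    name: str
    kind: str        # e.g. "PDF"
    indexed: bool    # True if the item was Index-captured rather than imported
    word_count: int  # number of words of searchable text


def ocr_candidates(records, max_words=10):
    """Return indexed PDFs that contain fewer than max_words words of text."""
    return [
        r for r in records
        if r.kind == "PDF" and r.indexed and r.word_count < max_words
    ]


# Made-up sample data for illustration only.
records = [
    Record("scan.pdf", "PDF", True, 0),           # image-only scan
    Record("clipping.pdf", "PDF", True, 7),       # image plus a short source line
    Record("report.pdf", "PDF", True, 2500),      # already searchable
    Record("imported-scan.pdf", "PDF", False, 0), # not indexed, so excluded
    Record("notes.rtf", "RTF", True, 3),          # not a PDF, so excluded
]

for r in ocr_candidates(records):
    print(r.name)
```

With the default limit of ten, the sketch lists `scan.pdf` and `clipping.pdf`. Lowering `max_words` to 5 drops the clipping, which is the situation the adjustable word count is meant to avoid.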
Click on the ‘+’ button to the right of the query field to save the search as a smart group, and name it, e.g., ‘Indexed PDF OCR Candidates’. Because it covers all open databases, this smart group will be saved in the left Sidebar (which displays the Global Inbox).
Remember to reset the ‘Advanced’ button in Search when finished.
It’s probably best not to batch-select all the PDFs listed in the smart group and convert them from there, as that might result in moving items from the groups in which they are currently filed.
If the option in Preferences > OCR to move the original PDF to the Trash is checked, the original will be deleted after conversion. You may find it useful to select a PDF and press ‘Command-R’ (the Reveal command) to see it in the group where it is filed. From that location, select the PDF and choose ‘Data > Convert > to searchable PDF’.
However, the searchable PDF is stored within the database and isn’t yet indexed. To fix that, select the PDF, Control-click, and choose the contextual menu option to move it to the external folder that had been Indexed.
Finally, select the group corresponding to that external folder and choose ‘File > Synchronize’. Now the searchable PDF is Indexed, and is among the items listed within the Indexed group.