Running OCR in an Indexed folder for DTP3?

I’ve seen that there are old forum discussions re: OCR & Indexed Files, and am wondering if there’s another approach for DTP3.

Here’s my situation: In DTP3’s Preferences, I’ve checked the box for Original Document - Move to Trash. But when I select PDF files stored in an Indexed folder and run OCR->to searchable PDF, the original files are still there; instead of replacing them, DTP3 just adds a number suffix to the file name(s). This creates a lot of confusion and disorganization.

How can I fix this so that when I select PDF files in my Indexed folders and run OCR->to searchable PDF, the results replace the original files (moving them to Trash) instead of just adding a number suffix to the file name(s)?


You can’t fix it, as that’s how the mechanism works with indexed files.

Perhaps the name could be cached, the file OCR’d, the original deleted, and the cached name applied. However, I’m talking out of my hat as I am not the developer handling OCR. @aedwards would have to comment on the feasibility of this.
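
To make the idea concrete, here is a minimal sketch of that sequence (cache the name, OCR, remove the original, apply the cached name) at the plain filesystem level. It is only an illustration and not how DEVONthink works internally: `ocr_in_place`, `run_ocr`, and `trash_dir` are hypothetical names, and `run_ocr` stands in for whatever actually produces the searchable PDF.

```python
import shutil
from pathlib import Path
from typing import Callable


def ocr_in_place(
    original: Path,
    run_ocr: Callable[[Path, Path], None],  # hypothetical: reads arg 1, writes OCR'd PDF to arg 2
    trash_dir: Path,                        # hypothetical stand-in for the Trash
) -> Path:
    """Replace `original` with its OCR'd version while keeping the file name."""
    # 1. Cache the name before anything is moved.
    cached_name = original.name

    # 2. OCR into a temporary sibling file so the original stays untouched
    #    until the OCR step has finished.
    temp_output = original.with_name(f"{original.stem}.ocr-tmp{original.suffix}")
    run_ocr(original, temp_output)

    # 3. Move the original out of the folder (here: into `trash_dir`).
    trash_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(original), str(trash_dir / cached_name))

    # 4. Apply the cached name to the OCR'd file; no number suffix is needed
    #    because the original no longer occupies that name.
    final_path = temp_output.with_name(cached_name)
    temp_output.rename(final_path)
    return final_path
```

Because the original is only moved after `run_ocr` returns, a failed OCR run leaves the folder as it was (apart from a possible leftover temporary file). Whether something like this is feasible inside DEVONthink’s indexing mechanism is a separate question.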

Got it. And understood.

Well… I would welcome any ideas or suggestions (from @aedwards, et al.) on how it might be possible to do the kinds of things you just mentioned to remediate this.

Thanks for your help, @BLUEFROG.

I don’t know your specific situation, but here’s a smart rule that does OCR on images or PDFs with no text layer (zero words) and leaves the name intact.

I have indexed an OCR In and an OCR Out folder because that’s how I think, and also to stay in the habit of making sure files aren’t available for the smart rule to reprocess. In this case it’s not necessary, as the resulting PDF should no longer match the criteria, but it’s good to be aware of best practices.

