Is it possible to search for documents that contain specific URLs?
I have tons of PDFs that were generated by the FireShot extension. All URLs in these PDFs are masked as getfireshot.com/xx and I want to detect those PDFs, but searching didn’t do the trick. Any help will be much appreciated.
Post a screen cap of your search and the info inspector for one of the captured files.
As far as I know it’s not possible to search URLs inside a PDF.
You could try Script: Extract PDF URLs to find PDFs that do or do not contain given URLs. It’s possible to
- Filter by URL start.
- Filter by URL end.
- Filter by URL start and URL end.
- Filter can either "include" or "exclude" the passed lists.
so you should be able to find what you’re looking for.
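To make the filter options above concrete, here is a rough Python sketch of the same idea. It is not the script itself, only an illustration of the logic: the names `url_matches` and `filter_urls` are made up for this example, and the sample URLs (including the `https://` scheme) are assumptions. It expects the URLs to have been extracted from a PDF already.

```python
from typing import Iterable, List, Sequence


def url_matches(url: str, starts: Sequence[str] = (), ends: Sequence[str] = ()) -> bool:
    """Return True if url begins with any entry in starts AND ends with any entry in ends.

    An empty list places no constraint, so passing only `starts` filters by
    URL start, only `ends` filters by URL end, and both filters by both.
    """
    start_ok = not starts or url.startswith(tuple(starts))
    end_ok = not ends or url.endswith(tuple(ends))
    return start_ok and end_ok


def filter_urls(
    urls: Iterable[str],
    starts: Sequence[str] = (),
    ends: Sequence[str] = (),
    mode: str = "include",
) -> List[str]:
    """Keep ("include") or drop ("exclude") the URLs that match the passed lists."""
    if mode not in ("include", "exclude"):
        raise ValueError("mode must be 'include' or 'exclude'")
    keep_matches = mode == "include"
    return [u for u in urls if url_matches(u, starts, ends) == keep_matches]


# Hypothetical URLs, standing in for what was extracted from one PDF.
urls = ["https://getfireshot.com/xx", "https://example.com/a.pdf"]
masked = filter_urls(urls, starts=["https://getfireshot.com/"])
print(masked)
```

A PDF whose "include" result is non-empty would be one of the masked captures; switching to `mode="exclude"` returns the remaining URLs instead.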
Thanks for the suggestion and for letting me know about such a great script, but I am not sure I understand how it would help me. As far as I understand, your script takes PDFs as input, but this is exactly what I am struggling with: finding the PDFs. Should I select all PDFs and then run the script?
Yes. There’s no other way. It’s probably easiest to assign a label or a tag to each matching record to collect them.
The script is fast, so even with thousands of records you should be done soon. But I wouldn’t run it on all of them at once; better to do batches of a few hundred records.
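Just to illustrate what I mean by batches, here is a generic Python sketch, not part of the script. The helper name `batches` and the default size of 200 are arbitrary choices for the example.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batches(records: Sequence[T], size: int = 200) -> Iterator[List[T]]:
    """Yield consecutive slices of at most `size` records each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(records), size):
        yield list(records[start:start + size])


# Placeholder record names; in practice these would be the selected PDFs.
records = [f"record-{n}" for n in range(450)]
for batch in batches(records):
    # Process one batch here (for example, check its URLs and tag the matches).
    print(len(batch))
```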
If you need help let me know.
What is this document and where did it come from?
Can you start a support ticket and send me the file?