Smart rule: is renaming based on PDF content possible?

Hi all

My work pay stub is hosted on a website that sucks, and there is no date in the PDF's file name. The PDF does include a date when opened: ADVICE DATE 03/29/24

Is there any way to use a smart rule to extract that date automatically, after DEVONthink has run its OCR, and use it for the filename?

best

Z

Easy enough especially if the text is always the same.

Write an AppleScript that tries to get the text of the document. If that fails, convert the document to PDF+Text (perform OCR). Once that’s done, you can use AppleScript to find the key phrase (“advice date”), grab the 8 characters after it, and parse a date from them. Or just write them into a string such as “pay stub for 11/12/23.pdf” or whatever.
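The find-and-parse step described above can be sketched as follows. This is only an illustration of the logic in Python, not DEVONthink AppleScript: the function names `extract_advice_date` and `pay_stub_filename` are hypothetical, the text is assumed to have been extracted already, and the ISO date in the filename is an assumption made here to keep slashes out of the name.

```python
import re
from datetime import date, datetime
from typing import Optional

# Key phrase from the pay stub in this thread, followed by the
# 8 characters that hold the date (e.g. "03/29/24").
ADVICE_DATE_PATTERN = re.compile(r"advice date\s*(.{8})", re.IGNORECASE)


def extract_advice_date(text: str) -> Optional[date]:
    """Return the date after the key phrase, or None if absent or malformed."""
    match = ADVICE_DATE_PATTERN.search(text)
    if match is None:
        return None
    try:
        # Assumes US-style month/day/two-digit-year, as in "03/29/24".
        return datetime.strptime(match.group(1), "%m/%d/%y").date()
    except ValueError:
        return None


def pay_stub_filename(text: str) -> Optional[str]:
    """Build a filename from the extracted date (hypothetical naming scheme)."""
    found = extract_advice_date(text)
    if found is None:
        return None
    # ISO format avoids "/" in the filename; this is a choice made for the sketch.
    return f"pay stub for {found.isoformat()}.pdf"
```

Calling `pay_stub_filename("ADVICE DATE 03/29/24")` returns `"pay stub for 2024-03-29.pdf"`, and text without the key phrase returns `None`, so a caller can leave such documents untouched.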

For more ideas, see this thread: Automatic Renaming of Receipts to RCPT YYYY-MM-DD for $XX.XX

While @jbmanos’s suggestion isn’t incorrect, you have a very strictly conforming document, so yes, you can do this quite simply. Here is an example with no scripting required; notice the document with the date appended to the end of the filename…

If the document doesn’t have a text layer (which you didn’t specify in your inquiry), you could easily add an OCR > Apply action at the beginning of the actions.

wow @BLUEFROG

mind blown :exploding_head::exploding_head::exploding_head::exploding_head:

this is beyond amazing!!!

Thanks so much! As always, DEVONthink rocks!!

best

Z

You’re very welcome :smiley:

Will you really be setting up a smart rule solely for importing your pay stub?
I use a generic AppleScript to help process my inbox entries; it assigns a filename (with date), tags, … and moves the item to a filing database.

Given the simplicity of the use case, the proposed smart rule is not only useful for the OP and others who may have a similar need but also doesn’t require any knowledge of scripting (which is part of the purpose of smart rules).
