My work pay stub is hosted on a website that sucks, and there is no date in the PDF's filename. The PDF does include a date in its content when opened: ADVICE DATE 03/29/24
Is there any way to automate this with a smart rule, so that after DEVONthink has run its OCR the date is extracted and used for the filename?
Easy enough especially if the text is always the same.
Write an AppleScript that tries to get the text of the document. If the try fails, convert the document to PDF+Text (perform OCR). Once that’s done, you can use AppleScript to find the key phrase (“advice date”), grab the 8 characters after it, and parse a date from them. Or just write it into a string such as “pay stub for 11/12/23.pdf” or whatever.
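The find-the-phrase-and-grab-8-characters step described above could be sketched roughly like this. It is only a sketch under a few assumptions: the handler name adviceDateFrom is made up for illustration, the date is always MM/DD/YY, the year is in the 2000s, and the text passed in already comes from a document with a text layer (the OCR fallback is not shown).

```applescript
-- Hypothetical helper: returns the advice date found in theText
-- as "YYYY-MM-DD", or missing value if no usable date is found.
on adviceDateFrom(theText)
	set savedDelimiters to AppleScript's text item delimiters
	-- Split on the key phrase (matching ignores case by default).
	set AppleScript's text item delimiters to "ADVICE DATE"
	set theParts to text items of theText
	set AppleScript's text item delimiters to savedDelimiters
	if (count of theParts) < 2 then return missing value

	-- Everything after the first occurrence of the key phrase.
	set theRest to item 2 of theParts
	-- Skip spaces, tabs, line breaks or colons before the date.
	repeat while theRest is not "" and character 1 of theRest is in {" ", tab, return, linefeed, ":"}
		if (length of theRest) is 1 then return missing value
		set theRest to text 2 thru -1 of theRest
	end repeat
	if (length of theRest) < 8 then return missing value

	-- Grab the 8 characters after the phrase, e.g. "03/29/24".
	set rawDate to text 1 thru 8 of theRest
	set AppleScript's text item delimiters to "/"
	set dateParts to text items of rawDate
	set AppleScript's text item delimiters to savedDelimiters
	if (count of dateParts) is not 3 then return missing value

	-- Assumption: two-digit years belong to the 2000s.
	return "20" & (item 3 of dateParts) & "-" & (item 1 of dateParts) & "-" & (item 2 of dateParts)
end adviceDateFrom
```

Inside a tell application id "DNtp" block you would call it as my adviceDateFrom(plain text of theRecord) and, if the result is not missing value, build the new name from it, for example "Pay stub " & theDate. Putting the year first is a design choice so the files sort chronologically by name.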
While @jbmanos isn’t incorrect in the suggestion, since you have a very strictly conforming document, yes, you can do this quite simply. Here is an example with no scripting required; notice the document with the date appended to the end of the filename…
If the document doesn’t have a text layer (which you didn’t specify in your inquiry), you could easily add an OCR > Apply action at the beginning of the actions.
Will you really be setting up a smart rule solely for the import of your pay stub?
I use a generic AppleScript to assist with processing my inbox entries:
it assigns a filename (with date), tags, … and moves the item to a filing database.
Given the simplicity of the use case, the proposed smart rule is not only useful for the OP and others who may have a similar need but also doesn’t require any knowledge of scripting (which is part of the purpose of smart rules).
My workflow is manual compared to the smart rule solution, but it provides more flexibility and control.
I collect files in the Global Inbox and then process them on my Mac, assisted by an AppleScript.
Here’s the boilerplate script:
tell application id "DNtp"
	set selectedNotes to get selection
	repeat with selectedNote in selectedNotes
		-- Insert code here to set theFilingGroup, theTitle and theTagList
		-- Move the item to the filing group, then rename and tag it
		set theNote to move record selectedNote to theFilingGroup
		set name of theNote to theTitle
		set tags of theNote to theTagList
	end repeat
end tell