Hallo zusammen.
Ich bin auf der Suche nach einer Möglichkeit aus einer PDF+Text Datei das Belegdatum auszulesen. Leider reichen meine Applescript Kenntnisse nicht soweit, dass ich die Lösung von Grund auf erstellen kann.
Hier im Forum bin ich leider nur auf einen einzigen Thread gestossen, welcher sich mit dem Parsing eines Datums aus einem Beleg beschäftigt und im Kern genau die Lösung für mein Problem aufzeigt.
[url]Automatic Renaming of Receipts to RCPT YYYY-MM-DD for $XX.XX]
Das Script von ‘jbmanos’ unterstützt standardmäßig aber leider nicht das deutsche Datumsformat. So habe ich versucht den Code zu kürzen und entsprechend seiner Kommentare so anzupassen das er nur nach dem Belegdatum in der Form DD.MM.YY oder DD.MM.YYYY sucht und mir das zunächst nur im Dialog anzeigt.
Es funktioniert nicht und ich trete seit zwei Tagen auf der Stelle.
Daher meine Frage und Bitte - hat von euch jemand bereits ein funktionierendes Script welches das deutsche Belegdatum ermittelt oder kann mir jemand die Hand führen wie ich das anhängende Script anpassen muss, damit dieses funktioniert und ich zu meinem Datum
komme.
Ich danke euch im voraus.
Efty
set mypattern to "[0-9]\\{1,2\\}[./-][0-9]\\{1,2\\}[./-][0-9][0-9]"
tell application id "com.devon-technologies.thinkpro2"
try
set theselection to the selection
if theselection is {} then error "Bitte mindestens einen Beleg auswählen."
repeat with theRecord in theselection
set rcdtext to "Oops. Es ist etwas nicht richtig gelaufen."
try
set rcdtext to the plain text of theRecord
on error
display alert "Die PDF Datei enthält keinen OCR Text."
end try
set mycat to rcdtext as text
try
-- first attempt to find date. the pattern here looks for DD.MM.YYYY
do shell script "echo " & quoted form of mycat & " | grep -o " & quoted form of mypattern & "[0-9][0-9]"
set myDate to first paragraph of result
--fix that pesky M format to MM
if the first word of myDate is in {"1", "2", "3", "4", "5", "6", "7", "8", "9"} then
set myDate to "0" & myDate
end if
--fix that even peskier D format to DD
if the second word of myDate is in {"1", "2", "3", "4", "5", "6", "7", "8", "9"} then
set myDay to the first word of myDate
set myMonth to "0" & the second word of myDate
set myYear to the third word of myDate
set myDate to myDay & "." & myMonth & "." & myYear
end if
-- write the preferred date string
set myDate to the third word of myDate & "." & the first word of myDate & "." & the second word of myDate
on error
-- if you are here in the code, then there was not a four YYYY code and grep spits out some nasty error, which is good because now we can try the 90% pattern for (m)m-dd-yy
try
do shell script "echo " & quoted form of mycat & " | grep -o " & quoted form of mypattern
set myDate to first paragraph of result
-- fix that pesky M format to MM
if the first word of myDate is in {"1", "2", "3", "4", "5", "6", "7", "8", "9"} then
set myDate to "0" & myDate
end if
--fix that even peskier D format to DD
if the second word of myDate is in {"1", "2", "3", "4", "5", "6", "7", "8", "9"} then
set myDay to the first word of myDate
set myMonth to "0" & the second word of myDate
set myYear to the third word of myDate
set myDate to myMonth & "-" & myDay & "-" & myYear
end if
--note to self. there was something special about this line, but by the time you are looking for it, we'll have both forgotten.
--write the preferred date string for record names
set myDate to "20" & the third word of myDate & "-" & the second word of myDate & "-" & the first word of myDate
on error
--at this point, there are no numeric designations of month in a delimited list on your receipt. The very, very intrepid could now begin to search for the more obscure formats by embedding deeper try blocks, for instance, this one right here could look for 'YY MMM dd forms, but well, I get tired of branch logic after a while and figure that we've explained about 98% of U.S. receipt dates to our little expert system with the first two. If you like getting things more precise, add more date pattern tries here.
-- This is to make the failure to find a date stand out.
set myDate to "XX.XX.XXXX"
end try
end try
end repeat
display dialog myDate
end try
end tell