Smart Rule Regex

Jmedialis · November 10, 2021, 1:00am

I’m having difficulty with the following Smart Rule. Actions include a Scan Text > Regex followed by Change Name with a capture group. The regex:

$(\d+.\d+)
This

… applied on the following text:

FORMS
Form F
$15.00

DOWNLOAD DOCUMENTS
Insurance Summary to March 2022
$0.50

Subtotal:
$15.50
GST:
$0.78
Payment method:
Credit card (Visa/Mastercard/Amex)
Total:
$16.28
This message was sent to

… fails to run.

Changing the regex to $(\d+.\d+) (without the line break followed by “This”) returns 15.00. Adding to my confusion, using other capture groups (\2, \3 etc) also renders the action inoperable.

Any thoughts why this isn’t working?

Thank you
J

chrillek · November 10, 2021, 11:55am

The $ is special in a RE: it stands for the end of the string, which makes the whole RE nonsensical.

\$ is what you’re looking for.

Edit: I also recommend heading over to regex101.com if a RE is not doing what you think it should be doing. There you can test it against your input and see exactly where it fails to match. Very helpful!
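
To illustrate the difference between $ and \$, here is a minimal sketch using Python's re module as a stand-in. DEVONthink itself uses Cocoa's NSRegularExpression, so details may differ, but the meaning of $ as an anchor is the same in both. The variable names are made up for the example, and the text is an excerpt of the sample from the first post.

```python
import re

# Excerpt of the sample text from the first post.
text = "FORMS\nForm F\n$15.00\n\nDOWNLOAD DOCUMENTS\n"

# Unescaped: $ anchors the end of the string, so no digits can follow it.
unescaped = re.search(r"$(\d+.\d+)", text)

# Escaped: \$ matches a literal dollar sign.
escaped = re.search(r"\$(\d+\.\d+)", text)

print(unescaped)         # None
print(escaped.group(1))  # 15.00
```

The pattern contains only one pair of parentheses, so \1 is the only capture group available to a later action.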

BLUEFROG · November 10, 2021, 3:10pm

In addition to @chrillek’s advice, have you tried using the Document Amount attribute?

And here’s a quick smart rule adding that amount to the name…

It’s not as controllable as a RegEx, but it certainly is an option.

Jmedialis · November 10, 2021, 6:40pm

That’s a great resource, thanks @chrillek. As you suggested I did some tests. \$(\d+.\d+)\nThis returns exactly what I need. Curiously the same regex pattern in a Smart Rule operating on the same text doesn’t work.

I hadn’t tried Document Amount until you mentioned it, @BLUEFROG . Works like a charm! I suppose it pulls the last $1234.56 match by default. Very handy and fixed my immediate problem. I’m still curious though… why isn’t that regex working?

Thank you
J

Blanc · November 10, 2021, 7:10pm

In the screenshot you have an additional \ before the . Is that intentional? (I haven’t read this thread from the top, just noticed in passing.)

Jmedialis · November 10, 2021, 7:25pm

Yes the regex in the screenshot is correct: \$(\d+\.\d+)\nThis

(Typing it in the message box here seems to be eating my escapes before $ and .)

chrillek · November 10, 2021, 8:45pm

Indeed: in this special case, the unescaped dot would have worked as well, I think. But in general @Blanc: . matches any character (in a RE). In order to match a dot, you have to escape it.
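
A small sketch of that point, again using Python's re module as a stand-in for the Cocoa engine. The input "$15x00" is a made-up string, not something from the thread.

```python
import re

loose = re.compile(r"\$(\d+.\d+)")    # . matches any character
strict = re.compile(r"\$(\d+\.\d+)")  # \. matches only a literal dot

sample_ok = "$15.00"
sample_odd = "$15x00"  # hypothetical input, chosen to show the difference

# On a well-formed amount both patterns capture the same thing.
loose_ok = loose.search(sample_ok).group(1)
strict_ok = strict.search(sample_ok).group(1)

# On the odd input only the unescaped dot still matches.
loose_odd = loose.search(sample_odd).group(1)
strict_odd = strict.search(sample_odd)

print(loose_ok, strict_ok)    # 15.00 15.00
print(loose_odd, strict_odd)  # 15x00 None
```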

As to why the RE does not work in the smart rule: @cgrunenberg should weigh in on that.

Blanc · November 10, 2021, 9:22pm

Thanks. That much RE I actually know, including that \ escapes the following char (all thanks to helpful folk on this forum, yourself very much included). I was alluding to the fact that the RE which the OP said worked was in fact not identical to the RE then used in the smart rule. The explanation was that the forum software was hungry for escapes.

BLUEFROG · November 10, 2021, 9:51pm

You’re welcome.

I’m not seeing an issue with that RegEx in a smart rule.

Jmedialis · November 10, 2021, 10:46pm

All works well in the case when running on a plain text or RTF file (with the previously mentioned content).

I think the issue is stemming from the fact that, in practice, the files operated on by this Rule are emails. Maybe the underlying text in the email (which the Rule is reading) is different from that rendered for display to the user in Devon?
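
One way this could happen, sketched in Python as a stand-in for the Cocoa engine: if the text read from the email had different line endings or a trailing space before the line break, \nThis would no longer match. Both variants below are assumptions made up for illustration; nothing in this thread confirms what the email text actually contains. The tolerant pattern using \s* is likewise only a suggestion.

```python
import re

pattern = re.compile(r"\$(\d+\.\d+)\nThis")

# The tail of the sample text as it appears in a plain text file.
plain = "Total:\n$16.28\nThis message was sent to"

# Hypothetical variants of the same text (assumptions, not observed data).
crlf = plain.replace("\n", "\r\n")                             # CRLF line endings
trailing_space = "Total:\n$16.28 \nThis message was sent to"  # space before the break

strict_results = [pattern.search(t) for t in (plain, crlf, trailing_space)]

# A more tolerant pattern: allow any run of whitespace before "This".
tolerant = re.compile(r"\$(\d+\.\d+)\s*This")
tolerant_results = [tolerant.search(t).group(1) for t in (plain, crlf, trailing_space)]

print([m.group(1) if m else None for m in strict_results])  # ['16.28', None, None]
print(tolerant_results)                                     # ['16.28', '16.28', '16.28']
```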

cgrunenberg · November 11, 2021, 7:37am

DEVONthink just uses Cocoa’s regular expression support; see the tables on the NSRegularExpression page in the Apple Developer Documentation.

The email’s conversion to plain text should be the right reference for testing the regex.