Syntax for using regular expression named capture group substitution

The ICU Regular Expressions docs say that ${name} will be replaced with the text matched by a named capture group.

But when I scan text with this regular expression: .+(?<number>\d++).* and do a “Display Alert” action with the text \1 ${number}, what I see in the alert is the matching digit string followed by, literally, ${number}.

What is the right way to do this?

What are you trying to match?

Are you sure that your syntax is correct? Iirc, (?...) defines a non-capturing group. And what is \d++ supposed to match?

Let me amend the regular expression to be consistent with a change I made. What I'm trying now is:
This is a stripped-down version of a regex that wasn’t working as I expected.

The plain text file I’m applying the rule to contains just this:


The whole rule looks like this:

When I run it, the result is this:

A non-capturing group is denoted by (?:…).
(?<name>…) is a named capture group.
++ matches the preceding expression 1 or more times and is called a possessive quantifier. Like plain +, it matches as many times as possible; unlike +, once it has matched it never gives characters back, so it prevents backtracking.
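
To see the three constructs side by side, here is a minimal sketch in Python. Some assumptions to keep in mind: Python's re module is not ICU, it spells a named group (?P<name>…) rather than (?<name>…), and it only accepts possessive quantifiers from Python 3.11 on, so that part sits behind a version check. The sample strings are made up for illustration.

```python
import re
import sys

# Non-capturing group (?:...): groups the alternation but stores nothing.
non_capturing = re.match(r"(?:ab|cd)(\d+)", "cd42")
print(non_capturing.groups())  # only the digits were captured

# Named group: Python writes (?P<number>...); ICU writes (?<number>...).
# It still receives the next free group number, here 2.
named = re.match(r"(.+?)(?P<number>\d+)", "abc123")
print(named.group(1), named.group("number"), named.group(2))

# Possessive quantifiers need Python 3.11 or later.
possessive_supported = sys.version_info >= (3, 11)
if possessive_supported:
    # Greedy \d+ gives back the final digit so the literal 5 can match.
    greedy = re.match(r"\d+5", "12345")
    # Possessive \d++ keeps every digit, so the literal 5 finds nothing.
    possessive = re.match(r"\d++5", "12345")
    print(greedy is not None, possessive is None)
```

The last pair is the practical difference: the greedy pattern succeeds by backtracking one digit, while the possessive one fails outright.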

Thanks for pointing that out. I never use named capturing groups, so I’m not familiar with the syntax.
The named groups might be accessible by number, too (\2, in this case). Did you try that?

Edit: To answer my own question: using \1 \2 in this context works. This seems to indicate that DT’s regex handling is not yet up to dealing with named capturing groups (it seems they were not available initially in Apple’s RE implementation). Maybe this could be fixed, @cgrunenberg?
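
For comparison, this is how the numbered workaround and the named reference line up in an engine that supports both. It is only a sketch in Python, not DEVONthink's ICU-based handling: Python writes the named group as (?P<number>…) and the named reference as \g<number>, where the ICU docs describe (?<number>…) and ${number}. The pattern and the sample string abc123 are assumptions chosen for illustration.

```python
import re

# Hypothetical pattern: one numbered group followed by one named group.
pattern = re.compile(r"(.+?)(?P<number>\d+)")
sample = "abc123"

# Numbered references: the named group is also group 2.
by_number = pattern.sub(r"\1 \2", sample)

# Named reference: the Python counterpart of ICU's ${number}.
by_name = pattern.sub(r"\1 \g<number>", sample)

print(by_number)
print(by_name)
```

Both calls should produce the same string, which is why falling back to \2 is a workable stopgap when the named reference is not substituted.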

This isn’t supported currently; a future release might improve this.

But why are you trying to use a named capture group? Is it really necessary?

Given this text, what would you expect your RegEx to return?

This is something with a number in it: 5.87
123 precedes words.

The regular expression
should match
abc123 such that $1 is “abc” and ${number} is “123”
123 => $1 is “1” and ${number} is “23” (the 1 matches the non-greedy .+?)
s456 => $1 is “s” and ${number} is “456”
This is something with a number in it: 5.87 => $1 is “This is something with a number in it: 5.” and ${number} is “87”
123 precedes words => $1 is “1” and ${number} is “23” (because the second capturing group is not anchored)

This is a simplified example to show the problem, rather than a real use case.

In the more complex real case, named groups would make understanding the regex and the substitution string much easier.

Without examples that more closely match real-world use, it’s difficult to assess the problem, including whether named groups are necessary.

Whether or not named groups are necessary in my tiny example should have no bearing on whether they work.

I’ve always believed that when trying to pinpoint why something doesn’t work, a minimal reproducer is a good idea. After all, troubleshooting something small and simple is generally easier than troubleshooting something larger and/or more complex. That’s why I trimmed my real use case down to this tiny example: my real use case wasn’t working, so I went after a tiny example that showed the same problem: named capture groups not working.
