RegExp matching only the first two entries within a capture group (whatever they happen to be)

I’m currently working on an Adobe InDesign script, part of which is a function that finds measurements and picks them apart. I have a set of regexes that are run first using InDesign’s findGrep() (not really relevant here), and then using plain JavaScript exec() (because I need to do things with capture groups).

Now, I know that there are differences between these two regex engines, so I’ve been working to the capabilities of the much more limited JS engine (I think InDesign’s scripting language is based on ECMAScript 3), but I’ve recently hit a problem that I can’t seem to figure out.

Here’s the regex I’m currently testing (I’ve broken it across several lines to make it a little easier to read):

JavaScript
  • The first line finds numbers formatted in various different ways.
  • The second line is a lookahead that makes sure I’ve reached the end of the numbers.
  • The third line finds any multipliers that refer to that number.
  • The fourth line is supposed to find any modifiers that go before the unit of measurement.
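
The four-part structure described above can be sketched roughly as follows. This is a hypothetical, much simplified stand-in and not the original pattern: the number format, the multiplier words and the sample string are all assumptions made for illustration. The syntax is kept ES3-compatible so it should also be usable in ExtendScript.

```javascript
// Hypothetical stand-in for the four-part pattern; NOT the original regex.
var parts = [
  "(\\d+(?:[.,]\\d+)*)",                     // 1. a number such as 10, 2.5 or 1,000
  "(?!\\d)",                                 // 2. lookahead: no further digit follows
  "(?:[-\\s](thousand|million))?",           // 3. optional multiplier (assumed words)
  "(?:[-\\s](cubic|cu\\.?|square|sq\\.?))?"  // 4. optional modifier before the unit
];

// The parts are strings, so every regex backslash is doubled.
var pattern = new RegExp(parts.join(""), "g");

// Made-up sample text, not the text from the question.
var sampleText = "10 square metres, 2.5 cu. ft and 3 million cubic metres";

var results = [];
var match;
while ((match = pattern.exec(sampleText)) !== null) {
  results.push({ number: match[1], multiplier: match[2], modifier: match[3] });
}
```

In a standards-conforming engine, `results` holds three entries, with the modifiers "square", "cu." and "cubic" respectively, so all four alternatives in the last group are reachable.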

This is the sample text I was testing it on.

JavaScript

Now, when I run the regex using InDesign’s findGrep(), it works as expected. When I run it using exec(), however, it does something odd. It matches the numbers and the multipliers just fine, but only “cubic” and “cu” get matched; the “square” and “sq” text is ignored.

To make things more baffling, if I reverse the order of these entries in the regex capture group (so it’s (?:[-\s](square|sq\.?|cubic|cu\.?))? instead), then it only matches “square” and “sq” and not “cubic” and “cu”.

Am I missing something really obvious here? I’m a JavaScript newbie, but I’ve been working with regular expressions in XSLT for years.

JavaScript

EDIT:

So, here’s the code as I’m trying to run it right now.

JavaScript

If I try to run this on my machine, using the InDesign script, it fails to find anything with “square” or “sq”, and when I run it in the code snippet view here it just freezes up. I’m guessing this has something to do with storing regexes as strings, yes?

Answer

I’m not sure I understand you correctly. If you want your second snippet to work in about the same way as your first one, you probably just need to add "gm" as the flags argument of the RegExp constructor:

JavaScript

JavaScript

It gives me this output:

JavaScript
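
For readers without the original snippets, here is a minimal sketch of why the flags matter when a pattern is built from a string. The pattern, the sample text and the `findAll` helper are hypothetical and not the code from the question. Without the "g" flag, exec() never advances `lastIndex`, so a `while` loop around it keeps getting the same first match, which would explain a freeze. The "m" flag only changes how `^` and `$` behave, so it is harmless here but not what fixes the loop.

```javascript
// Hypothetical pattern stored as a string. Every regex backslash is doubled,
// because the string literal consumes one level of escaping.
var source = "(\\d+)(?:[-\\s](cubic|cu\\.?|square|sq\\.?))?";

// Made-up sample text.
var text = "10 square metres\n20 cu. ft\n30 sq m";

// Hypothetical helper: collects [number, modifier] pairs for every match.
function findAll(src, flags, input) {
  var re = new RegExp(src, flags);
  var found = [];
  var m;
  // With "g", each exec() call resumes at re.lastIndex and eventually
  // returns null. Without "g", it would return the first match forever.
  while ((m = re.exec(input)) !== null) {
    found.push([m[1], m[2]]);
    if (!re.global) { break; } // safety net: stop after one match if "g" is missing
  }
  return found;
}

var all = findAll(source, "gm", text);     // three matches
var firstOnly = findAll(source, "", text); // only the first match
```

The `re.global` check is only there so the sketch cannot hang; in a real script it is usually simpler to always pass "g" when looping over exec().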

Update

I’ve replaced (cubic|cu\.?|square|sq\.?) with (cubic|cu\.|cu|square|sq\.|sq) and it seems to work in InDesign now:

JavaScript

Probably the ? quantifiers inside an alternation like (foo|bar) are too much for InDesign’s scripting engine.
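
As a rough check that the rewrite does not change what gets matched, the two alternations can be compared in a standards-conforming engine. The sample string and the `matches` helper below are made up for illustration. Note that in the spelled-out form `cu\.` has to come before `cu` (and `sq\.` before `sq`), otherwise the shorter alternative wins and the trailing dot is left out of the capture.

```javascript
// Original alternation with optional dots, and the spelled-out workaround.
var withQuantifier = /(cubic|cu\.?|square|sq\.?)/g;
var spelledOut = /(cubic|cu\.|cu|square|sq\.|sq)/g;

// Made-up sample covering every variant.
var sample = "cubic cu. cu square sq. sq";

// Hypothetical helper: returns the first capture group of every match.
function matches(re, input) {
  var out = [];
  var m;
  re.lastIndex = 0; // start from the beginning in case the regex was reused
  while ((m = re.exec(input)) !== null) {
    out.push(m[1]);
  }
  return out;
}

var a = matches(withQuantifier, sample);
var b = matches(spelledOut, sample);
var same = a.join("|") === b.join("|"); // true in a conforming engine
```

If the two lists differ when run inside InDesign, that would point to the engine rather than the pattern.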

User contributions licensed under: CC BY-SA