Matching whole words that start or end with special characters


I need a regular expression in JavaScript that matches whole words that start or end with special characters.

It was supposed to be easy, but for some reason \b after \? doesn’t behave as I expected:

> /FOO\?/.exec('FOO? ')
[ 'FOO?', index: 0, input: 'FOO? ', groups: undefined ]
> /FOO\?\b/.exec('FOO? ')
null

For instance, if my word is “FOO?” (including the question mark), I want to match:

“FOO? is cool”, “do you think that FOO??”

but not: “FOO is cool”, “FOO?is cool”, “aaFOO?is cool”

It should also work for words that start with “?”. For instance, if my word is “?FOO”, I want to match:

“?FOO is cool”, “I love ?FOO”

but not: “FOO is cool”, “FOO?is cool”, “aaFOO?is cool”

I hope it makes sense.

Answer

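The \b word boundary construct is ambiguous: its meaning depends on the adjacent chars. After a non-word char such as ?, it matches only if a word char follows, which is why /FOO\?\b/ fails before a space. You need to use unambiguous constructs that make sure there are non-word chars or the start/end of the string to the left/right of the matched word.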
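To make the context dependence concrete, here is a minimal sketch that runs the asker's pattern against three inputs. The helper name matchesWithWordBoundary is made up for this illustration and is not part of the question or the answer.

```javascript
// Hypothetical helper (illustration only): tests the asker's pattern /FOO\?\b/.
// \b matches only at a position with a word char (\w) on exactly one side.
function matchesWithWordBoundary(text) {
  return /FOO\?\b/.test(text);
}

// "?" is a non-word char, so \b right after it needs a word char to follow.
console.log(matchesWithWordBoundary('FOO? '));       // false: "?" then space, both non-word
console.log(matchesWithWordBoundary('FOO?'));        // false: "?" then end of string
console.log(matchesWithWordBoundary('FOO?is cool')); // true: "?" then "i", the unwanted case
```

So after ?, \b accepts exactly the input the question wants rejected and rejects the ones it wants matched.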

You may use

/(?:^|\W)\?FOO\?(?!\w)/g

Here, (?:^|\W) is a non-capturing group that matches either the start of a string or any non-word char, a char other than an ASCII letter, digit and _. (?!\w) is a negative lookahead that fails the match if, immediately to the right of the current location, there is a word char.

Or, with ECMAScript 2018 compatible JS environments,

/(?<!\w)\?FOO\?(?!\w)/g

The (?<!\w) is a negative lookbehind that fails the match if there is a word char immediately to the left of the current location.

In code, you may use it directly with String#match to extract all occurrences, like s.match(/(?<!\w)\?FOO\?(?!\w)/g).
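If the word is not known in advance, the same lookaround pattern can be built dynamically. The sketch below is one possible way to do it, assuming an environment with lookbehind support (ECMAScript 2018). The helper names escapeRegExp, wholeWordRegExp and findWholeWord are hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: escape regex metacharacters so the word is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wrap the escaped word in the lookarounds from the answer.
function wholeWordRegExp(word) {
  return new RegExp('(?<!\\w)' + escapeRegExp(word) + '(?!\\w)', 'g');
}

// Hypothetical helper: return all whole-word occurrences, or [] when there are none.
function findWholeWord(text, word) {
  return text.match(wholeWordRegExp(word)) || [];
}

console.log(findWholeWord('do you think that FOO??', 'FOO?')); // [ 'FOO?' ]
console.log(findWholeWord('aaFOO?is cool', 'FOO?'));           // []
console.log(findWholeWord('I love ?FOO', '?FOO'));             // [ '?FOO' ]
```

Because lookarounds consume no characters, the match is the word itself and no capturing group is needed.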

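The first expression consumes the non-word char before the word (if any), so it needs a capturing group around the word you need to extract:

var strs = ["?FOO is cool", "I love ?FOO", "FOO is cool", "FOO?is cool", "aaFOO?is cool"];
// Group 1 holds the word itself; (?:^|\W) may consume one preceding non-word char.
var rx = /(?:^|\W)(\?FOO)(?!\w)/g;
for (var s of strs) {
  var res = [], m;
  // With the g flag, exec() resumes from rx.lastIndex until it returns null.
  while (m = rx.exec(s)) {
    res.push(m[1]);
  }
  console.log(s, "=>", res);
}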


Source: stackoverflow