I need a regular expression in JavaScript that matches whole words that start or end with special characters
It was supposed to be easy, but for some reason \b after \? doesn’t behave as I expected:
> /FOO\?/.exec('FOO? ')
[ 'FOO?', index: 0, input: 'FOO? ', groups: undefined ]
> /FOO\?\b/.exec('FOO? ')
null
For instance, if my word is “FOO?” (including the question mark), I want to match:
“FOO? is cool”, “do you think that FOO??”
but not: “FOO is cool”, “FOO?is cool”, “aaFOO?is cool”
It should also work for words that start with “?”. For instance, if my word is “?FOO”, I want to match:
“?FOO is cool”, “I love ?FOO”
but not: “FOO is cool”, “FOO?is cool”, “aaFOO?is cool”
I hope it makes sense.
Answer
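The \b word boundary construct is ambiguous: its meaning depends on context. It matches only between a word char and a non-word char (or the start/end of the string), so after a non-word char such as ? it requires a word char to follow. You need to use unambiguous constructs that make sure there are non-word chars or the start/end of the string to the left/right of the word matched.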
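A quick check in a Node.js or browser console illustrates this behavior. This is only a minimal sketch; the variable names and the extra sample strings are made up for illustration:

```javascript
// \b matches only between a word char (\w) and a non-word char (or a string edge).
// "?" is a non-word char, so \b right after \? needs a word char to come next.
const followedBySpace = /FOO\?\b/.test('FOO? ');   // "?" then " ": two non-word chars, no boundary
const followedByLetter = /FOO\?\b/.test('FOO?is'); // "?" then "i": a boundary exists here
const atEndOfString = /FOO\?\b/.test('FOO?');      // "?" then end of string: no boundary

console.log(followedBySpace, followedByLetter, atEndOfString); // false true false
```

In other words, \b after \? accepts exactly the case the question wants to reject (“FOO?is cool”) and rejects the case it wants to accept.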
You may use
/(?:^|\W)\?FOO\?(?!\w)/g
Here, (?:^|\W) is a non-capturing group that matches either the start of the string or any non-word char (a char other than an ASCII letter, digit, or _). (?!\w) is a negative lookahead that fails the match if there is a word char immediately to the right of the current location.
Or, in ECMAScript 2018 compatible JS environments,
/(?<!\w)\?FOO\?(?!\w)/g
The (?<!\w) is a negative lookbehind that fails the match if there is a word char immediately to the left of the current location.
In code, you may use it directly with String#match to extract all occurrences, like s.match(/(?<!\w)\?FOO\?(?!\w)/g).
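As a rough illustration, the lookbehind approach can be tried on the sample strings from the question, here adapted to the word “FOO?”. The names samples, fooQuestion and results are just placeholders, and the sketch assumes an engine with ES2018 lookbehind support:

```javascript
// Sample strings taken from the question; the first two should match, the rest should not.
const samples = [
  'FOO? is cool',
  'do you think that FOO??',
  'FOO is cool',
  'FOO?is cool',
  'aaFOO?is cool',
];

// No word char immediately before "FOO?" and none immediately after it.
const fooQuestion = /(?<!\w)FOO\?(?!\w)/g;

// String#match with a /g regex returns an array of matches, or null when there are none.
const results = samples.map((s) => s.match(fooQuestion) || []);

samples.forEach((s, i) => console.log(s, '=>', results[i]));
```

Because the lookarounds do not consume any text, each match is exactly the word itself and no capturing group is needed.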
The first expression needs a capturing group around the word you need to extract, because (?:^|\W) consumes the preceding non-word char as part of the match:
var strs = ["?FOO is cool", "I love ?FOO", "FOO is cool", "FOO?is cool", "aaFOO?is cool"];
var rx = /(?:^|\W)(\?FOO)(?!\w)/g; // group 1 holds the word without the preceding non-word char
for (var s of strs) {
  var res = [], m;
  while (m = rx.exec(s)) {
    res.push(m[1]);
  }
  console.log(s, "=>", res);
}
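If the word is not known in advance, the pattern could be built dynamically. The sketch below is one possible way to do that, not part of the original answer: escapeRegExp and wholeWordRegex are hypothetical helper names, and the lookbehind again assumes an ES2018 compatible engine.

```javascript
// Hypothetical helper: escape every regex metacharacter in a literal string.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a "whole word" regex for an arbitrary literal word.
// The word must not have a word char immediately before or after it.
function wholeWordRegex(word) {
  return new RegExp(`(?<!\\w)${escapeRegExp(word)}(?!\\w)`, 'g');
}

console.log('I love ?FOO'.match(wholeWordRegex('?FOO')));   // [ '?FOO' ]
console.log('FOO? is cool'.match(wholeWordRegex('FOO?')));  // [ 'FOO?' ]
console.log('aaFOO?is cool'.match(wholeWordRegex('FOO?'))); // null
```

Escaping matters here because characters such as ? would otherwise be read as quantifiers rather than as literal text.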