I know there are easier ways to get file extensions with JavaScript, but partly to practice my regexp skills I wanted to try and use a regular expression to split a filename into two strings, before and after the final dot (. character).
Here’s what I have so far:
const myRegex = /^((?:[^.]+(?:\.)*)+?)(\w+)?$/;

const [, filename1, extension1] = 'foo.baz.bing.bong'.match(myRegex);
// filename1 = 'foo.baz.bing.'
// extension1 = 'bong'

const [, filename2, extension2] = 'one.two'.match(myRegex);
// filename2 = 'one.'
// extension2 = 'two'

const [, filename3, extension3] = 'noextension'.match(myRegex);
// filename3 = 'noextension'
// extension3 = undefined
I’ve tried to use a lookahead to say ‘only match a literal . if it’s followed by a word that ends in a .’, by changing (?:\.)* to (?:\.(?=\w+\.))*:
/^((?:[^.]+(?:\.(?=(\w+\.))))*)(\w+)$/gm
But I want to exclude that final period using just the regexp, and preferably have ‘noextension’ be matched in the initial group. How can I do that with just a regexp?
Here is my regexp scratch file: https://regex101.com/r/RTPRNU/1
Answer
For the first capture group, you could start the match with 1 or more word characters. Then optionally repeat a . and again 1 or more word characters.
Then you can use an optional non capture group matching a . and capturing 1 or more word characters in group 2.
As the second non capture group is optional, the first repetition should be non-greedy.
^(\w+(?:\.\w+)*?)(?:\.(\w+))?$
The pattern matches:
^  Start of string
(  Capture group 1
\w+(?:\.\w+)*?  Match 1+ word characters, and optionally repeat a . and 1+ word characters
)  Close group 1
(?:  Non capture group to match as a whole
\.(\w+)  Match a . and capture 1+ word chars in capture group 2
)?  Close non capture group and make it optional
$  End of string
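To see why the repetition inside group 1 has to be non-greedy, here is a small sketch that runs the same pattern with `*?` and with `*`. The `groups` helper and the variable names are made up for this illustration and are not part of the original answer.

```javascript
// Same pattern twice; only the quantifier on the repeated group differs.
const lazy = /^(\w+(?:\.\w+)*?)(?:\.(\w+))?$/;  // *? : as few repetitions as possible
const greedy = /^(\w+(?:\.\w+)*)(?:\.(\w+))?$/; // *  : as many repetitions as possible

// Hypothetical helper for this illustration: returns [group 1, group 2],
// or null when the pattern does not match at all.
function groups(regex, s) {
  const m = s.match(regex);
  return m ? [m[1], m[2]] : null;
}

for (const s of ["foo.baz.bing.bong", "one.two", "noextension"]) {
  console.log(s, "| lazy:", groups(lazy, s), "| greedy:", groups(greedy, s));
}
```

For these inputs the greedy variant lets group 1 consume the whole name, so the optional extension group never takes part and group 2 stays `undefined`. The non-greedy variant stops as early as it can, which leaves the last `.` and word for group 2.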
const regex = /^(\w+(?:\.\w+)*?)(?:\.(\w+))?$/;
[
  "foo.baz.bing.bong",
  "one.two",
  "noextension"
].forEach(s => {
  const m = s.match(regex);
  if (m) {
    console.log(m[1]);
    console.log(m[2]);
    console.log("----");
  }
});

Another option, as @Wiktor Stribiżew posted in the comments, is to use a non-greedy dot to match any character for the filename:
^(.*?)(?:\.(\w+))?$
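As a rough sketch of how the two patterns differ in practice, the snippet below wraps them in a small helper. The name `splitFilename`, its return shape, and the choice to default a missing extension to an empty string are all assumptions made for this example, not something from the answer or the comment.

```javascript
const wordRegex = /^(\w+(?:\.\w+)*?)(?:\.(\w+))?$/; // word characters only
const lazyDotRegex = /^(.*?)(?:\.(\w+))?$/;         // any characters before the extension

// Hypothetical helper: returns { base, extension }, or null if there is no match.
function splitFilename(name, regex = lazyDotRegex) {
  const m = name.match(regex);
  if (!m) return null;
  // m[2] is undefined when the optional extension group did not take part;
  // defaulting it to "" is an assumption made here for convenience.
  const [, base, extension = ""] = m;
  return { base, extension };
}

console.log(splitFilename("foo.baz.bing.bong"));         // { base: 'foo.baz.bing', extension: 'bong' }
console.log(splitFilename("noextension"));               // { base: 'noextension', extension: '' }
console.log(splitFilename("my-file.tar.gz"));            // { base: 'my-file.tar', extension: 'gz' }
console.log(splitFilename("my-file.tar.gz", wordRegex)); // null
```

The last call returns null because `\w` does not match the hyphen, so the word-character pattern rejects the whole name, while the lazy dot accepts any characters before the final extension.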