Regex to split a string into args without breaking the quoted text

I want to split a string into an array of words, but I don't want to break up phrases that are enclosed in double quotes.

My code:

const content = 'this is a simple text that i "need to split into" arguments'
const args = content.split(/ +/g)
console.log(args)

// Result: ['this', 'is', 'a', 'simple', 'text', 'that', 'i', '"need', 'to', 'split', 'into"', 'arguments']

What I need as a result:

// Result: ['this', 'is', 'a', 'simple', 'text', 'that', 'i', 'need to split into', 'arguments']

Answer

One simple approach is to use string match() with the regex pattern ".*?"|\w+. At each position, this pattern first tries to match a double-quoted term. If that fails, it matches a single word. Because the quoted alternative is tried first, words that appear inside double quotes are never consumed on their own. The surrounding quotes are then stripped from each match.

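var content = 'this is a simple text that i "need to split into" arguments';
// Match either a double-quoted phrase (lazily) or a single word (\w+).
var matches = content.match(/".*?"|\w+/g);
for (var i=0; i < matches.length; ++i) {
    // Strip the surrounding double quotes from quoted phrases.
    matches[i] = matches[i].replace(/^"(.*)"$/, "$1");
}
console.log(matches);
// Result: ['this', 'is', 'a', 'simple', 'text', 'that', 'i', 'need to split into', 'arguments']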
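
Note that a word alternative based on \w only matches letters, digits, and underscores, so tokens containing punctuation (for example "don't" or "well-known") would be split or partly dropped. As a possible variation, and not part of the answer above, the sketch below uses a capture group to pick up the quoted text without its quotes and \S+ for everything else, which removes the need for the separate replace() pass. The function name splitArgs is hypothetical, and the sketch assumes quotes are balanced and always begin a token.

```javascript
// Hypothetical helper (not from the original answer): split a string into
// arguments while keeping double-quoted phrases together.
function splitArgs(input) {
  // Alternative 1: a double-quoted phrase, captured without its quotes.
  // Alternative 2: any run of non-whitespace characters.
  const pattern = /"([^"]*)"|(\S+)/g;
  const args = [];
  for (const match of input.matchAll(pattern)) {
    // match[1] is defined for quoted phrases, match[2] for plain tokens.
    args.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return args;
}

const content = 'this is a simple text that i "need to split into" arguments';
console.log(splitArgs(content));
// Result: ['this', 'is', 'a', 'simple', 'text', 'that', 'i', 'need to split into', 'arguments']
```

Unlike match(), which returns null when nothing matches, this version returns an empty array for an empty or whitespace-only string.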