
How to convert a string of camelCase identifiers to a string with space-separated words, while replacing the separator?

I have studied the answers to “how to use regular expressions to insert space into a camel case string” and several related questions, and the code below produces the string

Normal Word Double Word A Triple Word UPPER Case Word

Unfortunately, I need a separator wherever {TOKEN} appears in the input. Ideally, the result would have comma separators:

Normal Word, Double Word, A Triple Word, UPPER Case Word

Is there a way to do that with a single regex? (It would be okay for the regex replacement to result in a string with a leading comma.)

Here’s the code that I have so far:

const regex = /({TOKEN})|([A-Z])(?=[A-Z][a-z])|([a-z])(?=[A-Z])/g;
const str = '{TOKEN}NormalWord{TOKEN}DoubleWord{TOKEN}ATripleWord{TOKEN}UPPERCaseWord';
const subst = '$2$3 ';

const result = str.replace(regex, subst);


Answer

It is not pretty, but you can do it with a single regex by passing a callback as the replacement:

const regex = /(^(?:{TOKEN})+|(?:{TOKEN})+$)|{TOKEN}|([A-Z])(?=[A-Z][a-z])|([a-z])(?=[A-Z])/g;
const str = '{TOKEN}NormalWord{TOKEN}DoubleWord{TOKEN}ATripleWord{TOKEN}UPPERCaseWord';
const result = str.replace(regex, (g0, g1, g2, g3) =>
  g1 ? "" : g2 ? `${g2} ` : g3 ? `${g3} ` : ", "
);
console.log(result); // => Normal Word, Double Word, A Triple Word, UPPER Case Word

The (^(?:{TOKEN})+|(?:{TOKEN})+$) alternative captures any run of {TOKEN}s at the start or end of the string into Group 1, and the callback removes that run completely (see g1 ? "" in the replacement callback). A bare {TOKEN} anywhere else matches without setting any group, so the callback falls through to its last branch and replaces it with a comma and a space. The remaining alternatives are the same as in the original regex.
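
To see how the alternatives interact, here is a small sketch that wraps the same regex and callback in a function and runs it on a few edge cases. The name convert and the sample inputs are only illustrative; they are not part of the question.

```javascript
// Same regex and callback as in the answer, wrapped in a helper (hypothetical name).
const regex = /(^(?:{TOKEN})+|(?:{TOKEN})+$)|{TOKEN}|([A-Z])(?=[A-Z][a-z])|([a-z])(?=[A-Z])/g;
const convert = (s) =>
  s.replace(regex, (g0, g1, g2, g3) =>
    g1 ? "" : g2 ? `${g2} ` : g3 ? `${g3} ` : ", "
  );

// Runs of tokens at either end of the string are removed entirely.
console.log(convert('{TOKEN}{TOKEN}FooBar{TOKEN}')); // => Foo Bar

// A token between words becomes ", ".
console.log(convert('FooBar{TOKEN}BazQux')); // => Foo Bar, Baz Qux

// Caveat: a run of tokens in the middle yields one separator per token.
console.log(convert('A{TOKEN}{TOKEN}B')); // => A, , B
```

If repeated tokens in the middle of the input should collapse to a single separator, changing the bare {TOKEN} alternative to (?:{TOKEN})+ would probably be enough, but that goes beyond what the question asked for.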

Note that in the callback, g0 stands for Group 0 (the whole match), g1 for Group 1, etc.
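
If the separator is not always the literal {TOKEN}, the same pattern can be built at runtime. The following is a minimal sketch, assuming the token is passed in as a plain string; camelToList, escapeRegExp and the parameter names edge, upper and lower are hypothetical names chosen for readability, and the escaping helper is the commonly used pattern for quoting regex metacharacters.

```javascript
// Hypothetical helper: escape regex metacharacters so the token is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: same alternation as the answer, with a configurable token.
function camelToList(input, token = '{TOKEN}') {
  const t = escapeRegExp(token);
  const regex = new RegExp(
    `(^(?:${t})+|(?:${t})+$)|${t}|([A-Z])(?=[A-Z][a-z])|([a-z])(?=[A-Z])`,
    'g'
  );
  // Callback arguments: whole match, then Groups 1 to 3 under descriptive names.
  return input.replace(regex, (match, edge, upper, lower) => {
    if (edge) return '';            // tokens at the start or end: drop
    if (upper) return `${upper} `;  // end of an acronym, e.g. "UPPERCase"
    if (lower) return `${lower} `;  // lower-to-upper boundary, e.g. "lW"
    return ', ';                    // any other token: separator
  });
}

console.log(camelToList('{TOKEN}NormalWord{TOKEN}DoubleWord{TOKEN}ATripleWord{TOKEN}UPPERCaseWord'));
// => Normal Word, Double Word, A Triple Word, UPPER Case Word
console.log(camelToList('|FooBar|BazQux|', '|'));
// => Foo Bar, Baz Qux
```

Escaping the token also makes the braces explicit literals, which matters if the regex is ever compiled with the u flag, where an unescaped { is a syntax error.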
