Extract Variables From String Regex

This might be a repeat question but I’m not sure how to look for the answer 😛 I’m trying to extract and remove variables from a string.

The string might look like this: !text (<123456789>=<@$111111111>) (<7654312> = <@$222222222>) (🛠 =<@$3333333333>) Some text that I will need!

I need the two items in each block, e.g. [["123456789", "111111111"], ["7654312", "222222222"], ["🛠", "3333333333"]]

Then I need the string exactly as it was, but with the variables removed, e.g. Some text that I will need!

I’m not sure of the best way to do this, any help is appreciated.

Answer

You don’t always have to use regexes; for instance, why not write a parser? This gives you much more flexibility. Note that I added <> around the 🛠 for simplicity, but you could make the brackets optional in the parser.
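
To give an idea of what optional brackets could look like, here is a rough standalone sketch of a value reader that accepts either a <...> token or a bare one. The name readValue and its { value, next } return shape are made up for illustration and are not part of the parser in this answer. It also assumes that a bare value ends at whitespace, = or ).

```javascript
// Hypothetical helper: reads one value from `input` starting at index `start`.
// Accepts either a <bracketed> value or a bare token.
function readValue(input, start) {
  let i = start;
  const bracketed = input[i] === '<';
  if (bracketed) i++; // skip the opening '<'
  const from = i;

  // A bracketed value ends at '>'.
  // Assumption: a bare value ends at whitespace, '=' or ')'.
  const isEnd = bracketed ? c => c === '>' : c => /[\s=)]/.test(c);
  while (i < input.length && !isEnd(input[i])) i++;

  if (bracketed && input[i] !== '>') {
    throw new Error(`Expected '>' at column ${i}.`);
  }

  // `next` is the index of the first character after the value.
  return { value: input.slice(from, i), next: bracketed ? i + 1 : i };
}

console.log(readValue('<123456789>=', 0));
console.log(readValue('🛠 =<@$3333333333>', 0));
```

In the parser itself the same check would presumably go at the top of value(), with the closing match('>') only called when an opening < was seen.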

The parser assumes anything that isn’t within () is free text and captures it as string nodes.

For instance, if you wanted only the last text node, you could do:

const endingText = parse(text).filter(t => typeof t === 'string').pop();
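
Along the same lines, the key/value pairs can be separated from the text nodes. A small sketch, assuming parse returns a mixed array of strings and [key, value] arrays as described above. The name splitNodes is made up, and the sample array is written by hand as a stand-in for the parser output so the snippet runs on its own. It also strips the leading @$ from each value, since the expected result in the question leaves it out.

```javascript
// Hypothetical helper, not part of the parser in this answer.
function splitNodes(nodes) {
  // Entries come back as [key, value] arrays; drop the "@$" prefix from values.
  const pairs = nodes
    .filter(Array.isArray)
    .map(([key, val]) => [key, val.replace(/^@\$/, '')]);

  // Free text comes back as plain strings; keep only the last one, trimmed.
  const strings = nodes.filter(n => typeof n === 'string');
  const endingText = (strings.pop() || '').trim();

  return { pairs, endingText };
}

// Hand-written stand-in for what parse(text) is expected to return.
const nodes = [
  '!text ',
  ['123456789', '@$111111111'],
  ' ',
  ['7654312', '@$222222222'],
  ' ',
  ['🛠', '@$3333333333'],
  ' Some text that I will need!',
];

const { pairs, endingText } = splitNodes(nodes);
console.log(pairs);
console.log(endingText);
```

Note that "the last string node" is only the trailing text when the input actually ends with text. If the input ends with an entry, the last string would be whatever came before it.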

const text = '!text (<123456789>=<@$111111111>) (<7654312> = <@$222222222>) (<🛠> =<@$3333333333>) Some text that I will need!';

console.log(parse(text));

function parse(input) {
  let i = 0, char = input[i], text = [];
  const output = [];
  
  while (char) {
    if (char === '(') {
      if (text.length) output.push(text.join(''));
      output.push(entry());
      text = [];
    } else {
      text.push(char);
      consume();
    }
  }
  
  if (text.length) output.push(text.join(''));
  
  return output;
  
  function entry() {
    match('(');
    const key = value();
    whitespace();
    match('=');
    whitespace();
    const val = value();
    match(')');
    return [key, val];
  }
  
  function value() {
    const val = [];
    match('<');
    while (char && char !== '>') val.push(char), consume();
    match('>');
    return val.join('');
  }
  
  function whitespace() {
    while (/\s/.test(char)) consume();
  }
  
  function consume() {
    return char = input[++i];
  }
  
  function match(expected) {
    if (char !== expected) throw new Error(`Expected '${expected}' at column ${i}.`);
    consume();
  }
}
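
For comparison, since the question asked about a regex: if the format stays as rigid as in the example, something along these lines might be enough. This is only a sketch. The name extractWithRegex and the pattern are my own guesses at the format (optional <> around the key, value always written as <@$...>), and dropping the leading !text command is an assumption based on the expected output in the question.

```javascript
// Hypothetical regex-based alternative; the pattern is an assumption about the format.
function extractWithRegex(text) {
  // ( <key> = <@$value> ), with optional <> around the key and optional spaces.
  const ENTRY = /\(\s*<?([^<>=\s)]+)>?\s*=\s*<@\$([^>]+)>\s*\)/gu;

  // Collect [key, value] for every entry found.
  const pairs = [...text.matchAll(ENTRY)].map(m => [m[1], m[2]]);

  // Remove the entries, then the leading "!command" (assumption), then trim.
  const rest = text.replace(ENTRY, '').replace(/^!\w+/, '').trim();

  return { pairs, rest };
}

const sample = '!text (<123456789>=<@$111111111>) (<7654312> = <@$222222222>) (🛠 =<@$3333333333>) Some text that I will need!';
const result = extractWithRegex(sample);
console.log(result.pairs);
console.log(result.rest);
```

The trade-off is the one mentioned at the top: the regex silently skips anything that doesn't match, while the parser can report where the input went wrong.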
User contributions licensed under: CC BY-SA