Prevent regex from becoming greedy when using optional tokens?

Question

I'm trying to use regex to extract information from different strings. For example, I have the following JSON: and want to write a regex that extracts into capture groups (1) the text up to the colon, (2) the text up to the comma, (3) the comma if it exists, and (4) the text after the comma. Starting with the comma being

Accepted Answer

Let the second group capture anything that is neither a comma nor a line break:

(.*?): ([^,\n\r]*)(,?)(.*?)\n

Note that your regex requires the line to end with \n. This may be too strict, as the last line of a text might not terminate with \n, and some texts use \r or \r\n as the line break. You might want to use the $ anchor instead: it does not capture the line break, it only requires that the match ends at one. Use it with the m (multiline) modifier.
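As a rough illustration of the suggestion above, here is a minimal Python sketch. The sample text is made up (the question's original input is not shown), and Python's re.MULTILINE flag stands in for the m modifier. The pattern is the one from the answer with the trailing \n swapped for $.

```python
import re

# Hypothetical sample input; the data from the original question is not shown.
# The last line deliberately has no trailing line break.
text = "name: Alice, Smith\nage: 42\ncity: Paris, France"

# Group 2 excludes commas and line breaks, so it stops before the optional
# comma in group 3 instead of swallowing it. The $ anchor with re.MULTILINE
# replaces the literal \n at the end of the pattern.
pattern = re.compile(r"(.*?): ([^,\n\r]*)(,?)(.*?)$", re.MULTILINE)

rows = [m.groups() for m in pattern.finditer(text)]
for row in rows:
    print(row)
```

With this input, the line without a comma yields empty strings for groups 3 and 4, and the final line still matches even though it has no line break after it. Note that group 4 keeps the space that follows the comma; trim it afterwards if that matters.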

Answer