I am trying to write a regex pattern that looks through a sentence and removes any sequentially repeated word or sequentially repeated pair of words.
For example:
# R code below
string_a = "hello hello, how are you you?"
string_b = "goodbye world goodbye world, I am flying to the the moon!"
gsub(pattern, "", string_a)
gsub(pattern, "", string_b)
The desired outputs are:
[1] "hello, how are you?"
[2] "goodbye world, I am flying to the moon!"
Answer
Try
gsub("(\\S+(\\s+\\S+)?)\\s+\\1+", "\\1", c(string_a, string_b))
Output:
[1] "hello, how are you?"
[2] "goodbye world, I am flying to the moon!"
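One possible refinement, offered as a sketch rather than as part of the answer: the pattern above has no word boundaries, so it may also collapse a word followed by a longer word that starts with it (for instance "the theory"). The variant below assumes PCRE via `perl = TRUE` and uses `\\b` and `\\w+` so that only whole words are compared. The helper name `dedupe_words` is hypothetical, and `\\w` only covers letters, digits, and underscores, so words with apostrophes or hyphens are treated differently than with `\\S+`.

```r
# Hypothetical helper (not from the answer): collapse a sequentially repeated
# word, or a sequentially repeated pair of words, matching whole words only.
dedupe_words <- function(x) {
  # \\b                    start at a word boundary
  # (\\w+(?:\\s+\\w+)?)    capture one word, optionally followed by a second word
  # \\s+\\1                whitespace, then the captured text again
  # \\b                    the repeat must also end at a word boundary
  gsub("\\b(\\w+(?:\\s+\\w+)?)\\s+\\1\\b", "\\1", x, perl = TRUE)
}

string_a <- "hello hello, how are you you?"
string_b <- "goodbye world goodbye world, I am flying to the the moon!"

dedupe_words(c(string_a, string_b))

# "the" is not repeated as a whole word here, so the input is left unchanged
dedupe_words("the theory")
```

On the two example strings this should give the same results as the answer's pattern; the difference only shows up when a repeat would otherwise begin or end in the middle of a word.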