I have a field in my application where users can enter a hashtag. I want to validate the entry and make sure it would be a proper hashtag. It can be in any language, and it should NOT be preceded by the # sign. I am writing in JavaScript.
So the following are GOOD examples:
- Abcde45454_fgfgfg (good because: only letters, numbers and _)
- 2014_is-the-year (good because: only letters, numbers, _ and -)
- בר_רפאלי (good because: only letters and _)
- арбуз (good because: only letters)
And the following are BAD examples:
- Dan Brown (Bad because has a space)
- OMG!!!!! (Bad because has !)
- בר רפ@לי (Bad because has @ and a space)
We had a regex that matched only a-zA-Z0-9. We needed to add language support, so we changed it to exclude white space but forgot to exclude special characters, so here I am.
Some other StackOverflow examples I saw but didn’t work for me:
[edit]
- Added explanation why bad is bad and good is good
- I don’t want a preceding # character, but if a # were added at the beginning, it should still be a valid hashtag
- Basically I don’t want to allow any special characters like !@#$%^&*()=+./,[{]};:'”?><
Answer
If your disallowed characters list is thorough (!@#$%^&*()=+./,[{]};:'"?><), then the regex is:
^#?[^\s!@#$%^&*()=+./,\[{\]};:'"?><]+$
This allows an optional leading # sign: #?. It disallows the special characters using a negated character class. I just added \s to the list (whitespace), and I also escaped [ and ].
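To check the pattern against the examples from the question, here is a minimal sketch. The helper name isValidHashtag and the samples array are only for illustration and are not part of any library.

```javascript
// Negated character class from the answer, with the backslashes written out.
const HASHTAG_RE = /^#?[^\s!@#$%^&*()=+./,\[{\]};:'"?><]+$/;

// isValidHashtag is a hypothetical helper name used for illustration.
function isValidHashtag(input) {
  return HASHTAG_RE.test(input);
}

// Examples taken from the question: the first four should pass,
// the last three should fail.
const samples = [
  "Abcde45454_fgfgfg",
  "2014_is-the-year",
  "בר_רפאלי",
  "арбуз",
  "Dan Brown",
  "OMG!!!!!",
  "בר רפ@לי",
];

for (const s of samples) {
  console.log(JSON.stringify(s), isValidHashtag(s));
}
```

Keep in mind that a blacklist only rejects what is on the list. Characters that were not listed, such as ~ or |, still pass this check.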
Unfortunately, JavaScript regexes without the u flag don’t support constructs like \p{P} (Unicode punctuation), so in engines that predate ES2018 you basically have to blacklist characters or take a different approach if the regex solution isn’t good enough for your needs.
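Since ES2018, JavaScript engines accept Unicode property escapes such as \p{L} when the pattern carries the u flag, which makes a whitelist possible as one such different approach. The following is a sketch under that assumption, not a drop-in replacement: the allowed set (letters, combining marks, numbers, _ and -) is my reading of the examples in the question, and the helper name isValidHashtagUnicode is hypothetical.

```javascript
// Whitelist sketch: letters, combining marks, numbers, underscore and hyphen.
// Assumes an ES2018+ engine; the `u` flag is what enables \p{...}.
const UNICODE_HASHTAG_RE = /^#?[\p{L}\p{M}\p{N}_-]+$/u;

// isValidHashtagUnicode is a hypothetical helper name used for illustration.
function isValidHashtagUnicode(input) {
  return UNICODE_HASHTAG_RE.test(input);
}

console.log(isValidHashtagUnicode("2014_is-the-year"));
console.log(isValidHashtagUnicode("арбуз"));
console.log(isValidHashtagUnicode("OMG!!!!!"));
```

Unlike the blacklist, this version rejects anything that was not explicitly allowed, so a character such as ~ fails without having to be listed.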