What regular expression would extract the country name when used with any of the lines below?
I’ve got a dropdown with all of these as choices and I’m trying to extract the country only, but I’m failing miserably since JavaScript doesn’t seem to support lookbehinds and I have no idea how to exclude the emoji part otherwise. (Not to mention that special characters such as that Å in Åland Islands don’t make it any easier.)
Thanks!
🇦🇫 Afghanistan +93
🇦🇽 Åland Islands +358
🇦🇱 Albania +355
🇩🇿 Algeria +213
🇦🇸 American Samoa +1684
🇦🇩 Andorra +376
🇦🇴 Angola +244
🇦🇮 Anguilla +1264
🇦🇬 Antigua & Barbuda +1268
🇦🇷 Argentina +54
🇦🇲 Armenia +374
🇦🇼 Aruba +297
🇦🇺 Australia +61
🇦🇹 Austria +43
🇦🇿 Azerbaijan +994
🇧🇸 Bahamas +1242
🇧🇭 Bahrain +973
🇧🇩 Bangladesh +880
🇧🇧 Barbados +1246
🇧🇾 Belarus +375
🇧🇪 Belgium +32
🇧🇿 Belize +501
🇧🇯 Benin +229
🇧🇲 Bermuda +1441
🇧🇹 Bhutan +975
🇧🇴 Bolivia +591
🇧🇦 Bosnia & Herzegovina +387
🇧🇼 Botswana +267
🇧🇷 Brazil +55
🇮🇴 British Indian Ocean Territory +246
🇻🇬 British Virgin Islands +1284
🇧🇳 Brunei +673
🇧🇬 Bulgaria +359
🇧🇫 Burkina Faso +226
🇧🇮 Burundi +257
🇰🇭 Cambodia +855
🇨🇲 Cameroon +237
🇨🇦 Canada +1
🇨🇻 Cape Verde +238
🇳🇱 Carribbean Netherlands +599
🇰🇾 Cayman Islands +1345
🇨🇫 Central African Republic +236
🇹🇩 Chad +235
🇨🇱 Chile +56
🇨🇳 China +86
🇨🇽 Christmas Islands +61
🇨🇨 Cocos Islands +61
🇨🇴 Colombia +57
🇰🇲 Comoros +269
🇨🇩 Congo-Kinshasa +243
🇨🇬 Congo-Brazzaville +242
🇨🇰 Cook Islands +682
🇨🇷 Costa Rica +506
🇭🇷 Croatia +385
🇨🇺 Cuba +53
🇨🇼 Curaçao +599
🇨🇾 Cyprus +357
🇨🇿 Czechia +420
🇩🇰 Denmark +45
🇩🇯 Djibouti +253
🇩🇲 Dominica +1767
🇩🇴 Dominican Republic +1
🇪🇨 Ecuador +593
🇪🇬 Egypt +20
🇸🇻 El Salvador +503
🇬🇶 Equatorial Guinea +240
🇪🇷 Eritrea +291
🇪🇪 Estonia +372
🇪🇹 Ethiopia +251
🇫🇰 Falkland Islands +500
🇫🇴 Faroe Islands +298
🇫🇯 Fiji +679
🇫🇮 Finland +358
🇫🇷 France +33
🇬🇫 French Guiana +594
🇵🇫 French Polynesia +689
🇬🇦 Gabon +241
🇬🇲 Gambia +220
🇬🇪 Georgia +995
🇩🇪 Germany +49
🇬🇭 Ghana +233
🇬🇮 Gibraltar +350
🇬🇷 Greece +30
🇬🇱 Greenland +299
🇬🇩 Grenada +1473
🇬🇵 Guadeloupe +590
🇬🇺 Guam +1671
🇬🇹 Guatemala +502
🇬🇬 Guernsey +44
🇬🇳 Guinea +224
🇬🇼 Guinea-Bissau +245
🇬🇾 Guyana +592
🇭🇹 Haiti +509
🇭🇳 Honduras +504
🇭🇰 Hong Kong +852
🇭🇺 Hungary +36
🇮🇸 Iceland +354
🇮🇳 India +91
🇮🇩 Indonesia +62
🇮🇷 Iran +98
🇮🇶 Iraq +964
🇮🇪 Ireland +353
🇮🇲 Isle of Man +44
🇮🇱 Israel +972
🇮🇹 Italy +39
🇨🇮 Ivory Coast +225
🇯🇲 Jamaica +1
🇯🇵 Japan +81
🇯🇪 Jersey +44
🇯🇴 Jordan +962
🇰🇿 Kazakhstan +7
🇰🇪 Kenya +254
🇰🇮 Kiribati +686
🇽🇰 Kosovo +383
🇰🇼 Kuwait +965
🇰🇬 Kyrgyzstan +996
🇱🇦 Laos +856
🇱🇻 Latvia +371
🇱🇧 Lebanon +961
🇱🇸 Lesotho +266
🇱🇷 Liberia +231
🇱🇾 Libya +218
🇱🇮 Liechtenstein +423
🇱🇹 Lithuania +370
🇱🇺 Luxembourg +352
🇲🇴 Macau +853
🇲🇬 Madagascar +261
🇲🇼 Malawi +265
🇲🇾 Malaysia +60
🇲🇻 Maldives +960
🇲🇱 Mali +223
🇲🇹 Malta +356
🇲🇭 Marshall Islands +692
🇲🇶 Martinique +596
🇲🇷 Mauritania +222
🇲🇺 Mauritius +230
🇾🇹 Mayotte +262
🇲🇽 Mexico +52
🇫🇲 Micronesia +691
🇲🇩 Moldova +373
🇲🇨 Monaco +377
🇲🇳 Mongolia +976
🇲🇪 Montenegro +382
🇲🇸 Montserrat +1664
🇲🇦 Morocco +212
🇲🇿 Mozambique +258
🇲🇲 Myanmar +95
🇳🇦 Namibia +264
🇳🇷 Nauru +674
🇳🇵 Nepal +977
🇳🇱 Netherlands +31
🇳🇨 New Caledonia +687
🇳🇿 New Zealand +64
🇳🇮 Nicaragua +505
🇳🇪 Niger +227
🇳🇬 Nigeria +234
🇳🇺 Niue +683
🇳🇫 Norfolk Island +6723
🇰🇵 North Korea +850
🇲🇰 North Macedonia +389
🇲🇵 Northern Mariana Islands +1670
🇳🇴 Norway +47
🇴🇲 Oman +968
🇵🇰 Pakistan +92
🇵🇼 Palau +680
🇵🇦 Panama +507
🇵🇬 Papua New Guinea +675
🇵🇾 Paraguay +595
🇵🇪 Peru +51
🇵🇭 Philippines +63
🇵🇱 Poland +48
🇵🇹 Portugal +351
🇵🇷 Puerto Rico +1
🇶🇦 Qatar +974
🇫🇷 Réunion +262
🇷🇴 Romania +40
🇷🇺 Russia +7
🇷🇼 Rwanda +250
🇧🇱 Saint-Barthélemy +590
🇸🇭 Saint Helena +290
🇰🇳 Saint Kitts & Nevis +1869
🇱🇨 Saint Lucia +1758
🇫🇷 Saint Martin +590
🇵🇲 Saint Pierre & Miquelon +508
🇻🇨 Saint Vincent & Grenadines +1784
🇼🇸 Samoa +685
🇸🇲 San Marino +378
🇸🇹 São Tomé & Príncipe +239
🇸🇦 Saudi Arabia +966
🇸🇳 Senegal +221
🇷🇸 Serbia +381
🇸🇨 Seychelles +248
🇸🇱 Sierra Leone +232
🇸🇬 Singapore +65
🇸🇽 Sint Maarten +1721
🇸🇰 Slovakia +421
🇸🇮 Slovenia +386
🇸🇧 Solomon Islands +677
🇸🇴 Somalia +252
🇿🇦 South Africa +27
🇰🇷 South Korea +82
🇸🇸 South Sudan +211
🇪🇸 Spain +34
🇱🇰 Sri Lanka +94
🇸🇩 Sudan +249
🇸🇷 Suriname +597
🇳🇴 Svalbard & Jan Mayen +47
🇸🇿 Swaziland +268
🇸🇪 Sweden +46
🇨🇭 Switzerland +41
🇸🇾 Syria +963
🇹🇼 Taiwan +886
🇹🇯 Tajikistan +992
🇹🇿 Tanzania +255
🇹🇭 Thailand +66
🇹🇱 Timor-Leste +670
🇹🇬 Togo +228
🇹🇰 Tokelau +690
🇹🇴 Tonga +676
🇹🇹 Trinidad & Tobago +1868
🇹🇳 Tunisia +216
🇹🇷 Turkey +90
🇹🇲 Turkmenistan +993
🇹🇨 Turks & Caicos Islands +1649
🇹🇻 Tuvalu +688
🇻🇮 U.S. Virgin Islands +1340
🇺🇬 Uganda +256
🇺🇦 Ukraine +380
🇦🇪 United Arab Emirates +971
🇬🇧 United Kingdom +44
🇺🇸 United States +1
🇺🇾 Uruguay +598
🇺🇿 Uzbekistan +998
🇻🇺 Vanuatu +678
🇻🇦 Vatican City +39
🇻🇪 Venezuela +58
🇻🇳 Vietnam +84
🇼🇫 Wallis & Futuna +681
🇪🇭 Western Sahara +212
🇾🇪 Yemen +967
🇿🇲 Zambia +260
🇿🇼 Zimbabwe +263
Answer
Maybe,
\S\s([^\r\n]*?)\s*\+[0-9]+$
might return the country names in the capturing group $1.
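As a rough sketch of how that first expression could be applied to a single dropdown label, with its backslashes written out, something like the following might do. The helper name extractCountry and the sample array are made up for illustration and are not part of the question; the name ends up in capture group 1.

```javascript
// First pattern, piece by piece:
//   \S\s          last code unit of the flag emoji, then the separating space
//   ([^\r\n]*?)   the country name, captured lazily as group 1
//   \s*\+[0-9]+$  optional whitespace, a literal "+", the dial code, end of line
const countryPattern = /\S\s([^\r\n]*?)\s*\+[0-9]+$/;

// Hypothetical helper: returns the country name, or null when the label
// does not look like "<flag> <name> +<digits>".
function extractCountry(line) {
  const match = countryPattern.exec(line);
  return match ? match[1] : null;
}

// A few labels taken from the list in the question.
const samples = [
  "🇦🇫 Afghanistan +93",
  "🇦🇽 Åland Islands +358",
  "🇦🇬 Antigua & Barbuda +1268",
  "🇻🇮 U.S. Virgin Islands +1340",
];

for (const line of samples) {
  console.log(extractCountry(line));
}
```

No u flag is used here, so \S only consumes the last UTF-16 code unit of the flag. That is enough, because the flag itself is never captured.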
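Using lookaround, we can likely write some expression similar to:
\S[A-Za-zéã.].*(?=\s\+[0-9])
Here [A-Za-zéã.] matches the second character of the name and the \S before it matches the first one, so the match only starts at the name and the emojis are bypassed. The lookahead (?=\s\+[0-9]) then stops the match right before the space and the dial code.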
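A lookahead, unlike a lookbehind, has long been supported by JavaScript engines, so the second expression can be tried the same way. This is only a sketch: extractCountryLookahead is a hypothetical name, and the character class only covers the second characters that occur in the list from the question.

```javascript
// Lookahead variant with its backslashes written out:
//   \S              first character of the name
//   [A-Za-zéã.]     second character of the name (letters, é, ã or a dot)
//   .*              rest of the name
//   (?=\s\+[0-9])   stop right before " +<digit>", without consuming it
const lookaheadPattern = /\S[A-Za-zéã.].*(?=\s\+[0-9])/;

// Hypothetical helper: the whole match (index 0) is the country name.
function extractCountryLookahead(line) {
  const match = line.match(lookaheadPattern);
  return match ? match[0] : null;
}

// Labels from the question that exercise the non-ASCII and dotted cases.
const tricky = [
  "🇦🇽 Åland Islands +358",
  "🇨🇼 Curaçao +599",
  "🇸🇹 São Tomé & Príncipe +239",
  "🇻🇮 U.S. Virgin Islands +1340",
];

for (const line of tricky) {
  console.log(extractCountryLookahead(line));
}
```

One caveat with this variant: a name whose second character falls outside the class (an accented letter other than é or ã, say) would lose its first letter, so the class would need extending for other data.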
Demo 2
const regex = /\S\s([^\r\n]*?)\s*\+[0-9]+$/gm;
// One dropdown entry per line, so that $ (with the m flag) matches after each dial code
const str = `🇦🇫 Afghanistan +93
🇦🇽 Åland Islands +358
🇦🇱 Albania +355
🇩🇿 Algeria +213
🇦🇸 American Samoa +1684
🇦🇩 Andorra +376
🇦🇴 Angola +244
🇦🇮 Anguilla +1264`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }

    // The result can be accessed through the `m`-variable.
    // Group 0 is the whole match, group 1 is the country name.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}
If you wish to simplify, modify, or explore the expression, it is explained in the top-right panel of regex101.com. If you’d like, you can also watch there how it would match against some sample inputs.