Are there any JavaScript equivalents of Python’s urllib.parse.quote() and urllib.parse.unquote()?
The closest I’ve come across are encodeURI()/encodeURIComponent() and escape() (and their corresponding decoding functions), but as far as I can tell they don’t encode/decode the same set of special characters.
Answer
OK, I think I’m going to go with a hybrid custom set of functions:
Encode: Use encodeURIComponent(), then put slashes back in.
Decode: Decode any %hex values found.
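As a minimal sketch of that idea, assuming only '/' needs to stay unescaped (simpleQuote and simpleUnquote are hypothetical names used for illustration, not part of any library):

```javascript
// Minimal sketch of the hybrid approach; the function names are hypothetical.
function simpleQuote(s) {
  // encodeURIComponent() escapes '/' as %2F; restore it to mirror
  // the default safe='/' of Python's quote().
  return encodeURIComponent(s).replace(/%2F/g, '/');
}

// Decoding needs no special handling: every %XX sequence is decoded as UTF-8.
var simpleUnquote = decodeURIComponent;

console.log(simpleQuote('/path with spaces/file.txt'));
// prints "/path%20with%20spaces/file.txt"
```

This covers only the default case; it does not accept a custom list of safe characters.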
Here’s a more complete variant of what I ended up using (it handles Unicode properly, too):
    function quoteUrl(url, safe) {
        if (typeof safe !== 'string') {
            safe = '/'; // Don't escape slashes by default
        }
        url = encodeURIComponent(url);
        // Unescape characters that were in the safe list
        var toUnencode = []; // Declared with var so it does not leak as a global
        for (var i = safe.length - 1; i >= 0; --i) {
            var encoded = encodeURIComponent(safe.charAt(i));
            if (encoded !== safe.charAt(i)) { // Ignore safe char if it wasn't escaped
                toUnencode.push(encoded);
            }
        }
        if (toUnencode.length > 0) { // An empty pattern would match at every position
            url = url.replace(new RegExp(toUnencode.join('|'), 'ig'), decodeURIComponent);
        }
        return url;
    }

    var unquoteUrl = decodeURIComponent; // Alias so the function names are symmetric
Note that if you don’t need “safe” characters when encoding ('/' by default in Python), then you can just use the built-in encodeURIComponent() and decodeURIComponent() functions directly.
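One caveat when using encodeURIComponent() directly: it leaves the characters ! ' ( ) * unescaped, whereas Python's quote() percent-encodes them. If the output has to match Python's for those characters too, a small wrapper along these lines might help (quoteStrict is a hypothetical name, and this sketch does not restore '/' or any other safe character):

```javascript
// Hypothetical helper: additionally escape the five characters that
// encodeURIComponent() leaves alone but Python's quote() percent-encodes.
function quoteStrict(s) {
  return encodeURIComponent(s).replace(/[!'()*]/g, function (c) {
    // Build the %XX form from the character code, in uppercase hex.
    return '%' + c.charCodeAt(0).toString(16).toUpperCase();
  });
}

console.log(quoteStrict("it's (a) test!*"));
// prints "it%27s%20%28a%29%20test%21%2A"
```

decodeURIComponent() decodes these extra escapes without any changes, so the decoding side stays the same.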
Also, if there are non-ASCII characters (i.e. characters with codepoint >= 128) in the string, then to maintain compatibility with JavaScript’s encodeURIComponent(), which always encodes as UTF-8, the Python 2 quote_url() would have to be:
    import urllib  # Python 2: urllib.quote() and unicode() do not exist in Python 3

    def quote_url(url, safe):
        """URL-encodes a string (either str (i.e. ASCII) or unicode);
        uses de-facto UTF-8 encoding to handle Unicode codepoints in given string.
        """
        return urllib.quote(unicode(url).encode('utf-8'), safe)
And unquote_url() would be:
    # Requires "import urllib" (Python 2)
    def unquote_url(url):
        """Decodes a URL that was encoded using quote_url.
        Returns a unicode instance.
        """
        return urllib.unquote(url).decode('utf-8')
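To check that both sides agree, the JavaScript side can be probed with a non-ASCII string. The sketch below uses U+00E9, whose UTF-8 bytes are 0xC3 0xA9; the Python functions above should produce and accept the same percent-encoded string. The sample value and variable names are made up for illustration:

```javascript
// Probe the JavaScript side with a non-ASCII character (U+00E9).
var original = 'caf\u00e9/menu';

// Encode as UTF-8 percent-escapes, then restore '/' as in Python's default.
var encoded = encodeURIComponent(original).replace(/%2F/g, '/');
console.log(encoded); // prints "caf%C3%A9/menu"

// Decoding interprets the %XX sequences as UTF-8 bytes again.
var decoded = decodeURIComponent(encoded);
console.log(decoded === original); // prints true
```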