
How do I decode Unicode-escaped JavaScript strings in Python?

I have this string:

V posledn\u00edch m\u011bs\u00edc\u00edch se bezpe\u010dnostn\u00ed situace v Libyi zna\u010dn\u011b zhor\u0161ila, o \u010dem\u017e sv\u011bd\u010d\u00ed i ned\u00e1vn\u00e9 n\u00e1hl\u00e9 opu\u0161t\u011bn\u00ed zem\u011b nejen \u010desk\u00fdmi diplomaty. Libyi hroz\u00ed nekontrolovan\u00fd rozpad a nekone\u010d

It should read “V posledních měsících se …”, so \u00ed is í and \u011b is ě.

Any idea how to decode this in Python? The string comes from JavaScript code that I am parsing in Python. I could write my own ad-hoc solution, since not many characters are escaped (Czech has only a dozen or so accented characters), but that seems ugly.

Answer

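Decode it using the 'unicode-escape' codec. If x is your string, x.decode('unicode-escape') returns the decoded text. That call works on a Python 2 byte string; in Python 3, str has no decode method, so encode to bytes first: x.encode('ascii').decode('unicode-escape').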
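For Python 3, the codec approach might look like the sketch below. The function names decode_js_escapes and decode_via_json are made up for illustration, and both versions assume the input is plain ASCII apart from the \uXXXX escapes. The second version is an alternative that is not part of the answer above: it treats the text as the body of a JSON string literal, which only works if the text contains no unescaped double quotes or control characters.

```python
import json


def decode_js_escapes(text: str) -> str:
    """Decode \\uXXXX escapes with the 'unicode_escape' codec.

    Assumption: `text` is ASCII-only apart from the escapes themselves;
    encode("ascii") raises UnicodeEncodeError otherwise.
    """
    return text.encode("ascii").decode("unicode_escape")


def decode_via_json(text: str) -> str:
    """Alternative: parse the text as the body of a JSON string literal.

    Assumption: `text` contains no unescaped double quotes or raw
    control characters, otherwise json.loads raises an error.
    """
    return json.loads('"' + text + '"')


# Raw string literal so the backslashes reach the functions unchanged.
escaped = r"V posledn\u00edch m\u011bs\u00edc\u00edch se"
print(decode_js_escapes(escaped))
print(decode_via_json(escaped))
```

Both calls should print the readable Czech text for this sample. Because the string originates from JavaScript, the JSON route may be the closer match to the source syntax whenever the input is a well-formed string literal.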