JavaScript: encodeURI()

Internationalization is no simple matter™

JS URL Encoding

Sometimes we want to pass a free-text, user-entered string in the URL, and that string may be in any language. At work, we used the built-in JS escape() function to make a legal UTF-8 URL. *WRONG*

Problems arose when we added the Italian language: the e-grave character (è) isn't translated to its UTF-8 representation ("%C3%A8") but to something else ("%E8"), which is its single-byte Latin-1 code point value. Some reading led me to the conclusion:

escape() may be deprecated, but either way it is surely BAD for URL encoding
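Here is a quick sketch of the difference, assuming a JS engine that still ships escape() (browsers and Node.js do). The variable names are mine, just for illustration:

```javascript
// escape() writes code units below 256 as one %XX byte (Latin-1 style)
// and anything above that as a non-standard %uXXXX sequence.
const legacy = escape("è");            // "%E8"
const legacyEuro = escape("€");        // "%u20AC"

// encodeURIComponent() percent-encodes the UTF-8 bytes of each character.
const utf8 = encodeURIComponent("è");  // "%C3%A8"

// A UTF-8 decoder rejects the escape() output, because the lone byte 0xE8
// is not a valid UTF-8 sequence.
let decodeFailed = false;
try {
  decodeURIComponent(legacy);
} catch (err) {
  decodeFailed = err instanceof URIError;
}

console.log(legacy, legacyEuro, utf8, decodeFailed);
```

So a server that expects UTF-8 percent-encoding will either choke on the escape() output or decode it to the wrong character.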

We should've used encodeURI() or encodeURIComponent(). The former is good for a whole URL, so it keeps URL-reserved characters such as "&" intact; the latter is good for a single URL component, so it also encodes the "&".
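A small sketch of the two side by side. The example.com URL and the q and lang parameter names are made up for illustration:

```javascript
// Free-text user input that happens to contain a reserved character ("&").
const userInput = "caffè & tè";

// encodeURI() is meant for a complete URL: it leaves ":", "/", "?", "&", "="
// alone and only encodes what cannot appear in a URL at all.
const wholeUrl = encodeURI("http://example.com/search?q=caffè&lang=it");
// "http://example.com/search?q=caff%C3%A8&lang=it"

// Used on a single value, encodeURI() leaves the "&" in place, so the
// value would be split into two parameters by the receiving side.
const leaky = encodeURI(userInput);
// "caff%C3%A8%20&%20t%C3%A8"

// encodeURIComponent() is meant for one component: it encodes "&" as well.
const safeValue = encodeURIComponent(userInput);
// "caff%C3%A8%20%26%20t%C3%A8"

// Build the URL by encoding each value separately, then joining.
const builtUrl = "http://example.com/search?q=" + safeValue + "&lang=it";

console.log(wholeUrl);
console.log(leaky);
console.log(builtUrl);
```

For user input, that means encodeURIComponent() on each value and plain string concatenation for the separators.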

A similar PHP issue

Similarly, PHP htmlentities() translates "è" to "&egrave;", while htmlspecialchars() keeps it in its UTF-8 form. The former says "convert ANYTHING that has an HTML equivalent", the latter says "convert only special chars".
