Complete Guide to URL Encoding and Percent-Encoding
What is URL encoding and why does it exist?
URLs can only safely contain a subset of ASCII characters. Characters outside that set (spaces, non-ASCII letters, emoji) and reserved characters used as literal data (such as &, ?, or #) must be percent-encoded so they can be transmitted without being misinterpreted. Percent-encoding replaces each such byte with a % sign followed by two hexadecimal digits representing the byte's value. A space (byte 0x20) becomes %20. An ampersand (byte 0x26) becomes %26. The scheme is defined in RFC 3986 and is understood by browsers, servers, and HTTP libraries. Without percent-encoding, a space could be read as the end of the URL, and a literal & inside a value would be treated as a query parameter separator.
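To make the byte-level rule concrete, here is a minimal sketch of percent-encoding written by hand. The helper name percentEncode is hypothetical and exists only for illustration; in real code the built-in encodeURIComponent is the better choice.

```javascript
// Hypothetical helper (not a built-in): percent-encode every byte that is
// not an RFC 3986 unreserved character (letters, digits, - . _ ~).
function percentEncode(text) {
  const unreserved = /[A-Za-z0-9\-._~]/;
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes of the input
  let out = "";
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && unreserved.test(ch)) {
      out += ch; // safe ASCII character, keep as-is
    } else {
      // % followed by two uppercase hexadecimal digits
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode("a b&c")); // a%20b%26c
```

The space (0x20) and ampersand (0x26) are replaced byte by byte, while the letters pass through unchanged.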
URL component encoding vs full URL encoding
JavaScript provides two encoding functions with very different behaviors. encodeURIComponent encodes everything except letters, digits, and the characters - _ . ~ ! * ' ( ). In particular, it encodes /, ?, =, &, #, and :, all of which have structural meaning in a URL. Use it when encoding a single value to embed inside a URL. encodeURI (full URL encoding) leaves those structural characters untouched, so it can be applied to a complete URL without breaking it. The rule of thumb: use encodeURIComponent for values (query parameter values, path segments), and use encodeURI or the Full URL mode of this tool for complete URLs.
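The difference is easiest to see side by side. The following sketch uses made-up example values and example.com as a placeholder host.

```javascript
// A single value that happens to contain structural characters.
const value = "rock & roll/jazz?";
const encodedValue = encodeURIComponent(value);
console.log(encodedValue); // rock%20%26%20roll%2Fjazz%3F

// A complete URL: encodeURI keeps : / ? = & # intact and only fixes the spaces.
const url = "https://example.com/search results?q=a b&lang=en#top";
const encodedUrl = encodeURI(url);
console.log(encodedUrl); // https://example.com/search%20results?q=a%20b&lang=en#top

// Applying encodeURIComponent to a whole URL destroys its structure.
const broken = encodeURIComponent("https://example.com/a?b=c");
console.log(broken); // https%3A%2F%2Fexample.com%2Fa%3Fb%3Dc

// Typical usage: encode only the value, then concatenate it into the URL.
const built = "https://example.com/search?q=" + encodeURIComponent(value);
console.log(built); // https://example.com/search?q=rock%20%26%20roll%2Fjazz%3F
```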
The query string format: parsing and building
A query string is the portion of a URL after the ? character. It contains key=value pairs separated by & characters. Both keys and values are individually percent-encoded. To parse a query string, split on & first, then split each pair on the first = to get the key and value, then decode each separately with decodeURIComponent. To build a query string, encode each key and value with encodeURIComponent, join them with =, then join the pairs with &. Never encode the & or = separators themselves. Using + for a space is a convention specific to application/x-www-form-urlencoded data (HTML form submissions). decodeURIComponent does not turn + into a space, so replace each + with %20 before decoding form-encoded input. When in doubt, emit %20.
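The parse and build steps described above can be sketched as two small functions. The names buildQuery and parseQuery are hypothetical, and this version assumes %20 for spaces and keeps only the last value when a key repeats.

```javascript
// Hypothetical helper: build a query string from a plain object.
function buildQuery(params) {
  return Object.entries(params)
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&"); // separators are added after encoding, never encoded themselves
}

// Hypothetical helper: parse a query string into a plain object.
function parseQuery(query) {
  const result = {};
  const text = query.startsWith("?") ? query.slice(1) : query;
  for (const pair of text.split("&")) {
    if (pair === "") continue; // skip empty pairs such as a trailing &
    const i = pair.indexOf("="); // split on the FIRST = only
    const rawKey = i === -1 ? pair : pair.slice(0, i);
    const rawValue = i === -1 ? "" : pair.slice(i + 1);
    result[decodeURIComponent(rawKey)] = decodeURIComponent(rawValue);
  }
  return result;
}

const query = buildQuery({ q: "a&b=c", "full name": "Ada Lovelace" });
console.log(query); // q=a%26b%3Dc&full%20name=Ada%20Lovelace
console.log(parseQuery("?" + query)); // { q: 'a&b=c', 'full name': 'Ada Lovelace' }
```

In browsers and Node.js, the built-in URLSearchParams class covers the same ground, but it follows the form-encoding convention and serializes spaces as +.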
Debugging percent-encoded URLs
Percent-encoded URLs appear frequently in server logs, browser network panels, redirect chains, and API responses. When debugging, paste the raw URL into the decode panel of this tool to get a readable version. Common sequences you'll encounter: %20 (or + in form data) is a space, %2F is /, %3D is =, %26 is &, %3A is :, and %40 is @. JWTs embedded in URLs use Base64url encoding rather than percent-encoding, so use the Base64 tool to decode those. Redirect URLs often contain double-encoded query strings: if your decoded output still contains %XX sequences, decode again. Malformed sequences like %GG or a bare % make decodeURIComponent throw a URIError and must be fixed before the URL can be decoded.
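For debugging outside the tool, a minimal sketch along these lines handles both malformed input and double-encoded strings. The helpers tryDecode and decodeFully are hypothetical names, and the cap of five passes is an arbitrary assumption. Repeated decoding is meant for inspection only, not for input that later feeds security checks.

```javascript
// Hypothetical helper: decode once, returning null instead of throwing
// when the input contains malformed percent-encoding.
function tryDecode(text) {
  try {
    return decodeURIComponent(text);
  } catch (err) {
    if (err instanceof URIError) return null;
    throw err; // anything else is unexpected
  }
}

// Hypothetical helper: keep decoding until the text stops changing,
// a pass fails, or maxPasses is reached. For debugging only.
function decodeFully(text, maxPasses = 5) {
  let current = text;
  for (let pass = 0; pass < maxPasses; pass++) {
    const next = tryDecode(current);
    if (next === null || next === current) break;
    current = next;
  }
  return current;
}

console.log(tryDecode("name%3DAda%20Lovelace")); // name=Ada Lovelace
console.log(tryDecode("%GG"));                   // null
console.log(tryDecode("100%"));                  // null
console.log(decodeFully("a%2520b"));             // a b
```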
URL encoding in common programming languages
JavaScript: encodeURIComponent(value) / decodeURIComponent(encoded). Python: from urllib.parse import quote, unquote, where quote(value, safe='') encodes everything including /. Ruby: URI.encode_www_form_component(value) (uses + for spaces). PHP: urlencode($value) (uses + for spaces) or rawurlencode($value) (uses %20). Java: URLEncoder.encode(value, StandardCharsets.UTF_8) (uses + for spaces; this overload requires Java 10 or later). Go: url.QueryEscape(value) (uses + for spaces) or url.PathEscape(value) (uses %20). In all cases, encode individual values, not entire URLs, so that structural characters are not encoded along with the data.
Unicode, emoji, and non-ASCII characters in URLs
Non-ASCII characters in URLs are handled by first converting the character to its UTF-8 byte representation, then percent-encoding each byte. The emoji 🚀 has the Unicode code point U+1F680. Its UTF-8 encoding is the four bytes F0 9F 9A 80, so it becomes %F0%9F%9A%80 in a URL. Accented characters like é (U+00E9) are encoded as two UTF-8 bytes C3 A9 → %C3%A9. Modern browsers and encodeURIComponent both follow this convention. Internationalized Domain Names (IDNs) like café.com use Punycode encoding at the DNS level (xn--caf-dma.com) rather than percent-encoding.
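The byte values above can be checked directly. This sketch uses the standard TextEncoder and URL classes available in modern browsers and Node.js; the variable names are illustrative only.

```javascript
const rocket = "\u{1F680}"; // the rocket emoji, U+1F680

// Step 1: convert the character to its UTF-8 bytes (shown here as hex).
const hexBytes = Array.from(new TextEncoder().encode(rocket), (b) =>
  b.toString(16).toUpperCase().padStart(2, "0")
);
console.log(hexBytes.join(" ")); // F0 9F 9A 80

// Step 2: percent-encode each byte. encodeURIComponent does both steps.
const encodedRocket = encodeURIComponent(rocket);
console.log(encodedRocket); // %F0%9F%9A%80

const encodedAccent = encodeURIComponent("\u00E9"); // é, U+00E9
console.log(encodedAccent); // %C3%A9

// Decoding reverses the process.
console.log(decodeURIComponent("%F0%9F%9A%80") === rocket); // true

// Hostnames are different: the URL parser converts them to Punycode.
const host = new URL("https://caf\u00E9.com/").hostname;
console.log(host); // xn--caf-dma.com
```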
Security considerations with URL decoding
Decode URLs on the server before processing them, but be aware of double-encoding attacks. An attacker can encode a slash as %2F, or double-encode it as %252F so that it survives one decoding pass as %2F. If a later layer decodes again after routing and security checks have already run, the resulting slash can bypass path traversal protections. The OWASP recommendation is to normalize and decode URLs exactly once, at the entry point, before any security checks. Never trust user-supplied URLs without validation: a URL that appears safe when encoded may decode to a dangerous path or inject unexpected query parameters. Also watch for null byte injection: %00 decodes to a null byte, which can terminate strings in some server-side environments.
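One way to apply the decode-once rule is sketched below. The function decodePathOnce and its result shape are hypothetical, and the checks are illustrative rather than a complete defense; a real application should also rely on its framework's path normalization.

```javascript
// Hypothetical validator: decode a request path exactly once, then run
// the security checks on the decoded form.
function decodePathOnce(rawPath) {
  let decoded;
  try {
    decoded = decodeURIComponent(rawPath); // the single decoding pass
  } catch {
    return { ok: false, reason: "malformed percent-encoding" };
  }
  if (decoded.includes("\0")) {
    return { ok: false, reason: "null byte" };
  }
  // Leftover %XX sequences suggest the input was encoded more than once.
  if (/%[0-9A-Fa-f]{2}/.test(decoded)) {
    return { ok: false, reason: "possible double encoding" };
  }
  // Reject parent-directory segments, checking both slash styles.
  if (decoded.split(/[\/\\]/).includes("..")) {
    return { ok: false, reason: "path traversal" };
  }
  return { ok: true, path: decoded };
}

console.log(decodePathOnce("/files/report%20final.pdf"));
// { ok: true, path: '/files/report final.pdf' }
console.log(decodePathOnce("/files/..%2F..%2Fetc%2Fpasswd").reason);  // path traversal
console.log(decodePathOnce("/files/%252E%252E%252Fsecret").reason);   // possible double encoding
console.log(decodePathOnce("/files/a%00.png").reason);                // null byte
```

Rejecting any leftover %XX sequence is deliberately strict: it will also refuse legitimate paths that contain a literal percent sign followed by two hexadecimal digits, which is a trade-off this sketch accepts.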