
URL Encoding Explained: Percent-Encoding and Why It Matters

Why URLs can't contain spaces or special characters, what percent-encoding does, and when you need to encode or decode a URL.

BoxTool Editorial Updated May 27


You've seen URLs like https://example.com/search?q=hello%20world. That %20 is URL encoding (also called percent-encoding) for a space. But why do URLs need this? And when should you encode or decode URLs in your own work?

The Problem: URLs Have Reserved Characters

A URL is composed of components — scheme, host, path, query, fragment — each separated by specific characters:

https://example.com/path/to/page?key=value&other=data#section
^^^^^   ^^^^^^^^^^^  ^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^   ^^^^^^^
scheme  host         path         query               fragment

Characters like /, ?, &, =, and # have structural meaning. If any of these appear in your data (not as separators), the parser cannot tell data from structure and splits the URL in the wrong place.

For example: a search query for C++ vs C# contains + and #, both of which have special URL meanings (+ means space in query strings; # starts a fragment).
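To see the problem concretely, here is a small sketch using Python's standard urllib.parse module. The URL is the hypothetical search example from above, built by plain string concatenation with no encoding at all.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL built without any encoding
raw = "https://example.com/search?q=C++ vs C#"

parts = urlsplit(raw)
# The # is read as the start of an (empty) fragment, so it never reaches the query
print(parts.query)      # q=C++ vs C
print(parts.fragment)   # (empty string)

# In a form-style query string, each + is decoded as a space
params = parse_qs(parts.query)
print(params["q"][0])   # "C   vs C" (both plus signs became spaces)
```

The server ends up searching for something quite different from what the user typed, and no error is raised anywhere along the way.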

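URL encoding solves this by replacing each unsafe character with % followed by its two-digit hexadecimal byte value. For ASCII characters that is the ASCII code (a space is byte 0x20, so it becomes %20); non-ASCII characters are first converted to UTF-8, and each byte is encoded separately.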

Percent-Encoding Reference

Character   Encoded                        Why it needs encoding
Space       %20 (or + in query strings)    Invalid in URLs
+           %2B                            Means "space" in query strings
#           %23                            Starts fragment
%           %25                            Starts percent-encoding
&           %26                            Separates query params
=           %3D                            Separates key/value in query
?           %3F                            Starts query string
/           %2F                            Separates path segments
:           %3A                            Separates scheme from host
@           %40                            Separates user info from host
[, ]        %5B, %5D                       IPv6 delimiters

Safe Characters (Never Encoded)

RFC 3986 defines these as "unreserved" — they can appear in any URL component without encoding:

A-Z  a-z  0-9  -  _  .  ~

Everything else must be encoded when appearing in data values (not as structural separators).
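The rule is simple enough to write by hand. The following is a minimal sketch, not a replacement for a library function: percent_encode is a hypothetical helper name, and it assumes the text should be treated as UTF-8.

```python
from urllib.parse import quote

# The unreserved set from RFC 3986
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)

def percent_encode(text: str) -> str:
    """Percent-encode every byte that is not an unreserved ASCII character."""
    out = []
    for byte in text.encode("utf-8"):
        char = chr(byte)
        if char in UNRESERVED:
            out.append(char)              # safe: copy through unchanged
        else:
            out.append(f"%{byte:02X}")    # % plus two uppercase hex digits
    return "".join(out)

print(percent_encode("C++ vs C#"))   # C%2B%2B%20vs%20C%23
# Python's quote() with no extra safe characters gives the same result
print(percent_encode("C++ vs C#") == quote("C++ vs C#", safe=""))   # True
```

In real code, prefer the standard library call; the hand-written version is only here to show that nothing more than a byte-by-byte substitution is going on.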

Spaces: %20 vs. +

There are two common encodings for spaces:

  • %20 — strict percent-encoding. Used in paths and anywhere outside query strings. Part of RFC 3986.
  • + — used in query strings only, based on the older HTML form encoding spec (application/x-www-form-urlencoded). A + in a path is a literal plus sign.

This distinction catches many developers off guard:

/search?q=hello+world     → "hello world" (query string context)
/files/hello+world.pdf    → file literally named "hello+world.pdf"

When in doubt, use %20. It's correct in all URL contexts.
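Python's standard library exposes both conventions as separate functions, which makes the difference easy to check. This short sketch uses only urllib.parse; the sample strings are illustrative.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Encoding: quote() always uses %20, quote_plus() uses + (form style)
print(quote("hello world"))         # hello%20world
print(quote_plus("hello world"))    # hello+world

# Decoding: only the form-style decoder treats + as a space
print(unquote("hello+world"))       # hello+world
print(unquote_plus("hello+world"))  # hello world

# A literal plus survives either decoder once it is encoded as %2B
print(unquote_plus("1%2B1"))        # 1+1
```

Using the form-style decoder on a path, or the strict decoder on a form-encoded query, is the usual source of stray plus signs or missing spaces.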

Encoding in Different Languages

JavaScript (browser)

// For encoding a complete URL
encodeURI("https://example.com/path?q=hello world")
// → "https://example.com/path?q=hello%20world"

// For encoding a value within a URL
encodeURIComponent("C++ vs C#")
// → "C%2B%2B%20vs%20C%23"

// Decoding
decodeURIComponent("C%2B%2B%20vs%20C%23")
// → "C++ vs C#"

Key distinction:

  • encodeURI(): does NOT encode structural characters (:, /, ?, =, &, #), so it suits a complete URL
  • encodeURIComponent(): encodes those structural characters as well, leaving only letters, digits, and - _ . ~ ! * ' ( ) untouched

For encoding individual query parameter values, always use encodeURIComponent().

Python

from urllib.parse import quote, unquote, urlencode

# Encode a path segment
quote("hello world")           # "hello%20world"
quote("C++", safe="")          # "C%2B%2B"

# Encode query string
urlencode({"q": "C++ vs C#", "page": 1})
# "q=C%2B%2B+vs+C%23&page=1"  (uses + for spaces in query)

# Decode
unquote("hello%20world")       # "hello world"

Command line (curl)

# curl does not escape reserved characters in the URL you give it;
# use --data-urlencode to have it encode each value:
curl -G "https://api.example.com/search" \
     --data-urlencode "q=hello world" \
     --data-urlencode "tags=C++ programming"

Internationalized URLs: Punycode and Percent-Encoding

URLs are technically ASCII-only. Non-ASCII characters (like Korean, Chinese, Arabic, emoji) are handled in two ways:

Domain names: Converted to Punycode, a special ASCII encoding.

  • münchen.de → xn--mnchen-3ya.de

Path and query values: Converted to UTF-8 bytes, then percent-encoded.

  • /ko/검색?q=안녕 → /ko/%EA%B2%80%EC%83%89?q=%EC%95%88%EB%85%95

Modern browsers display the decoded form in the address bar, but the actual HTTP request uses the encoded form.
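Both conversions can be reproduced with Python's standard library. This is a sketch under one assumption worth stating: the built-in "idna" codec follows the older IDNA 2003 rules, so production code that handles arbitrary domains may need a dedicated IDNA library instead.

```python
from urllib.parse import quote

# Domain name: the built-in "idna" codec produces the Punycode form
host = "münchen.de".encode("idna").decode("ascii")
print(host)   # xn--mnchen-3ya.de

# Path and query: UTF-8 bytes, then percent-encoding
path = quote("/ko/검색")                 # "/" is left alone by default
query = "q=" + quote("안녕", safe="")
print(path + "?" + query)   # /ko/%EA%B2%80%EC%83%89?q=%EC%95%88%EB%85%95
```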

Common Real-World Scenarios

Sharing URLs in emails or messages

Paste a URL into an email, and the email client might break it at special characters. If your URL contains & or other reserved characters in the path (not as separators), encode them first.

OAuth and API authentication

Signed-request schemes such as OAuth 1.0 require query parameters to be percent-encoded in a specific, strict way before the signature is computed. If the client and the server encode even one character differently (for example + instead of %20 for a space), the signature check fails.
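The sketch below shows why the encoding has to match exactly. It is not an OAuth implementation: the secret, the parameter, and the sign helper are all hypothetical, and the message being signed is a toy "q=value" string rather than a real signature base string.

```python
import hashlib
import hmac
from urllib.parse import quote, quote_plus

# Hypothetical shared secret and parameter value, for illustration only
SECRET = b"demo-secret"
value = "hello world"

def sign(encoded_value: str) -> str:
    """HMAC-SHA256 over a tiny 'q=<value>' string (not a real OAuth base string)."""
    message = f"q={encoded_value}".encode("ascii")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()

strict = sign(quote(value, safe=""))   # signs q=hello%20world
form = sign(quote_plus(value))         # signs q=hello+world

print(strict == form)   # False: one differing character changes the whole signature
```

Both strings decode to the same value, but a signature is computed over the encoded bytes, so the two sides must agree on the encoding and not just on the meaning.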

Building URLs programmatically

Never concatenate user input directly into a URL string:

// ❌ Dangerous: breaks if userName contains /, ?, or #
const unsafeUrl = `https://api.example.com/user/${userName}`;

// ✅ Safe: reserved characters in userName are percent-encoded
const safeUrl = `https://api.example.com/user/${encodeURIComponent(userName)}`;
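The same idea in Python, as a minimal sketch: encode the path segment and the query string separately, each with the function meant for it. build_user_url is a hypothetical helper, and the base URL is the same illustrative API as above.

```python
from urllib.parse import quote, urlencode

BASE = "https://api.example.com/user"   # hypothetical API endpoint

def build_user_url(user_name: str, params: dict) -> str:
    """Encode the path segment and the query string separately."""
    segment = quote(user_name, safe="")   # safe="" so "/" is escaped too
    url = f"{BASE}/{segment}"
    if params:
        url += "?" + urlencode(params)    # form style: spaces become +
    return url

print(build_user_url("a/b?c", {"q": "C++ vs C#"}))
# https://api.example.com/user/a%2Fb%3Fc?q=C%2B%2B+vs+C%23
```

Passing safe="" matters for path segments: by default quote() leaves / unescaped, which would let a user name containing a slash add an extra path level.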

CSV exports with file paths

File names with spaces or symbols in download URLs need encoding:

/files/My Report (Final).pdf
→ /files/My%20Report%20(Final).pdf
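In Python the same file name can be encoded with quote(). Note one difference from the example above: quote() also escapes the parentheses. Either form is acceptable in a path, and both decode to the same name.

```python
from urllib.parse import quote, unquote

file_name = "My Report (Final).pdf"

encoded = quote(file_name, safe="")
print("/files/" + encoded)   # /files/My%20Report%20%28Final%29.pdf

# Both spellings decode to the same file name
print(unquote(encoded) == unquote("My%20Report%20(Final).pdf"))   # True
```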

Encode or decode URLs instantly: URL Encoder/Decoder →
