URL Encoder Decoder Guide: Percent Encoding, Special Characters & API Development

Published: April 2025 • 11 min read • Web Development

URL encoding (also called percent encoding) is one of the most fundamental concepts in web development, yet it's also one of the most common sources of bugs. A single unencoded space or ampersand can break a link, corrupt a form submission, or cause an API call to fail silently. Understanding how URL encoding works — and when to apply it — is essential for anyone building for the web.

This guide explains what URL encoding is from the ground up, walks through which characters need encoding and why, shows practical examples in multiple programming languages, and covers the specific encoding challenges that arise in API development.

What Is URL Encoding?

URL encoding is a mechanism for converting characters into a format that can be safely carried in a URL. The basic idea is simple: any character that a URL cannot carry literally is first converted to bytes (UTF-8 in modern practice), and each byte is replaced with a percent sign (%) followed by two hexadecimal digits representing that byte's value.

For example, a space character becomes %20, an ampersand becomes %26, and a Chinese character like "中" becomes %E4%B8%AD (in UTF-8 encoding). This transformation ensures that URLs only contain characters that are universally safe for transmission through web infrastructure — browsers, servers, proxies, and firewalls.

Why URLs Need Encoding

URLs were originally designed to use a limited subset of ASCII characters. The URL specification (RFC 3986) defines which characters are allowed and what they mean in different parts of a URL. Characters like ?, &, =, /, and # have special structural meaning in URLs — they separate query parameters, denote paths, and mark fragments.

When you need to include these characters as literal data (not structural elements), they must be encoded. Without encoding, the server can't distinguish between a structural & that separates parameters and a literal & that's part of a search query.

The core problem: URLs have a limited character vocabulary. Anything outside that vocabulary — spaces, non-ASCII characters, reserved characters used as data — must be translated into the safe subset using percent encoding.

Which Characters Need Encoding?

In practice, URL characters fall into three groups. The first two are defined by RFC 3986; the third covers everything outside ASCII:

Unreserved Characters (Never Encode)

These characters are always safe in URLs and should never be encoded:

A-Z  a-z  0-9  -  _  .  ~

That's 66 characters total. Everything else is a candidate for encoding, depending on context.
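The count above can be checked with a short Python sketch. It assumes the standard library's urllib.parse.quote (Python 3.7 or later, where ~ is treated as unreserved); the variable names are illustrative only.

```python
import string
from urllib.parse import quote

# The RFC 3986 unreserved set: letters, digits, and four punctuation marks.
UNRESERVED = string.ascii_letters + string.digits + "-_.~"

# 26 + 26 + 10 + 4 = 66 characters.
unreserved_count = len(UNRESERVED)

# With safe="" quote() encodes everything except unreserved characters,
# so this string should pass through untouched.
passthrough = quote(UNRESERVED, safe="")

print(unreserved_count, passthrough == UNRESERVED)
```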

Reserved Characters (Context-Dependent)

These characters have special meaning in URL structure. They must be encoded when used as literal data, but left unencoded when serving their structural purpose:

Character   URL Purpose                  Encoded Form
!           Sub-delimiter                %21
#           Fragment identifier          %23
$           Sub-delimiter                %24
&           Query parameter separator    %26
'           Sub-delimiter                %27
(           Sub-delimiter                %28
)           Sub-delimiter                %29
*           Sub-delimiter                %2A
+           Space (in query strings)     %2B
,           Sub-delimiter                %2C
/           Path separator               %2F
:           Port / scheme separator      %3A
;           Sub-delimiter                %3B
=           Query key-value separator    %3D
?           Query string start           %3F
@           Authority separator          %40
[           IPv6 literal                 %5B
]           IPv6 literal                 %5D
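
The encoded forms of the reserved characters can be reproduced with a small Python sketch. It assumes urllib.parse.quote with safe="" so that no reserved character is treated as structural; RESERVED and encoded_forms are illustrative names.

```python
from urllib.parse import quote

# The 18 reserved characters from RFC 3986 (gen-delims plus sub-delims).
RESERVED = ":/?#[]@!$&'()*+,;="

# safe="" tells quote() to encode every reserved character as literal data.
encoded_forms = {ch: quote(ch, safe="") for ch in RESERVED}

for ch, enc in encoded_forms.items():
    print(f"{ch}  ->  {enc}")
```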

Non-ASCII Characters (Always Encode)

Any character outside the ASCII range (code points above 127) must be encoded. This includes accented characters (é, ñ, ü), CJK characters (中, 日, 한), emojis (😀), and symbols (©, €, £). These are first converted to bytes using UTF-8 encoding, then each byte is percent-encoded.

For example, the Euro sign € (U+20AC) becomes %E2%82%AC in UTF-8 — three bytes, each percent-encoded.
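
The two-step process (UTF-8 bytes first, then one %XX per byte) can be sketched by hand in Python. percent_encode_char is a hypothetical helper written for illustration; in real code the standard library's quote() does the same job.

```python
from urllib.parse import quote

def percent_encode_char(ch: str) -> str:
    """Percent-encode one character via its UTF-8 bytes (illustrative helper)."""
    return "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))

euro = percent_encode_char("€")      # three UTF-8 bytes: E2 82 AC
e_acute = percent_encode_char("é")   # two UTF-8 bytes: C3 A9

# The standard library should agree with the manual version.
matches_stdlib = euro == quote("€", safe="")

print(euro, e_acute, matches_stdlib)
```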

URL Encoding in Practice

Query String Encoding

The most common use case for URL encoding is preparing query parameters. Consider a search for "web & API development":

Raw:    https://example.com/search?q=web & API development
Encoded: https://example.com/search?q=web%20%26%20API%20development

Without encoding, the & would be interpreted as a parameter separator, splitting the query into q=web and API development (a separate, malformed parameter). The space would also cause issues — while some servers accept + for spaces in query strings, %20 is universally safe.
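
To see the split happen, here is a minimal Python sketch that feeds both versions of the query string to the standard library's parse_qs. Other servers and frameworks may treat the malformed remainder differently, so take the exact result as one parser's behavior.

```python
from urllib.parse import parse_qs, quote

raw_query = "q=web & API development"
encoded_query = "q=" + quote("web & API development", safe="")

# The raw "&" is treated as a separator, so the value is cut short
# and the valueless remainder is dropped by this parser.
broken = parse_qs(raw_query)

# The encoded "&" (%26) survives and is decoded back into the value.
intact = parse_qs(encoded_query)

print(broken)
print(intact)
```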

Path Segment Encoding

Characters in the path portion of a URL are encoded differently from characters in query strings. Specifically, a / that is part of the data must be encoded as %2F within a path segment (an unencoded / separates path components), whereas + is a literal plus sign in paths, not a space.

Raw:    https://example.com/files/web & API guide.pdf
Encoded: https://example.com/files/web%20%26%20API%20guide.pdf
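
The path-versus-query difference can be sketched in Python. The example assumes urllib.parse, where unquote follows path rules and unquote_plus follows form/query rules; the segment value is made up for illustration.

```python
from urllib.parse import quote, unquote, unquote_plus

# A single path segment whose data contains "/", "+", and a space.
segment = "a/b+c d.pdf"
encoded_segment = quote(segment, safe="")   # safe="" so "/" becomes %2F

# Path decoding keeps "+" literal; form/query decoding turns it into a space.
as_path = unquote("1+1")
as_query = unquote_plus("1+1")

print(encoded_segment, repr(as_path), repr(as_query))
```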

Form Data Encoding

When HTML forms submit data with application/x-www-form-urlencoded, all form values are URL-encoded. This is why your browser automatically converts spaces to + and encodes special characters when you submit a form.
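
A form submission body can be approximated in Python with urllib.parse.urlencode, which uses the same + convention for spaces. The field names and values below are invented for the example.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields, including a value with reserved characters.
form_fields = {"name": "Ada Lovelace", "note": "a&b=c"}

# urlencode() produces an application/x-www-form-urlencoded body.
body = urlencode(form_fields)

# Parsing the body recovers the original values.
decoded = parse_qs(body)

print(body)
print(decoded)
```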

Encoding in Programming Languages

JavaScript

// Modern approach (recommended)
const encoded = encodeURIComponent('hello world & foo=bar');
// Result: "hello%20world%20%26%20foo%3Dbar"

// Full-URL encoding (leaves structural characters such as : / ? # & = intact)
const fullEncoded = encodeURI('https://example.com/path?q=hello world');
// Result: "https://example.com/path?q=hello%20world"

// Decoding
const decoded = decodeURIComponent('hello%20world%20%26%20foo%3Dbar');
// Result: "hello world & foo=bar"

Important: Use encodeURIComponent() for individual parameter values, not entire URLs. Use encodeURI() for full URLs; it leaves :/?# unencoded since they're structural. Mixing the two up is one of the most common URL encoding bugs.

Python

from urllib.parse import quote, quote_plus, unquote

# Encode (spaces as %20); "/" is left unencoded by default, pass safe='' to encode it too
quote('hello world & foo=bar')
# Result: 'hello%20world%20%26%20foo%3Dbar'

# Encode (spaces as +)
quote_plus('hello world & foo=bar')
# Result: 'hello+world+%26+foo%3Dbar'

# Decode
unquote('hello%20world%20%26%20foo%3Dbar')
# Result: 'hello world & foo=bar'

PHP

// Encode
$encoded = urlencode('hello world & foo=bar');
// Result: "hello+world+%26+foo%3Dbar"

// Raw encode (spaces as %20)
$raw = rawurlencode('hello world & foo=bar');
// Result: "hello%20world%20%26%20foo%3Dbar"

// Decode
$decoded = urldecode('hello+world+%26+foo%3Dbar');
// Result: "hello world & foo=bar"

URL Encoding in API Development

URL encoding is critical in API development because APIs frequently handle user-generated data that contains special characters. Here are the key scenarios where encoding matters:

Query Parameter Construction

When building API request URLs with user input, always encode parameter values:

const userInput = 'Tom & Jerry';

// WRONG: user input containing & breaks the URL
const unsafeUrl = `https://api.example.com/search?q=${userInput}`;

// RIGHT: encode the value
const safeUrl = `https://api.example.com/search?q=${encodeURIComponent(userInput)}`;

Without encoding, a search for "Tom & Jerry" would generate ?q=Tom & Jerry, which the API would parse as two parameters: q=Tom and Jerry.
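
The same pattern in Python might look like the sketch below. build_search_url is a hypothetical helper, and the endpoint is the example URL used above rather than a real API.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_search_url(base: str, user_input: str) -> str:
    """Hypothetical helper: attach user input as the q parameter."""
    return f"{base}?{urlencode({'q': user_input})}"

url = build_search_url("https://api.example.com/search", "Tom & Jerry")

# What a server-side parser would recover from the query string.
received = parse_qs(urlsplit(url).query)

print(url)
print(received)
```

Note that urlencode() writes spaces as +. If the receiving API expects %20, passing quote_via=quote (from urllib.parse) is one way to change that.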

Authentication Tokens in URLs

Some APIs pass authentication tokens in URL query parameters. These tokens often contain characters like +, /, and = (especially Base64-encoded tokens) that must be properly encoded:

const token = 'abc+def/ghi=';
const url = `https://api.example.com/data?token=${encodeURIComponent(token)}`;
// Result: ?token=abc%2Bdef%2Fghi%3D
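
A Python sketch of what goes wrong when the token is left unencoded, assuming the receiving side parses the query with form-style rules (as urllib.parse.parse_qs does). The token is the sample value from above, not a real credential.

```python
from urllib.parse import quote, parse_qs

token = "abc+def/ghi="   # sample value, not a real token

# Unencoded: a form-style parser reads "+" as a space and corrupts the token.
corrupted = parse_qs("token=" + token)["token"][0]

# Encoded: the token round-trips unchanged.
safe_query = "token=" + quote(token, safe="")
restored = parse_qs(safe_query)["token"][0]

print(repr(corrupted), safe_query, restored == token)
```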

Double Encoding Pitfall

One of the most frustrating bugs in API development is double encoding — encoding a value that's already been encoded. This happens when multiple layers of your application each encode the URL:

// First encoding: space → %20
const first = encodeURIComponent('hello world');
// Result: "hello%20world"

// Second encoding (BUG!): % → %25, 2 → 2, 0 → 0
const second = encodeURIComponent(first);
// Result: "hello%2520world"

// After decoding once, the server is left with the literal text
// "hello%20world" instead of "hello world"

The fix is simple: encode once, at the point where you construct the URL. Don't encode values that have already been encoded by a library or middleware.
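
The same effect is easy to reproduce in Python with urllib.parse. The sketch encodes a value twice and then decodes it once, which is what a typical server does.

```python
from urllib.parse import quote, unquote

once = quote("hello world", safe="")   # space becomes %20
twice = quote(once, safe="")           # the "%" itself becomes %25

# One round of decoding leaves the %20 behind as literal text.
server_sees = unquote(twice)

print(once, twice, server_sees)
```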

Path Parameters

REST APIs often include identifiers in the URL path. These path segments must also be encoded:

// Resource name with spaces and special characters
const name = 'project alpha v2.0';
const url = `https://api.example.com/projects/${encodeURIComponent(name)}`;
// Result: https://api.example.com/projects/project%20alpha%20v2.0
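
In Python, a comparable sketch needs safe="" so that a slash inside the identifier is not mistaken for a path separator. project_url is a hypothetical helper, and the second resource name is invented to show the slash case.

```python
from urllib.parse import quote

def project_url(base: str, name: str) -> str:
    """Hypothetical helper: place a resource name in a single path segment."""
    return f"{base}/projects/{quote(name, safe='')}"

plain = project_url("https://api.example.com", "project alpha v2.0")

# A "/" in the name must become %2F, or it would create an extra path level.
with_slash = project_url("https://api.example.com", "reports/2024 Q1")

print(plain)
print(with_slash)
```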

Handling Encoding in API Responses

When your API returns URLs as part of JSON responses, ensure those URLs are properly encoded before serialization. Many frameworks handle this automatically, but if you're constructing URLs manually in response payloads, apply encoding consistently.
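
As a rough illustration, the Python sketch below encodes a URL before placing it in a JSON payload. The file name and field name are assumptions made for the example; the point is only that JSON serialization does not percent-encode anything for you.

```python
import json
from urllib.parse import quote

file_name = "annual report 2024.pdf"   # hypothetical file name
download_url = "https://example.com/files/" + quote(file_name, safe="")

# json.dumps escapes for JSON only; it does not percent-encode the URL.
payload = json.dumps({"download_url": download_url})

# A client reading the payload gets the already-encoded URL back.
round_tripped = json.loads(payload)["download_url"]

print(payload)
```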

Common Encoding Mistakes

Frequently Asked Questions

What is the difference between URL encoding and Base64 encoding?

URL encoding (percent encoding) converts individual unsafe characters into %XX sequences while leaving safe characters readable. Base64 encoding converts the entire input into a different character set (A-Z, a-z, 0-9, +, /) using a 3-to-4 byte mapping. URL encoding preserves readability for most of the string; Base64 makes the entire string unreadable. They serve different purposes — URL encoding makes strings safe for URLs, while Base64 converts binary data into a text-safe format.
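
The contrast is visible on even a three-character input. This Python sketch uses the standard base64 and urllib.parse modules; the input string is arbitrary.

```python
import base64
from urllib.parse import quote

data = "a&b"

# Percent encoding: only the unsafe "&" changes, the rest stays readable.
percent_encoded = quote(data, safe="")

# Base64: every 3 input bytes become 4 output characters.
base64_encoded = base64.b64encode(data.encode("utf-8")).decode("ascii")

print(percent_encoded, base64_encoded)
```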

Should I use encodeURI or encodeURIComponent?

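Use encodeURIComponent() for individual parameter values or path segments. It encodes almost everything that could be unsafe, including /, ?, &, and = (it leaves only letters, digits, and - _ . ! ~ * ' ( ) untouched). Use encodeURI() when you have a complete URL and only need to encode the non-structural parts: it preserves ; , / ? : @ & = + $ # as well as the characters above, since those serve structural purposes in URLs. Note that encodeURI() does encode [ and ], which matters for URLs containing IPv6 literals.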

Why does my API receive %2520 instead of %20?

This is double encoding. Your value was encoded twice: first the space became %20, then the % in %20 was encoded to %25, producing %2520. Fix this by identifying where the extra encoding step occurs — usually in a middleware layer, a form handler, or manual concatenation — and remove the redundant encoding.

How do I encode emojis in URLs?

Emojis are encoded the same way as any non-ASCII character. The emoji 😀 (U+1F600) becomes %F0%9F%98%80 in UTF-8 percent encoding (four bytes, each percent-encoded). Modern browsers and standard encoding functions handle this for you: use encodeURIComponent('😀') in JavaScript or quote('😀') in Python.

Are URLs with encoded characters SEO-friendly?

Search engines handle URL-encoded characters fine and will index the decoded versions. However, for SEO best practices, prefer human-readable URLs with hyphens instead of encoded characters. Use example.com/products/red-shoes rather than example.com/products/red%20shoes. Most web frameworks provide "slug" functions that convert text into URL-safe, SEO-friendly paths.
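
A slug function can be as small as the Python sketch below. It is a minimal ASCII-only illustration, not any particular framework's implementation; real slug helpers usually also transliterate accented characters.

```python
import re

def slugify(text: str) -> str:
    """Minimal ASCII-only slug sketch; real framework helpers handle more cases."""
    lowered = text.lower()
    # Replace every run of non-alphanumeric characters with one hyphen.
    hyphenated = re.sub(r"[^a-z0-9]+", "-", lowered)
    # Drop hyphens left at either end.
    return hyphenated.strip("-")

slug = slugify("Red Shoes & Boots!")
print(slug)
```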

What characters must be encoded in a URL path vs query string?

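In a URL path, a / that is part of the data must be encoded since it separates path segments, but + is a literal character. In a query string, + represents a space (by the form-encoding convention), and / is generally safe as a literal character. In both contexts, # and % must be encoded when used as data; ? must also be encoded in a path, and & and = must be encoded inside query values. When in doubt, encoding all of these in both contexts is always safe.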

How do online URL encoder/decoder tools work?

Browser-based URL encoder/decoder tools typically run JavaScript in your browser to apply encodeURIComponent() and decodeURIComponent() to your input. When a tool works this way, the encoding and decoding happen entirely on your device and no data is sent to a server. They're useful for quick checks when you're debugging a URL or verifying that your application's encoding is correct.

Is URL encoding the same as HTML encoding?

No. URL encoding (percent encoding) makes characters safe for URLs using %XX sequences. HTML encoding makes characters safe for HTML markup using named entities (&amp;) or numeric entities (&#38;). They address different problems: one for web addresses, the other for HTML document structure. A fully web-safe application applies both in the appropriate contexts.
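
The Python sketch below applies each encoding in its own context while building a link. It assumes the standard html.escape and urllib.parse.quote functions; the title and URL are example values.

```python
from html import escape
from urllib.parse import quote

title = "Tom & Jerry"

url_part = quote(title, safe="")   # percent encoding, for the address
html_part = escape(title)          # entity encoding, for the markup

link = f'<a href="https://example.com/search?q={url_part}">{html_part}</a>'
print(link)
```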

URL encoding seems simple on the surface, but the details matter. Encode once, encode at the right layer, and always encode user-supplied values before inserting them into URLs. These three rules will prevent the vast majority of encoding-related bugs in web applications and APIs.
