Regex Tester: Master Regular Expressions with Real Examples

📅 April 13, 2026 ⏱ 10 min read ✍️ Risetop Team

Regular expressions look like line noise. But behind the cryptic syntax lies one of the most powerful text-processing tools available to developers. Nearly every major programming language supports regex, most text editors offer it for search-and-replace, and many data pipelines rely on it for validation and extraction.

This guide doesn't teach regex from first principles. Instead, it walks through 10 real-world case studies. Each one is a pattern you'll actually use in production, complete with the regex, an explanation, test cases, and common pitfalls.

CASE 1 — Validation

Email Address Validation

The most common regex use case. The challenge: email addresses have a complex spec (RFC 5322), but you need a practical pattern that catches errors without rejecting valid addresses.

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

How it works: The local part allows letters, digits, dots, underscores, percent signs, pluses, and hyphens. The domain part allows letters, digits, dots, and hyphens. The TLD must be at least 2 letters. The pattern does not check dot placement, so strings such as a..b@example.com still pass.

Matches: user@example.com, john.doe+tag@company.co.uk, admin@sub.domain.org

Rejects: user@ (no domain), @example.com (no local part), user@.com (no domain name)

Gotcha: This pattern rejects technically valid but unusual addresses like " "@example.org. For most applications, this trade-off is acceptable. Use an HTML5 <input type="email"> for client-side validation as a complement.
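
As a minimal Python sketch of the check described above: the function name is_plausible_email is hypothetical, and re.fullmatch is used so the entire string has to match. It only tests shape; it does not confirm the mailbox exists.

```python
import re

# Pattern copied from this case study.
EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

def is_plausible_email(address):
    """Return True if the address passes the practical (not RFC-complete) check."""
    # fullmatch requires the whole string to match the pattern.
    return EMAIL_RE.fullmatch(address) is not None

for sample in ["user@example.com", "john.doe+tag@company.co.uk", "user@", "@example.com", "user@.com"]:
    print(f"{sample!r}: {is_plausible_email(sample)}")
```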

CASE 2 — Extraction

Phone Number Extraction (International)

Extracting phone numbers from free text is harder than it looks. People write them in dozens of formats: with parentheses, dashes, dots, spaces, country codes, and extensions.

(?:\+?(\d{1,3}))?[-. (]*(\d{3})[-. )]*(\d{3})[-. ]*(\d{4})(?: *x(\d+))?

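How it works: An optional country code (with or without +), followed by a 3-digit area code, a 3-digit prefix, and a 4-digit line number in flexible formats. The extension is captured separately. Because the digit groups are fixed at 3-3-4, the pattern fits North American-style numbers; other national groupings need their own pattern.

Matches: +1 (555) 123-4567, 555.123.4567, 555-123-4567 x89

Does not match: +44 20 7946 0958 (the digits are grouped 2-4-4, not 3-3-4)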

Tip: For production use, consider Google's libphonenumber library, which handles every country's numbering plan correctly.
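
One way to pull the captured pieces out of free text in Python is sketched below. The helper name extract_phones and the sample sentence are made up for illustration. The sketch reads the capture groups rather than the full match, because the full match can include a leading separator such as a space.

```python
import re

# Pattern copied from this case study.
PHONE_RE = re.compile(
    r"(?:\+?(\d{1,3}))?[-. (]*(\d{3})[-. )]*(\d{3})[-. ]*(\d{4})(?: *x(\d+))?"
)

def extract_phones(text):
    """Return one dict per phone number found, built from the capture groups."""
    results = []
    for m in PHONE_RE.finditer(text):
        country, area, prefix, line, ext = m.groups()
        results.append({"country": country, "number": area + prefix + line, "ext": ext})
    return results

text = "Call +1 (555) 123-4567 or 555-123-4567 x89 after 5pm."
for phone in extract_phones(text):
    print(phone)
```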

CASE 3 — Extraction

URL Extraction from Text

Given a block of text, extract all URLs that start with http:// or https://. Bare domains with no protocol, such as example.com/page, are not matched by this pattern.

https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)

How it works: Matches http or https, optional www., then the domain and path. The character class for the path includes query parameters and fragments.

Matches: https://example.com, http://www.example.com/path?q=1, https://api.example.co.uk/v2/users?page=1&limit=20

Gotcha: This pattern may match trailing punctuation as part of the URL (e.g., the period at the end of a sentence). Post-process matches to strip trailing dots, commas, and parentheses.
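
The post-processing step mentioned in the gotcha might look like this in Python. The name extract_urls and the sample text are illustrative assumptions, and the set of characters stripped (dot, comma, closing parenthesis) is a judgment call rather than a rule.

```python
import re

# Pattern copied from this case study, split across two raw strings for readability.
URL_RE = re.compile(
    r"https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b"
    r"(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)"
)

def extract_urls(text):
    """Find URLs, then strip trailing sentence punctuation from each match."""
    return [m.group(0).rstrip(".,)") for m in URL_RE.finditer(text)]

text = "Docs live at https://example.com/guide. See also (http://www.example.com/path?q=1)."
print(extract_urls(text))
```

Stripping a trailing ) is lossy for the occasional URL that legitimately ends in one, so a stricter cleanup could first check whether the parentheses inside the match are balanced.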

CASE 4 — Validation

Date Format Validation (YYYY-MM-DD)

Validate ISO 8601 dates. The regex catches format errors; application logic should handle semantic errors (like February 30).

^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$

How it works: Year is exactly 4 digits. Month is 01-12. Day is 01-31. This accepts technically invalid dates like 2026-02-30, which your application should reject separately.

Matches: 2026-04-13, 1999-12-31, 2000-01-01

Rejects: 2026-13-01 (invalid month), 2026-4-1 (not zero-padded), 26-04-13 (2-digit year)
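
The split between format checking and semantic checking can be sketched in Python as follows. The function name parse_iso_date is hypothetical; the standard library's datetime.date does the calendar check, raising ValueError for dates like February 30.

```python
import re
from datetime import date

# Pattern copied from this case study.
DATE_RE = re.compile(r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")

def parse_iso_date(text):
    """Return a date for a well-formed, real calendar date, otherwise None."""
    if DATE_RE.fullmatch(text) is None:
        return None  # format error: wrong shape, padding, or range
    year, month, day = map(int, text.split("-"))
    try:
        return date(year, month, day)
    except ValueError:
        return None  # semantic error: the day does not exist in that month

print(parse_iso_date("2026-04-13"))
print(parse_iso_date("2026-02-30"))
```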

CASE 5 — Validation

Password Strength Check

Validate that a password meets complexity requirements: at least 8 characters, with uppercase, lowercase, digit, and special character.

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$

How it works: Each (?=.*[X]) is a lookahead that ensures at least one character of the specified type exists somewhere in the string. The final part matches the actual characters.

Matches: Passw0rd!, My$ecure1, Complexity&9

Rejects: password (no upper, digit, special), Passw0rd (no special), Pw1! (too short), Complexity#9 (# is not in the allowed special-character set)

Tip: For better UX, check each requirement separately and show users which ones they've met, rather than one monolithic pass/fail.
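
A rough Python sketch of the per-requirement approach from the tip is shown below. The names CHECKS, MIN_LENGTH, and unmet_requirements are invented for this example. Unlike the single pattern, this version does not restrict which other characters may appear in the password.

```python
import re

MIN_LENGTH = 8

# One small pattern per requirement, so each can be reported on its own.
CHECKS = {
    "a lowercase letter": re.compile(r"[a-z]"),
    "an uppercase letter": re.compile(r"[A-Z]"),
    "a digit": re.compile(r"\d"),
    "a special character": re.compile(r"[@$!%*?&]"),
}

def unmet_requirements(password):
    """Return human-readable labels for every requirement the password fails."""
    unmet = []
    if len(password) < MIN_LENGTH:
        unmet.append(f"at least {MIN_LENGTH} characters")
    unmet += [label for label, rx in CHECKS.items() if rx.search(password) is None]
    return unmet

print(unmet_requirements("password"))
print(unmet_requirements("Passw0rd!"))
```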

CASE 6 — Extraction

IP Address Extraction (IPv4)

Extract IPv4 addresses from logs, configuration files, or network data.

\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b

How it works: Each octet is 0-255. The pattern uses alternation to handle three ranges separately: 250-255 (25[0-5]), 200-249 (2[0-4]\d), and 0-199 ([01]?\d\d?).

Matches: 192.168.1.1, 10.0.0.255, 255.255.255.0

Rejects: 256.1.1.1 (octet > 255), 1.2.3 (only 3 octets)
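
Applied to a log line in Python, the extraction could look like this. The log text is invented for illustration. Because the pattern contains only non-capturing groups, re.findall returns the full matched addresses.

```python
import re

# Pattern copied from this case study.
IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b"
)

log_line = "Accepted connection from 192.168.1.1, forwarded for 10.0.0.255; ignored 256.1.1.1"

# With no capturing groups, findall yields each whole match as a string.
ips = IPV4_RE.findall(log_line)
print(ips)
```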

CASE 7 — Cleaning

Remove HTML Tags from Text

Strip all HTML tags from a string, leaving only the text content. Useful for creating plain-text versions of HTML content. On its own it is not a safe way to sanitize untrusted input.

<[^>]*>

How it works: Matches anything between < and >. Replace with empty string.

Input: <p>Hello <strong>world</strong>!</p>

Output: Hello world!

Warning: This doesn't handle <script> content correctly (the JS code between tags won't be removed) and fails on malformed HTML. For robust HTML stripping, use a DOM parser.
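
The difference between the regex and a parser-based approach can be sketched in Python. The names strip_tags_regex, strip_tags_parser, and TextOnly are hypothetical. The parser variant uses the standard library's html.parser, which is a streaming parser rather than a full DOM, and is shown only as one possible alternative.

```python
import re
from html.parser import HTMLParser

TAG_RE = re.compile(r"<[^>]*>")

def strip_tags_regex(html):
    """Remove anything that looks like a tag; script bodies are left behind."""
    return TAG_RE.sub("", html)

class TextOnly(HTMLParser):
    """Collects text nodes, skipping anything inside <script> or <style>."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)

def strip_tags_parser(html):
    parser = TextOnly()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)

page = "<p>Hi</p><script>alert(1)</script>"
print(strip_tags_regex(page))   # script body survives
print(strip_tags_parser(page))  # script body dropped
```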

CASE 8 — Parsing

Extract Log Timestamps

Extract ISO 8601 timestamps from server logs for analysis.

\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})

How it works: Matches the date, the T separator, the time, optional fractional seconds, and a required timezone designator (Z or a numeric offset, with or without a colon).

Matches: 2026-04-13T06:30:00Z, 2026-04-13T14:30:00.123+08:00, 2026-04-13T06:30:00-05:00
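
A small Python sketch of the extraction, run over a few made-up log lines (the log content is an assumption for illustration):

```python
import re

# Pattern copied from this case study.
TIMESTAMP_RE = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})"
)

log = """\
2026-04-13T06:30:00Z INFO service started
2026-04-13T14:30:00.123+08:00 WARN slow query
no timestamp on this line
2026-04-13T06:30:00-05:00 ERROR upstream timeout
"""

# Only non-capturing groups are used, so findall returns whole timestamps.
timestamps = TIMESTAMP_RE.findall(log)
print(timestamps)
```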

CASE 9 — Validation

Hex Color Code Validation

Validate CSS hex color codes in 3-digit or 6-digit format, with or without the # prefix.

^#?([0-9a-fA-F]{3}){1,2}$

How it works: Optional #, then either 3 hex digits (shorthand) or 6 hex digits (full). The ^ and $ anchors force the whole string to match. Without them, a search would still find 234 inside #1234, because a trailing \b only guards the end of the match.

Matches: #fff, #336699, ABCDEF

Rejects: #ggg (invalid hex), #1234 (neither 3 nor 6 digits)
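
For whole-string validation in Python, one option is re.fullmatch, sketched below. The function normalize_hex_color is hypothetical, and the grouping is slightly rearranged from the pattern in this case study so that a single group captures all of the hex digits.

```python
import re

# Same character classes as the case-study pattern; the outer group captures all digits.
HEX_COLOR_RE = re.compile(r"#?((?:[0-9a-fA-F]{3}){1,2})")

def normalize_hex_color(value):
    """Return '#rrggbb' in lowercase, or None if value is not a 3- or 6-digit hex color."""
    m = HEX_COLOR_RE.fullmatch(value)  # the whole string must match
    if m is None:
        return None
    digits = m.group(1).lower()
    if len(digits) == 3:
        # Expand shorthand: each digit is doubled, so 'fa0' becomes 'ffaa00'.
        digits = "".join(ch * 2 for ch in digits)
    return "#" + digits

for sample in ["#fff", "#336699", "ABCDEF", "#ggg", "#1234"]:
    print(sample, "->", normalize_hex_color(sample))
```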

CASE 10 — Replacement

Credit Card Number Masking

Replace all but the last 4 digits of a credit card number with asterisks, a common step toward meeting PCI DSS requirements for masking card numbers on display.

\b(\d{4})[- ]?(\d{4})[- ]?(\d{4})[- ]?(\d{4})\b

Replacement: ****-****-****-$4

Input: Card: 4111-1111-1111-1234

Output: Card: ****-****-****-1234

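How it works: Each group captures 4 digits with optional separators. The replacement references only the last group ($4) and hardcodes asterisks for the rest. The pattern covers 16-digit numbers only, so 15-digit cards such as American Express are not masked.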
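
In Python the same replacement can be sketched with re.sub. The function name mask_cards is made up for this example. Note that Python writes the group reference as \4 (or \g<4>) instead of the $4 form used in JavaScript and several other languages.

```python
import re

# Pattern copied from this case study.
CARD_RE = re.compile(r"\b(\d{4})[- ]?(\d{4})[- ]?(\d{4})[- ]?(\d{4})\b")

def mask_cards(text):
    """Replace every 16-digit card number with a masked form that keeps the last 4 digits."""
    # \4 refers to the fourth capture group, i.e. the final block of digits.
    return CARD_RE.sub(r"****-****-****-\4", text)

print(mask_cards("Card: 4111-1111-1111-1234"))
print(mask_cards("Card: 4111 1111 1111 1234"))
```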

Regex Pattern Library: Quick Reference

Pattern	Regex
Username (3-16 chars)	`^[a-zA-Z0-9_]{3,16}$`
Slug / URL-safe string	`^[a-z0-9]+(?:-[a-z0-9]+)*$`
Hexadecimal number	`^0x[0-9a-fA-F]+$`
Semantic version	`^\d+\.\d+\.\d+(?:-[\w.]+)?$`
UUID v4	`^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$`
Strong password	`^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$`
MAC address	`^([0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}$`
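
A few of the library patterns above could be collected into a small lookup in Python, roughly as follows. The dictionary LIBRARY, the helper matches, and the sample values are all illustrative assumptions.

```python
import re

# A subset of the quick-reference patterns, keyed by a short name.
LIBRARY = {
    "username": re.compile(r"^[a-zA-Z0-9_]{3,16}$"),
    "slug": re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$"),
    "semver": re.compile(r"^\d+\.\d+\.\d+(?:-[\w.]+)?$"),
    "mac": re.compile(r"^([0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}$"),
}

def matches(kind, value):
    """Return True if value fully matches the named library pattern."""
    return LIBRARY[kind].fullmatch(value) is not None

print(matches("slug", "regex-tester-guide"))
print(matches("semver", "1.4.0-beta.2"))
print(matches("mac", "00:1A:2b:3C:4d"))
```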

Debugging Tips: When Your Regex Doesn't Work

Test incrementally. Start with a minimal pattern and add complexity one piece at a time. If \d+ works but \d{3}-\d{4} doesn't, you know the problem lies in the digit counts or the separator, not in \d itself.
Use an online tester. Our Regex Tester highlights matches in real time and shows exactly which part of the string each token matches.
Watch out for greedy matching. .* matches as much as possible. Use .*? (lazy) to match as little as possible. Greedy quantifiers are one of the most common sources of unexpected regex behavior.
Escape special characters. Characters like . * + ? ^ $ { } [ ] \ | ( ) have special meaning in regex. To match them literally, escape with \.
Check your language's flavor. JavaScript, Python, Java, and PCRE all have slightly different regex features. Lookaheads, named groups, and Unicode support vary.
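
The greedy-versus-lazy and escaping tips above can be seen in a short Python sketch. The sample strings are invented; re.escape is the standard-library helper that backslash-escapes metacharacters so text matches literally.

```python
import re

html = "<b>one</b> and <b>two</b>"

# Greedy: .* runs to the last </b>, swallowing both elements in one match.
greedy = re.findall(r"<b>.*</b>", html)
# Lazy: .*? stops at the first </b>, giving one match per element.
lazy = re.findall(r"<b>.*?</b>", html)
print(greedy)
print(lazy)

# An unescaped dot matches any character, so "9.99" also matches "9x99".
unescaped_hit = re.search(r"9.99", "9x99") is not None

# re.escape makes $, ., ( and ) literal before the text is used as a pattern.
price = "$9.99 (sale)"
found = re.search(re.escape(price), "Now only $9.99 (sale) today") is not None
print(unescaped_hit, found)
```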

Frequently Asked Questions

What is a regular expression?

A regular expression (regex) is a sequence of characters that defines a search pattern. It's used to match, find, and replace text in strings. Most programming languages (JavaScript, Python, Java, C#, Go) have built-in regex support through their standard libraries.

Why does my regex work in one language but not another?

Regex engines differ between languages and implementations. JavaScript uses a different engine than Python (re module), Java (java.util.regex), or PCRE. Features like lookbehinds, named groups, Unicode categories, and backreference behavior vary. Always test your regex in the specific language you're targeting.

How do I test a regex pattern?

Use an online regex tester like Risetop's Regex Tester. Paste your regex pattern and test string, then see real-time match highlighting with group captures. This is much faster than writing code to test each iteration.

Should I use regex for HTML parsing?

Generally no. HTML is not a regular language: nested tags, comments, attributes, and CDATA sections make it too complex for regex to handle reliably. Use a proper HTML parser like Beautiful Soup (Python), DOMParser (JavaScript), or jsoup (Java) instead. Regex is fine for simple extraction from well-structured snippets.

What are regex capture groups?

Capture groups are parts of a regex enclosed in parentheses () that extract matched substrings for later use. You can reference them by number (\1, \2) in backreferences and replacement patterns, or by name if you define the group with (?<name>...); Python's re module spells this (?P<name>...). The exact replacement syntax varies by language. Non-capturing groups (?:...) group without capturing.
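
A brief Python sketch of numbered groups, named groups, and a backreference follows. The sample strings are invented, and the named-group syntax shown is the Python spelling.

```python
import re

# Numbered groups in a replacement: swap "Last, First" into "First Last".
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Doe, John")

# Named groups (Python spelling) read back by name.
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "Released 2026-04")
year, month = m.group("year"), m.group("month")

# Backreference inside the pattern: \1 must repeat whatever group 1 matched.
repeated = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

print(swapped, year, month, repeated)
```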