Regex Tutorial for Beginners: A Complete Guide 2026

Learn regular expressions from zero. This guide covers every essential pattern, with clear examples you can use immediately.

By Risetop Team, 2026-04-09

Regular expressions (regex) look like gibberish to the uninitiated — a jumble of backslashes, brackets, and cryptic symbols. But once you learn the basics, regex becomes one of the most powerful tools in your programming toolkit. It lets you validate emails, extract data from logs, search through files, and transform text in ways that would take dozens of lines of code otherwise.

This tutorial starts from absolute zero and builds up to practical, real-world patterns. No prior regex knowledge needed.

What Is Regex?

A regular expression is a pattern that describes a set of strings. Think of it as a search query on steroids. Instead of searching for an exact string like "hello", you can search for "any word that starts with h and ends with o" using the pattern \bh\w*o\b. The \b markers pin the match to whole words; without them, h\w*o would also match inside longer words.

Regex is supported by virtually every programming language (JavaScript, Python, Java, Go, Rust, PHP, and more), most text editors (VS Code, Sublime Text), command-line tools (grep, sed, awk), and many online tools.

Basic Matching

The simplest regex is just plain text. The pattern hello matches the exact character sequence "hello" wherever it appears in the text, including inside a longer string such as "hello world". Everything starts from here.

Regex is case-sensitive by default. Hello and hello are different patterns. Most languages have a case-insensitive flag (usually i) to match regardless of case.
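Here is a minimal Python sketch of literal matching and the case-insensitive flag, using the standard re module. The sample strings are made up for illustration.

```python
import re

# A literal pattern finds its text anywhere in the input.
found = re.search(r"hello", "say hello to regex")
print(found.group())  # hello

# Matching is case-sensitive unless a flag says otherwise.
strict = re.search(r"hello", "Hello there")
relaxed = re.search(r"hello", "Hello there", re.IGNORECASE)
print(strict)           # None
print(relaxed.group())  # Hello
```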

Metacharacters: The Building Blocks

Metacharacters are special characters that have meaning in regex beyond their literal value. Here are the ones you'll use most:

Character  Meaning                                 Example
.          Any single character (except newline)   h.llo matches "hello", "hallo", "h3llo"
^          Start of string                         ^Hello matches "Hello" only at the start
$          End of string                           world$ matches "world" only at the end
*          Zero or more of the previous            ab*c matches "ac", "abc", "abbc"
+          One or more of the previous             ab+c matches "abc", "abbc" (not "ac")
?          Zero or one of the previous             colou?r matches "color" and "colour"
\          Escape special characters               \. matches a literal dot
|          OR operator                             cat|dog matches "cat" or "dog"
()         Grouping                                (ab)+ matches "ab", "abab", "ababab"
[]         Character class                         [aeiou] matches any vowel
{}         Quantifier (exact count)                a{3} matches "aaa"
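To try these metacharacters yourself, a small Python sketch like the one below can help. The helper name matches is hypothetical (it is not part of the re module); it uses re.fullmatch so that the whole input has to fit the pattern.

```python
import re

def matches(pattern, text):
    """Return True when the whole text fits the pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

print(matches(r"h.llo", "h3llo"))     # True
print(matches(r"ab*c", "ac"))         # True
print(matches(r"ab+c", "ac"))         # False
print(matches(r"colou?r", "colour"))  # True
print(matches(r"(ab)+", "ababab"))    # True
print(matches(r"a{3}", "aa"))         # False
print(matches(r"cat|dog", "dog"))     # True
print(matches(r"1\.5", "1x5"))        # False, the escaped dot is literal
```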

Character Classes

Character classes let you match one character from a set of options:

Pattern       Matches
[abc]         Any one of a, b, or c
[a-z]         Any lowercase letter (a through z)
[A-Z]         Any uppercase letter
[0-9]         Any digit
[a-zA-Z0-9]   Any letter or digit (alphanumeric)
[^abc]        Any character EXCEPT a, b, or c

There are also shorthand character classes that are more concise:

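Shorthand  Equivalent        Matches
\d         [0-9]             Any digit
\D         [^0-9]            Any non-digit
\w         [a-zA-Z0-9_]      Any word character
\W         [^a-zA-Z0-9_]     Any non-word character
\s         [ \t\n\r\f\v]     Any whitespace
\S         [^ \t\n\r\f\v]    Any non-whitespace

Note: the equivalents shown are the ASCII definitions. In Unicode-aware engines (Python 3 string patterns by default, for example), \d, \w, and \s also match non-ASCII digits, letters, and whitespace.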
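The following Python sketch runs a few of these classes against an invented sample string. It also shows one Python-specific detail: \d matches Unicode digits unless you pass re.ASCII.

```python
import re

text = "Room 42, Floor 7: call ext_915!"

numbers = re.findall(r"\d+", text)
capitals = re.findall(r"[A-Z]", text)
words = re.findall(r"\w+", text)
punctuation = re.findall(r"[^a-zA-Z0-9\s]", text)

print(numbers)      # ['42', '7', '915']
print(capitals)     # ['R', 'F']
print(words)        # ['Room', '42', 'Floor', '7', 'call', 'ext_915']
print(punctuation)  # [',', ':', '_', '!']

# In Python 3, \d is Unicode-aware by default; re.ASCII restricts it to 0-9.
arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE
unicode_digit = re.fullmatch(r"\d", arabic_three) is not None
ascii_digit = re.fullmatch(r"\d", arabic_three, re.ASCII) is not None
print(unicode_digit, ascii_digit)  # True False
```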

Quantifiers: How Many Times?

Quantifiers specify how many times a pattern should match:

Quantifier  Meaning          Example
*           Zero or more     a* matches "", "a", "aaa"
+           One or more      a+ matches "a", "aaa" (not "")
?           Zero or one      a? matches "" or "a"
{n}         Exactly n        a{3} matches "aaa"
{n,}        n or more        a{2,} matches "aa", "aaa", "aaaa"
{n,m}       Between n and m  a{2,4} matches "aa", "aaa", "aaaa"

💡 Greedy vs Lazy: By default, quantifiers are greedy: they match as much as possible. Add ? after a quantifier to make it lazy (match as little as possible). For example, <.*> matches the entire string <div>hello</div>, but <.*?> matches just <div>.
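A short Python sketch of the greedy and lazy behavior described above, using the same <div> string. The last line uses an invented input to show that a bounded quantifier like {2,4} is greedy too.

```python
import re

html = "<div>hello</div>"

greedy = re.search(r"<.*>", html).group()
lazy = re.search(r"<.*?>", html).group()
all_tags = re.findall(r"<.*?>", html)

print(greedy)    # <div>hello</div>
print(lazy)      # <div>
print(all_tags)  # ['<div>', '</div>']

# {2,4} takes up to four a's at once; the leftover single "a" does not match.
runs = re.findall(r"a{2,4}", "a aa aaaaa")
print(runs)  # ['aa', 'aaaa']
```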

Groups and Capturing

Parentheses create groups. Groups serve two purposes: applying quantifiers to multiple characters, and capturing matched text for later use.

// Capturing group
(\d{3})-(\d{3})-(\d{4})
// Matches "123-456-7890"
// Group 1: "123", Group 2: "456", Group 3: "7890"

// Non-capturing group
(?:abc)+
// Matches "abc", "abcabc" but doesn't capture

// Named group (most modern engines; Python's re module spells it (?P<name>...))
(?<area>\d{3})-(?<prefix>\d{3})-(?<line>\d{4})
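In Python, the groups above can be read back from the match object. This is a sketch using the standard re module; note that Python writes named groups as (?P<name>...), and the reformatting at the end is just one illustrative use of captured text.

```python
import re

phone = "123-456-7890"

# Numbered capturing groups
m = re.fullmatch(r"(\d{3})-(\d{3})-(\d{4})", phone)
print(m.groups())  # ('123', '456', '7890')

# Named groups use the (?P<name>...) spelling in Python
named = re.fullmatch(
    r"(?P<area>\d{3})-(?P<prefix>\d{3})-(?P<line>\d{4})", phone
)
print(named.group("area"))  # 123

# A non-capturing group repeats "abc" without storing anything
nc = re.fullmatch(r"(?:abc)+", "abcabc")
print(nc.groups())  # ()

# Captured groups can be reused in a replacement via \1, \2, \3
pretty = re.sub(r"(\d{3})-(\d{3})-(\d{4})", r"(\1) \2-\3", phone)
print(pretty)  # (123) 456-7890
```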

Anchors and Boundaries

Anchors don't match characters — they match positions in the string:

// Match "cat" as a whole word, not inside "catalog"
\bcat\b

// Match a string that is exactly a 5-digit number
^\d{5}$
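The sketch below tries both anchor patterns in Python on invented inputs. The helper is_five_digits is a hypothetical name. One detail worth knowing: in Python, $ also matches just before a trailing newline, so re.fullmatch is the stricter choice when validating a whole string.

```python
import re

# \b marks a word boundary, so "cat" inside "catalog" or "concat" is skipped.
whole_words = re.findall(r"\bcat\b", "cat catalog concat cat.")
print(whole_words)  # ['cat', 'cat']

def is_five_digits(value):
    """True when value is exactly five digits and nothing else (hypothetical helper)."""
    return re.fullmatch(r"\d{5}", value) is not None

print(is_five_digits("12345"))   # True
print(is_five_digits("123456"))  # False

# ^\d{5}$ with re.search accepts a trailing newline; fullmatch does not.
loose = re.search(r"^\d{5}$", "12345\n") is not None
strict = is_five_digits("12345\n")
print(loose, strict)  # True False
```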

Lookahead and Lookbehind

These are advanced patterns that match based on what comes before or after the current position, without including that text in the match:

// Positive lookahead: match "apple" only if followed by " pie"
apple(?= pie)

// Negative lookahead: match "apple" only if NOT followed by " pie"
apple(?! pie)

// Positive lookbehind: match "pie" only if preceded by "apple "
(?<=apple )pie

// Negative lookbehind: match "pie" only if NOT preceded by "apple "
(?<!apple )pie
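To see that lookarounds check context without consuming it, this Python sketch records where each of the four patterns matches in an invented sentence. Positions are compared against text.index so you can verify which occurrence was picked.

```python
import re

text = "apple pie, apple tart, cherry pie"

def starts(pattern):
    """Start offsets of every match of pattern in text (hypothetical helper)."""
    return [m.start() for m in re.finditer(pattern, text)]

followed_by_pie = starts(r"apple(?= pie)")
not_followed_by_pie = starts(r"apple(?! pie)")
pie_after_apple = starts(r"(?<=apple )pie")
pie_not_after_apple = starts(r"(?<!apple )pie")

print(followed_by_pie)      # [0]   the first "apple"
print(not_followed_by_pie)  # [11]  the "apple" in "apple tart"
print(pie_after_apple)      # [6]   the "pie" in "apple pie"
print(pie_not_after_apple)  # [30]  the "pie" in "cherry pie"

# The matched text itself never includes the lookaround part.
print(re.search(r"apple(?= pie)", text).group())  # apple
```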

Flags (Modifiers)

Flags change how the regex engine interprets the pattern:

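Flag  Effect
i     Case-insensitive matching
g     Global (find all matches, not just the first)
m     Multiline (^ and $ match line starts/ends)
s     Dotall (. matches newlines too)
u     Unicode mode (proper handling of Unicode)
x     Extended (allows whitespace and comments in pattern)

Not every engine supports every flag. The g flag belongs to JavaScript-style regexes; Python has no g flag and uses re.findall or re.finditer to get every match. The x flag is available in engines such as Python, Perl, and PCRE, but not in JavaScript.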
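In Python, flags are passed as constants from the re module rather than as letters after the pattern. The sketch below, on an invented three-line string, shows the i, m, and s behaviors; re.findall plays the role of the g flag.

```python
import re

text = "First line\nsecond LINE\nthird line"

# i: case-insensitive (findall already returns every match)
any_case = re.findall(r"line", text, re.IGNORECASE)
print(any_case)  # ['line', 'LINE', 'line']

# m: without it, ^ only matches at the very start of the string
first_only = re.findall(r"^\w+", text)
each_line = re.findall(r"^\w+", text, re.MULTILINE)
print(first_only)  # ['First']
print(each_line)   # ['First', 'second', 'third']

# s: without it, . stops at newlines
no_dotall = re.search(r"First.*third", text)
with_dotall = re.search(r"First.*third", text, re.DOTALL)
print(no_dotall is None)        # True
print(with_dotall is not None)  # True
```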

Practical Examples

Email Validation

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

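This matches most common email formats. It is not perfect (email validation is notoriously complex): it rejects some technically valid addresses and accepts some invalid ones, such as domains with consecutive dots. Treat it as a quick sanity check, not proof that an address exists.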
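A minimal Python sketch of using this pattern as a format check. The function name looks_like_email and the sample addresses are made up for illustration; the last call shows one of the pattern's known false positives.

```python
import re

EMAIL = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

def looks_like_email(value):
    """Format check only; it cannot tell whether the mailbox exists."""
    # fullmatch avoids accepting a trailing newline, which $ alone would allow
    return EMAIL.fullmatch(value) is not None

print(looks_like_email("jane.doe+news@example.co.uk"))  # True
print(looks_like_email("no-at-sign.example.com"))       # False
print(looks_like_email("user@localhost"))               # False, no dotted domain
print(looks_like_email("user@example..com"))            # True, a false positive
```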

URL Matching

https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(/\S*)?

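Matches http and https URLs with an optional path. It expects a dotted domain name as the host, so it does not cover localhost, bare IP addresses, or port numbers.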
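Here is a Python sketch that pulls URLs out of an invented sentence with this pattern. Because the pattern contains a capturing group, re.findall would return only the group (the path), so the sketch uses finditer and group(0) to get the full URL.

```python
import re

URL = re.compile(r"https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(/\S*)?")

text = (
    "Docs at https://example.com/guide and http://test.example.org, "
    "not ftp://files.example.com"
)

# group(0) is the whole match, regardless of capturing groups
urls = [m.group(0) for m in URL.finditer(text)]
print(urls)  # ['https://example.com/guide', 'http://test.example.org']

# findall returns the captured path only ('' when the path is absent)
paths_only = URL.findall(text)
print(paths_only)  # ['/guide', '']
```

If you prefer findall, changing the group to a non-capturing one, (?:/\S*)?, makes it return full matches.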

Phone Number (US)

\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}

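Matches formats like (123) 456-7890, 123-456-7890, 123.456.7890, and 1234567890. Because each parenthesis is optional on its own, it also accepts unbalanced input such as (123-456-7890.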
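A quick Python check of the four formats listed, plus two invented edge cases. The helper name is_us_phone is hypothetical.

```python
import re

PHONE = re.compile(r"\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}")

def is_us_phone(value):
    """True when the whole value fits the pattern (hypothetical helper)."""
    return PHONE.fullmatch(value) is not None

samples = ["(123) 456-7890", "123-456-7890", "123.456.7890", "1234567890"]
results = [is_us_phone(s) for s in samples]
print(results)  # [True, True, True, True]

print(is_us_phone("123-45-6789"))    # False, wrong digit grouping
print(is_us_phone("(123-456-7890"))  # True, the pattern does not pair parentheses
```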

Password Validation

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&]).{8,}$

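Requires at least 8 characters with at least one lowercase letter, one uppercase letter, one digit, and one special character from the set @$!%*?&. Each (?=...) is a lookahead that checks one requirement without consuming any text.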
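The sketch below applies the pattern in Python to a few invented passwords. The name is_strong is hypothetical. The last case shows that a special character outside the listed set does not count.

```python
import re

PASSWORD = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&]).{8,}$")

def is_strong(value):
    """True when value satisfies every lookahead and the length rule."""
    return PASSWORD.fullmatch(value) is not None

print(is_strong("Passw0rd!"))   # True
print(is_strong("password1!"))  # False, no uppercase letter
print(is_strong("Sh0rt!a"))     # False, only 7 characters
print(is_strong("Passw0rd#"))   # False, "#" is not in the allowed special set
```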

Extract Numbers from Text

\b\d+(?:\.\d{2})?\b

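Matches whole numbers and numbers with exactly two decimal places (useful for prices). A number with a different count of decimal places is not matched as a whole: in 12.345, the pattern picks out 12 and 345 separately.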
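A small Python sketch of extracting prices with this pattern from an invented sentence, followed by the three-decimal edge case.

```python
import re

NUMBER = re.compile(r"\b\d+(?:\.\d{2})?\b")

text = "Total: 3 items for 19.99, was 25.00"
found = NUMBER.findall(text)
print(found)  # ['3', '19.99', '25.00']

# Three decimal places do not fit, so the number is split at the dot.
split_case = NUMBER.findall("12.345")
print(split_case)  # ['12', '345']
```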

Regex in Programming Languages

JavaScript

// Creating a regex
const pattern = /\d{3}-\d{4}/g;

// Testing
// Note: with the g flag, test() advances pattern.lastIndex, so repeated
// calls on the same regex object can return different results.
pattern.test("Call 555-1234"); // true

// Matching
const str = "Call 555-1234 or 555-5678";
str.match(pattern); // ["555-1234", "555-5678"]

// Replacing
"Hello 123 World 456".replace(/\d+/g, "###"); // "Hello ### World ###"

Python

import re

# Searching
result = re.search(r'\d{3}-\d{4}', "Call 555-1234")
result.group()  # "555-1234"

# Finding all matches
re.findall(r'\d+', "Order 123, 456, 789")  # ["123", "456", "789"]

# Substitution
re.sub(r'\d+', '###', "Hello 123 World 456")  # "Hello ### World ###"

Tips for Writing Better Regex

  1. Start simple and build up. Write the pattern incrementally. Test each addition before making the next change.
  2. Use raw strings in Python. Prefix with r to avoid escape sequence issues: r"\d+" instead of "\\d+".
  3. Comment complex patterns. Where the engine supports it, use the x flag (re.VERBOSE in Python) to add comments and whitespace that make your regex readable. JavaScript has no x flag.
  4. Test with edge cases. Always test with empty strings, very long strings, and unexpected input.
  5. Use an online tester. Tools like Risetop's regex tester let you experiment with patterns and see matches highlighted in real-time.
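As an illustration of tips 3 and 4, here is the US phone pattern from earlier written with Python's re.VERBOSE flag and checked against a few edge cases. The inputs are invented; the pattern itself is the one from the Practical Examples section.

```python
import re

PHONE = re.compile(
    r"""
    \(?\d{3}\)?   # area code, optionally wrapped in parentheses
    [-.\s]?       # optional separator
    \d{3}         # prefix
    [-.\s]?       # optional separator
    \d{4}         # line number
    """,
    re.VERBOSE,
)

# Edge cases: a normal number, an empty string, and a very long input
normal = PHONE.fullmatch("123-456-7890") is not None
empty = PHONE.fullmatch("") is not None
too_long = PHONE.fullmatch("1" * 10_000) is not None

print(normal, empty, too_long)  # True False False
```

In verbose mode, whitespace in the pattern is ignored outside character classes, so a literal space has to be written as \s, [ ], or an escaped space.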

When NOT to Use Regex

Regex is powerful, but it's not the right tool for everything.

Conclusion

Regular expressions are one of the most useful skills you can develop as a developer. They're not as scary as they look — once you understand the core concepts (metacharacters, character classes, quantifiers, groups, and anchors), you can read and write most patterns with confidence.

Start with the patterns in this guide, practice with an online regex tester, and gradually build your pattern library. Before long, regex will feel like second nature — and you'll wonder how you ever worked without it.