Regular expressions (regex) look like gibberish to the uninitiated — a jumble of backslashes, brackets, and cryptic symbols. But once you learn the basics, regex becomes one of the most powerful tools in your programming toolkit. It lets you validate emails, extract data from logs, search through files, and transform text in ways that would take dozens of lines of code otherwise.
This tutorial starts from absolute zero and builds up to practical, real-world patterns. No prior regex knowledge needed.
What Is Regex?
A regular expression is a pattern that describes a set of strings. Think of it as a search query on steroids. Instead of searching for an exact string like "hello", you can search for "any word that starts with h and ends with o" using the pattern \bh\w*o\b (the \b markers pin the match to whole words; without them, h\w*o would also match inside longer words).
Regex is supported by virtually every programming language (JavaScript, Python, Java, Go, Rust, PHP, and more), most text editors (VS Code, Sublime Text), command-line tools (grep, sed, awk), and many online tools.
Basic Matching
The simplest regex is just plain text. The pattern hello matches the exact character sequence "hello", wherever it appears in the text, and nothing else. Everything starts from here.
Regex is case-sensitive by default. Hello and hello are different patterns. Most languages have a case-insensitive flag (usually i) to match regardless of case.
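As a minimal sketch of literal matching and the case-insensitive flag, here is how this looks with Python's built-in re module. The sample text is made up for illustration.

```python
import re

text = "Hello world, hello regex"

# A plain-text pattern matches that exact sequence of characters.
exact = re.findall(r"hello", text)

# re.IGNORECASE is Python's spelling of the i flag.
any_case = re.findall(r"hello", text, flags=re.IGNORECASE)

print(exact)     # ['hello']
print(any_case)  # ['Hello', 'hello']
```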
Metacharacters: The Building Blocks
Metacharacters are special characters that have meaning in regex beyond their literal value. Here are the ones you'll use most:
| Character | Meaning | Example |
|---|---|---|
| . | Any single character (except newline) | h.llo matches "hello", "hallo", "h3llo" |
| ^ | Start of string | ^Hello matches "Hello" only at the start |
| $ | End of string | world$ matches "world" only at the end |
| * | Zero or more of the previous | ab*c matches "ac", "abc", "abbc" |
| + | One or more of the previous | ab+c matches "abc", "abbc" (not "ac") |
| ? | Zero or one of the previous | colou?r matches "color" and "colour" |
| \ | Escape special characters | \. matches a literal dot |
| \| | OR operator | cat\|dog matches "cat" or "dog" |
| () | Grouping | (ab)+ matches "ab", "abab", "ababab" |
| [] | Character class | [aeiou] matches any vowel |
| {} | Quantifier (exact count) | a{3} matches "aaa" |
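The examples in the table can be checked directly. The sketch below uses a small helper, matches, which is a hypothetical name introduced here for illustration; it keeps only the candidate strings that a pattern matches in full.

```python
import re

def matches(pattern: str, candidates: list[str]) -> list[str]:
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

dot = matches(r"h.llo", ["hello", "hallo", "h3llo", "hllo"])
star = matches(r"ab*c", ["ac", "abc", "abbc", "adc"])
plus = matches(r"ab+c", ["ac", "abc", "abbc"])
optional = matches(r"colou?r", ["color", "colour", "colouur"])
either = matches(r"cat|dog", ["cat", "dog", "bird"])
group = matches(r"(ab)+", ["ab", "abab", "aba"])

print(dot)       # ['hello', 'hallo', 'h3llo']
print(star)      # ['ac', 'abc', 'abbc']
print(plus)      # ['abc', 'abbc']
print(optional)  # ['color', 'colour']
print(either)    # ['cat', 'dog']
print(group)     # ['ab', 'abab']
```

re.fullmatch is used so that a pattern has to account for the whole string, which makes the "does not match" cases visible.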
Character Classes
Character classes let you match one character from a set of options:
| Pattern | Matches |
|---|---|
| [abc] | Any one of a, b, or c |
| [a-z] | Any lowercase letter (a through z) |
| [A-Z] | Any uppercase letter |
| [0-9] | Any digit |
| [a-zA-Z0-9] | Any letter or digit (alphanumeric) |
| [^abc] | Any character EXCEPT a, b, or c |
There are also shorthand character classes that are more concise:
| Shorthand | Equivalent | Matches |
|---|---|---|
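| \d | [0-9] | Any digit |
| \D | [^0-9] | Any non-digit |
| \w | [a-zA-Z0-9_] | Any word character |
| \W | [^a-zA-Z0-9_] | Any non-word character |
| \s | [ \t\n\r\f\v] | Any whitespace |
| \S | [^ \t\n\r\f\v] | Any non-whitespace |

These equivalents hold for ASCII matching. In Unicode-aware engines (Python 3 string patterns, for example), \d, \w, and \s also match non-ASCII digits, letters, and whitespace unless you ask for ASCII-only behavior.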
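A short sketch of character classes and their shorthands in Python follows. The sample strings are invented, and re.ASCII is passed so that the shorthands behave exactly like the bracket expressions in the table.

```python
import re

text = "Order #42: 3 apples, 12 pears"

# Explicit character classes.
digits = re.findall(r"[0-9]+", text)
letters = re.findall(r"[a-zA-Z]+", text)

# A negated class: runs of characters that are neither vowels nor whitespace.
consonant_runs = re.findall(r"[^aeiou\s]+", "banana split")

# Shorthand classes, restricted to ASCII to mirror the table.
short_digits = re.findall(r"\d+", text, flags=re.ASCII)
word_chars = re.findall(r"\w+", "snake_case id-7", flags=re.ASCII)

print(digits)          # ['42', '3', '12']
print(letters)         # ['Order', 'apples', 'pears']
print(consonant_runs)  # ['b', 'n', 'n', 'spl', 't']
print(short_digits)    # ['42', '3', '12']
print(word_chars)      # ['snake_case', 'id', '7']
```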
Quantifiers: How Many Times?
Quantifiers specify how many times a pattern should match:
| Quantifier | Meaning | Example |
|---|---|---|
| * | Zero or more | a* matches "", "a", "aaa" |
| + | One or more | a+ matches "a", "aaa" (not "") |
| ? | Zero or one | a? matches "" or "a" |
| {n} | Exactly n | a{3} matches "aaa" |
| {n,} | n or more | a{2,} matches "aa", "aaa", "aaaa" |
| {n,m} | Between n and m | a{2,4} matches "aa", "aaa", "aaaa" |
By default, quantifiers are greedy (match as much as possible). Add ? after a quantifier to make it lazy (match as little as possible). For example, <.*> matches the entire string <div>hello</div>, but <.*?> matches just <div>.
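The difference between greedy and lazy matching is easy to see in Python. This sketch reuses the <div>hello</div> string from the text; the input for the counted quantifier is made up.

```python
import re

html = "<div>hello</div>"

# Greedy: .* grabs as much as it can, so the match runs to the last ">".
greedy = re.search(r"<.*>", html).group()

# Lazy: .*? stops at the first ">" that lets the pattern succeed.
lazy = re.search(r"<.*?>", html).group()

# A counted quantifier takes between two and four a's per match.
counted = re.findall(r"a{2,4}", "a aa aaa aaaaa")

print(greedy)   # <div>hello</div>
print(lazy)     # <div>
print(counted)  # ['aa', 'aaa', 'aaaa']
```

In the last case the run of five a's yields a single four-character match; the leftover "a" is too short to match on its own.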
Groups and Capturing
Parentheses create groups. Groups serve two purposes: applying quantifiers to multiple characters, and capturing matched text for later use.
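Both purposes can be sketched in a few lines of Python. The date string and the word-swapping example are illustrative assumptions, not patterns from the original article.

```python
import re

# Capturing: each pair of parentheses stores the text it matched.
found = re.search(r"(\d{4})-(\d{2})-(\d{2})", "Released on 2024-03-15.")
year, month, day = found.groups()

# Captured groups can be reused in a replacement via \1, \2, ...
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

# Grouping only: (?:...) applies a quantifier to several characters
# without capturing anything.
repeated = re.fullmatch(r"(?:ab)+", "ababab") is not None

print(year, month, day)  # 2024 03 15
print(swapped)           # world hello
print(repeated)          # True
```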
Anchors and Boundaries
Anchors don't match characters — they match positions in the string:
- ^: Start of string (or line, with the m flag)
- $: End of string (or line, with the m flag)
- \b: Word boundary (between a word character and a non-word character)
- \B: Non-word boundary
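Because anchors match positions rather than characters, it helps to look at where matches start. The following sketch uses an invented two-line string and reports match offsets.

```python
import re

text = "cat concat cats\ncat again"

def starts(pattern: str, flags: int = 0) -> list[int]:
    """Return the start offset of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text, flags)]

string_start = starts(r"^cat")                # start of the whole string only
line_starts = starts(r"^cat", re.MULTILINE)   # start of every line
string_end = starts(r"again$")                # "again" at the very end
whole_word = starts(r"\bcat\b")               # "cat" as a standalone word
inside_word = starts(r"\Bcat")                # "cat" preceded by a word character

print(string_start)  # [0]
print(line_starts)   # [0, 16]
print(string_end)    # [20]
print(whole_word)    # [0, 16]
print(inside_word)   # [7]
```

Offset 7 is the "cat" inside "concat"; the \b version skips it, and also skips "cats" because the trailing "s" is a word character.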
Lookahead and Lookbehind
These are advanced patterns that match based on what comes before or after the current position, without including that text in the match:
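The four lookaround forms are (?=...) for positive lookahead, (?!...) for negative lookahead, (?<=...) for positive lookbehind, and (?<!...) for negative lookbehind. Here is a hedged sketch in Python; the price string is made up for illustration.

```python
import re

prices = "Widget $30, gadget 45 EUR, gizmo $7"

# Positive lookbehind: digits preceded by "$" (the "$" is not in the match).
after_dollar = re.findall(r"(?<=\$)\d+", prices)

# Positive lookahead: digits followed by " EUR".
before_eur = re.findall(r"\d+(?= EUR)", prices)

# Negative lookbehind: whole numbers NOT preceded by "$".
not_after_dollar = re.findall(r"(?<!\$)\b\d+", prices)

# Negative lookahead: whole numbers NOT followed by " EUR".
not_before_eur = re.findall(r"\b\d+\b(?! EUR)", prices)

print(after_dollar)      # ['30', '7']
print(before_eur)        # ['45']
print(not_after_dollar)  # ['45']
print(not_before_eur)    # ['30', '7']
```

The \b anchors in the negative cases matter: without them the engine could backtrack and match part of a number (for example just the "4" of "45").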
Flags (Modifiers)
Flags change how the regex engine interprets the pattern:
| Flag | Effect |
|---|---|
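| i | Case-insensitive matching |
| g | Global (find all matches, not just the first) |
| m | Multiline (^ and $ match line starts/ends) |
| s | Dotall (. matches newlines too) |
| u | Unicode mode (proper handling of Unicode) |
| x | Extended (allows whitespace and comments in pattern) |

Flag support varies by language. The g flag belongs to JavaScript; Python has no equivalent flag and uses functions such as re.findall to get all matches. The x flag is available in Python and PCRE-style engines but not in standard JavaScript.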
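In Python the flags are constants on the re module (re.IGNORECASE, re.MULTILINE, re.DOTALL) and can be combined with |. A minimal sketch, using an invented three-line string:

```python
import re

text = "First line\nsecond LINE\nthird line"

# i: ignore case.
ignore_case = re.findall(r"line", text, flags=re.IGNORECASE)

# m: ^ matches at the start of every line.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# s: without it, "." stops at a newline, so this search fails.
dot_default = re.search(r"First.*third", text)
dot_all = re.search(r"First.*third", text, flags=re.DOTALL)

# Flags combine with the | operator.
combined = re.findall(r"^\w+ line$", text, flags=re.IGNORECASE | re.MULTILINE)

print(ignore_case)          # ['line', 'LINE', 'line']
print(line_starts)          # ['First', 'second', 'third']
print(dot_default)          # None
print(dot_all is not None)  # True
print(combined)             # ['First line', 'second LINE', 'third line']
```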
Practical Examples
Email Validation
A typical email pattern matches most common address formats. It's not perfect (email validation is notoriously complex), but it covers the formats you are most likely to encounter.
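One pattern that fits this description is sketched below. It is an assumption for illustration rather than a definitive validator: it asks for a local part, an @, a domain, and a dot followed by at least two letters.

```python
import re

# Assumed illustrative pattern: local part, "@", domain, ".", 2+ letter TLD.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def looks_like_email(value: str) -> bool:
    """Return True if the whole string has a common email shape."""
    # fullmatch anchors the pattern at both ends, like ^...$.
    return EMAIL_RE.fullmatch(value) is not None

print(looks_like_email("user@example.com"))                  # True
print(looks_like_email("first.last+tag@sub.example.co.uk"))  # True
print(looks_like_email("no-at-sign.example.com"))            # False
print(looks_like_email("user@localhost"))                    # False
```

The last case shows the trade-off: an address with no dot in the domain is rejected here even though some systems accept it.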
URL Matching
Matches http and https URLs with optional paths.
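A pattern along these lines would do that. It is a simplified sketch under the assumption that hosts and paths use only letters, digits, and a few common punctuation characters; query strings and ports are not handled.

```python
import re

# Assumed illustrative pattern: scheme, host, then an optional path.
URL_RE = re.compile(r"https?://[\w.-]+(?:/[\w./%-]*)?")

text = "Docs at https://example.com/docs/intro and http://test.org, not ftp://files.net"
urls = URL_RE.findall(text)

print(urls)  # ['https://example.com/docs/intro', 'http://test.org']
```

The s? makes the "s" optional, so both schemes match, and the non-capturing group (?:...)? makes the whole path optional.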
Phone Number (US)
Matches formats like (123) 456-7890, 123-456-7890, 123.456.7890, and 1234567890.
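The four formats listed can be covered by a pattern like the one below. Treat it as a sketch: it is deliberately permissive and will also accept some malformed input, such as an opening parenthesis with no closing one.

```python
import re

# Assumed illustrative pattern: optional parentheses around the area code,
# optional separators (dash, dot, or whitespace) between the three parts.
PHONE_RE = re.compile(r"\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}")

samples = [
    "(123) 456-7890",
    "123-456-7890",
    "123.456.7890",
    "1234567890",
    "123-45-6789",  # wrong grouping, should not match
    "12345",        # too short, should not match
]
valid = [s for s in samples if PHONE_RE.fullmatch(s)]

print(valid)
# ['(123) 456-7890', '123-456-7890', '123.456.7890', '1234567890']
```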
Password Validation
Requires at least 8 characters with at least one lowercase, one uppercase, one digit, and one special character.
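Rules like these are usually expressed with one lookahead per requirement. The pattern below is an assumed example of that approach, where "special character" is taken to mean anything that is not an ASCII letter or digit.

```python
import re

# Each (?=...) checks one requirement without consuming any text;
# .{8,} then enforces the minimum length.
PASSWORD_RE = re.compile(
    r"(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^A-Za-z0-9]).{8,}"
)

def is_strong(password: str) -> bool:
    """Return True if the password meets all four character rules and the length rule."""
    return PASSWORD_RE.fullmatch(password) is not None

print(is_strong("Str0ng!Pass"))  # True
print(is_strong("weakpass"))     # False (no uppercase, digit, or special)
print(is_strong("NoDigits!!"))   # False (no digit)
print(is_strong("Sh0rt!"))       # False (fewer than 8 characters)
```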
Extract Numbers from Text
Matches whole numbers and numbers with exactly two decimal places (useful for prices).
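A pattern matching that description could look like the following sketch. The sample sentence is invented, and the word boundaries are an added assumption to keep the pattern from matching part of a longer number.

```python
import re

# Digits, optionally followed by a dot and exactly two more digits.
NUMBER_RE = re.compile(r"\b\d+(?:\.\d{2})?\b")

text = "Total: 3 items at 19.99 each, 100 in stock"
numbers = NUMBER_RE.findall(text)

print(numbers)  # ['3', '19.99', '100']
```

findall returns strings, so convert with int or float (or decimal.Decimal for prices) before doing arithmetic.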
Regex in Programming Languages
JavaScript
Python
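Python ships regex support in the standard-library re module. The sketch below shows the calls you are likely to reach for first (re.compile, search, findall, sub, and re.split) on an invented log line.

```python
import re

log = "2024-03-15 ERROR disk full; 2024-03-16 INFO backup done"

# Compile once if the pattern is reused.
date_re = re.compile(r"\d{4}-\d{2}-\d{2}")

first = date_re.search(log)              # first match, or None
all_dates = date_re.findall(log)         # every match as a list of strings
redacted = date_re.sub("<date>", log)    # replace every match
entries = re.split(r";\s*", log)         # split on a pattern

print(first.group())  # 2024-03-15
print(all_dates)      # ['2024-03-15', '2024-03-16']
print(redacted)       # <date> ERROR disk full; <date> INFO backup done
print(entries)        # ['2024-03-15 ERROR disk full', '2024-03-16 INFO backup done']
```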
Tips for Writing Better Regex
- Start simple and build up. Write the pattern incrementally. Test each addition before making the next change.
- Use raw strings in Python. Prefix with r to avoid escape sequence issues: r"\d+" instead of "\\d+".
- Comment complex patterns. Use the x flag to add comments and whitespace to make your regex readable.
- Test with edge cases. Always test with empty strings, very long strings, and unexpected input.
- Use an online tester. Tools like Risetop's regex tester let you experiment with patterns and see matches highlighted in real-time.
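Two of these tips can be shown together in Python: raw strings and the x flag, which Python spells re.VERBOSE. The date pattern here is an illustrative assumption.

```python
import re

# A raw string and its escaped equivalent are the same pattern.
same_pattern = r"\d+" == "\\d+"

# re.VERBOSE ignores whitespace in the pattern and allows # comments.
date_re = re.compile(
    r"""
    (\d{4})   # year
    -         # separator
    (\d{2})   # month
    -         # separator
    (\d{2})   # day
    """,
    re.VERBOSE,
)

parts = date_re.fullmatch("2024-03-15").groups()
rejects_empty = date_re.fullmatch("") is None  # edge case: empty string

print(same_pattern)   # True
print(parts)          # ('2024', '03', '15')
print(rejects_empty)  # True
```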
When NOT to Use Regex
Regex is powerful, but it's not the right tool for everything:
- HTML/XML parsing: Use a proper parser. HTML is not regular — regex will break on nested structures.
- Complex validation: For things like email validation, use a dedicated library instead of a single regex.
- Performance-critical code: Badly written regex can cause catastrophic backtracking. Profile before using regex in hot paths.
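To make the HTML point concrete, the sketch below compares a naive regex with Python's standard-library html.parser on a nested snippet. The TextCollector class is a hypothetical helper written for this example.

```python
import re
from html.parser import HTMLParser

html = "<div><div>inner</div>outer</div>"

# A lazy regex stops at the first closing tag, cutting the outer div short.
naive = re.search(r"<div>(.*?)</div>", html).group(1)

class TextCollector(HTMLParser):
    """Collect text nodes and track how deeply tags are nested."""

    def __init__(self) -> None:
        super().__init__()
        self.depth = 0
        self.max_depth = 0
        self.text: list[str] = []

    def handle_starttag(self, tag, attrs):
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        self.depth -= 1

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())

collector = TextCollector()
collector.feed(html)
collector.close()

print(naive)                # <div>inner
print(collector.text)       # ['inner', 'outer']
print(collector.max_depth)  # 2
```

The parser keeps track of nesting as it goes, which is exactly the bookkeeping a single regular expression cannot do.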
Conclusion
Regular expressions are one of the most useful skills you can develop as a developer. They're not as scary as they look — once you understand the core concepts (metacharacters, character classes, quantifiers, groups, and anchors), you can read and write most patterns with confidence.
Start with the patterns in this guide, practice with an online regex tester, and gradually build your pattern library. Before long, regex will feel like second nature — and you'll wonder how you ever worked without it.