Published on April 11, 2026 · 6 min read
If you've ever written /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/ and called it a day, you're in good company. Most developers treat email validation as a regex problem. It's not — it's a systems problem. Understanding how email actually works will change how you validate it.
The RFC 5322 specification for email addresses allows syntax that would make most regex patterns choke. Valid addresses include quotes, comments, IP address literals, and unusual domain formats. But the real problem isn't matching the spec — it's that a syntactically valid email address tells you almost nothing about whether it can actually receive mail.
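To make that gap concrete, here is a small sketch that runs the pattern from the introduction against a few addresses. The sample addresses and the helper name `matches_simple_pattern` are illustrative choices for this sketch, not data from a real system.

```python
import re

# The common pattern quoted at the top of this article
SIMPLE_PATTERN = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')

def matches_simple_pattern(email):
    """Return True when the address matches the simple regex."""
    return SIMPLE_PATTERN.match(email) is not None

# Illustrative samples (assumed for this sketch)
SAMPLES = [
    'user+tag@gmail.com',       # ordinary address with a plus tag
    '"john doe"@example.com',   # quoted local part, legal under RFC 5322
    'user@[192.0.2.1]',         # IP address literal, legal under RFC 5322
    'user@gmial.com',           # well-formed, but the domain is a typo
]

for address in SAMPLES:
    print(address, matches_simple_pattern(address))
```

The two RFC-legal oddities are rejected, while the mistyped domain passes. The pattern fails in both directions at once.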
In production systems, we've seen regex validation pass for addresses that were still unusable, for example ones with a mistyped domain (gmial.com instead of gmail.com).
Robust email validation happens in layers, each catching different categories of problems.
Don't try to match RFC 5322 perfectly. Instead, validate what's practically useful:
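    import re

    def is_valid_syntax(email):
        # Pragmatic pattern: common local-part characters, a domain, a 2+ letter TLD
        pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
        # fullmatch also rejects a trailing newline, which '$' alone would allow
        if not re.fullmatch(pattern, email):
            return False
        local, domain = email.rsplit('@', 1)
        # Length limits: 64 characters for the local part, 255 for the domain
        if len(local) > 64 or len(domain) > 255:
            return False
        return True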
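This catches obviously malformed input while accepting the address forms people use in practice, including plus tags. It deliberately rejects RFC-legal rarities such as quoted local parts and IP address literals. Tools like RiseTop's Email Validator perform this check instantly along with deeper verification layers.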
After syntax, check if the domain can actually receive email by querying its MX (Mail Exchange) records:
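    import dns.exception
    import dns.resolver

    def has_valid_mx(domain):
        try:
            # Ask the resolver for the domain's MX records
            records = dns.resolver.resolve(domain, 'MX')
            return len(records) > 0
        except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN,
                dns.resolver.NoNameservers, dns.exception.Timeout):
            # No MX answer, nonexistent domain, or the lookup itself failed
            return False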
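A domain with no MX records usually can't receive email. (Strictly, RFC 5321 lets a sender fall back to the domain's address record, but in practice most mail domains publish MX records.) This single check eliminates a large share of invalid addresses: typos in domain names, made-up domains, and expired domains all fail here.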
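If you want to go beyond a yes/no answer, the lookup result can be turned into an ordered list of hosts to try. The sketch below is one possible way to do that, assuming the MX answer has already been converted into `(preference, exchange)` tuples. The function name `mail_hosts` is hypothetical. It covers two edge cases from the standards: an empty answer falls back to the domain itself (implicit MX, RFC 5321), and a single record pointing at `.` means the domain accepts no mail at all (null MX, RFC 7505).

```python
def mail_hosts(domain, mx_records):
    """Return hostnames to try for delivery, most preferred first.

    mx_records is a list of (preference, exchange) tuples taken from an
    MX lookup. An empty list means no MX records were found.
    """
    if not mx_records:
        # Implicit MX: fall back to the domain's own address record
        return [domain]
    hosts = [(pref, exchange.rstrip('.')) for pref, exchange in mx_records]
    if len(hosts) == 1 and hosts[0][1] == '':
        # Null MX: the domain declares that it accepts no mail
        return []
    # Lower preference values are tried first
    return [host for _, host in sorted(hosts)]
```

An empty result is a definite "no". A non-empty one only says where to look next, since the fallback host may not run a mail server.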
The most thorough check: actually connect to the mail server and ask if the mailbox exists. This involves opening an SMTP connection, initiating a dialogue, and issuing an RCPT TO command.
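    import smtplib
    import dns.resolver

    def verify_smtp(email, from_addr="verify@example.com"):
        try:
            domain = email.rsplit('@', 1)[1]
            records = dns.resolver.resolve(domain, 'MX')
            # Use the most preferred exchanger (lowest preference value)
            best = min(records, key=lambda r: r.preference)
            mx_host = str(best.exchange).rstrip('.')
            # The context manager sends QUIT and closes the socket on exit
            with smtplib.SMTP(mx_host, 25, timeout=10) as server:
                server.helo("verify.example.com")
                server.mail(from_addr)
                code, message = server.rcpt(email)
            # 250 means the server accepted the recipient
            return code == 250
        except Exception:
            # DNS failure, refused connection, timeout, or a malformed address
            return False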
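The probe above treats every reply other than 250 as a failure. SMTP reply codes carry a little more information than that: 2xx replies signal success, 4xx replies signal a temporary condition, and 5xx replies signal a permanent rejection. A minimal sketch of a three-way interpretation follows, assuming you would rather retry later than discard an address on a temporary error. The function name and the three labels are hypothetical.

```python
def classify_rcpt_reply(code):
    """Map the reply code from RCPT TO to a three-way verdict."""
    if code in (250, 251):
        # 250: recipient OK; 251: user not local, the server will forward
        return 'deliverable'
    if 500 <= code <= 599:
        # Permanent failure, for example 550 (mailbox unavailable)
        return 'undeliverable'
    # 4xx replies are temporary; anything else is unexpected at this stage
    return 'unknown'
```

Some servers accept every recipient at this stage, so even a `'deliverable'` verdict is a strong hint rather than proof.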
Many invalid emails are just typos. A practical approach is to maintain a list of common domain misspellings and suggest corrections:
- gmial.com → gmail.com
- yaho.com → yahoo.com
- hotmal.com → hotmail.com
- outlok.com → outlook.com

Using edit distance algorithms (like Levenshtein distance) with a threshold of 2, you can automatically catch and suggest corrections for the vast majority of domain typos.
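The edit-distance approach can be sketched in plain Python. This is a minimal version, assuming a short hand-picked list of popular domains. `KNOWN_DOMAINS` and `suggest_domain` are names made up for this example, and a real list would be longer.

```python
# Assumed list of popular domains to compare against
KNOWN_DOMAINS = ['gmail.com', 'yahoo.com', 'hotmail.com', 'outlook.com']

def levenshtein(a, b):
    """Edit distance counting insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free on a match
            ))
        previous = current
    return previous[-1]

def suggest_domain(domain, max_distance=2):
    """Return the closest known domain within max_distance, or None."""
    domain = domain.lower()
    if domain in KNOWN_DOMAINS:
        return None  # already a known domain, nothing to suggest
    best = min(KNOWN_DOMAINS, key=lambda known: levenshtein(domain, known))
    return best if levenshtein(domain, best) <= max_distance else None
```

Treat the result as a "did you mean?" prompt rather than an automatic correction. A legitimate domain such as mail.com sits within distance 1 of gmail.com, so silently rewriting it would break a valid address.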
If you're running a service where email uniqueness matters (signups, trial accounts, notifications), disposable email addresses are a real problem. Services like 10minutemail, Guerrilla Mail, and similar providers generate temporary addresses that pass all validation checks but are useless for long-term communication.
The solution is maintaining a blocklist of known disposable email domains. Several open-source lists exist (like the disposable-email-domains GitHub repository) that are regularly updated with hundreds of disposable email providers.
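A blocklist lookup can be as simple as a set membership test. The sketch below assumes the list has already been loaded into a Python set. The three domains shown are only a tiny illustrative sample, and `is_disposable` is a hypothetical helper name. It also checks parent domains, on the assumption that a provider may hand out addresses on subdomains.

```python
# Tiny illustrative sample; a real deployment would load a maintained list
DISPOSABLE_DOMAINS = {'10minutemail.com', 'guerrillamail.com', 'mailinator.com'}

def is_disposable(email, blocklist=DISPOSABLE_DOMAINS):
    """Return True if the address's domain or a parent domain is blocklisted."""
    domain = email.rsplit('@', 1)[-1].lower().rstrip('.')
    labels = domain.split('.')
    # Check "a.b.example.com", then "b.example.com", then "example.com";
    # the bare top-level domain is never checked
    return any('.'.join(labels[i:]) in blocklist
               for i in range(len(labels) - 1))
```

Because new disposable providers appear constantly, the list needs a regular refresh to stay useful.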
Plus addressing (user+tag@gmail.com) is valid and widely used. Rejecting the + character breaks a legitimate feature.
Email validation is a spectrum. For a simple contact form, syntax + typo detection is sufficient. For a SaaS signup, add DNS verification. For payment processing, add SMTP probing. Match your validation depth to the risk and cost of a bad email address entering your system.
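One way to express that matching in code is a table of layers per use case, run in order until one fails. This is a sketch under stated assumptions: the use-case names, the layer names, and the `validate` function are invented for illustration, and the individual checks are passed in as plain functions so that cheap and expensive layers can be swapped freely.

```python
# Which layers each use case runs, following the depth-versus-risk idea
LAYERS_BY_USE_CASE = {
    'contact_form': ['syntax', 'typo'],
    'saas_signup': ['syntax', 'typo', 'dns'],
    'payment': ['syntax', 'typo', 'dns', 'smtp'],
}

def validate(email, use_case, checks):
    """Run the layers for use_case in order and stop at the first failure.

    checks maps a layer name to a function that takes the email and
    returns True or False. Returns (ok, failed_layer), where
    failed_layer is None on success.
    """
    for layer in LAYERS_BY_USE_CASE[use_case]:
        if not checks[layer](email):
            return False, layer
    return True, None
```

Running the cheap layers first means the slow network checks only run for addresses that have already passed the local ones.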