The Short Answer
Use SHA-256 for anything security-sensitive: digital signatures, file integrity verification, certificates, message authentication. Passwords are the one exception: they need a deliberately slow algorithm such as bcrypt or Argon2, not a bare hash. MD5 is only appropriate for non-security purposes like cache keys, deduplication fingerprints, and legacy compatibility checks where collision resistance doesn't matter.
If that's all you needed, great. But understanding why makes you a better engineer — and helps you spot when MD5 is being misused in your own codebase.
How Hash Algorithms Work
Both MD5 and SHA-256 are cryptographic hash functions. They take an input of arbitrary length and produce a fixed-length output (a "digest" or "hash"). The same input always produces the same output. Changing even a single bit flips roughly half the output bits — this is the avalanche effect.
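To see the avalanche effect for yourself, here is a minimal sketch that hashes two messages differing in a single bit and counts how many output bits change. The helper name `bit_difference` and the sample messages are made up for illustration; expect a count near half the digest length, not an exact value.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Two inputs that differ in exactly one bit: "a" is 0x61, "c" is 0x63.
msg1 = b"hash me: a"
msg2 = b"hash me: c"

for name in ("md5", "sha256"):
    d1 = hashlib.new(name, msg1).digest()
    d2 = hashlib.new(name, msg2).digest()
    total_bits = len(d1) * 8
    print(f"{name}: {bit_difference(d1, d2)} of {total_bits} bits differ")
```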
The critical property for security is collision resistance: it should be computationally infeasible to find two different inputs that produce the same hash. This is where MD5 falls apart and SHA-256 holds strong.
Head-to-Head Comparison
| Property | MD5 | SHA-256 |
|---|---|---|
| Output Length | 128 bits (32 hex chars) | 256 bits (64 hex chars) |
| Designed | 1991 (Ron Rivest; published as RFC 1321 in 1992) | 2001 (NSA; published by NIST) |
| Collision Attacks | Broken since 2004 | No practical attacks known |
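| Speed | Roughly 2x faster in software; the gap shrinks or reverses on CPUs with SHA extensions | Baseline |
| Use in SSL/TLS | Deprecated for signatures in TLS 1.2; absent from TLS 1.3 | Standard |
| Password Hashing | Never acceptable | Only with a salt inside a key-stretching scheme (e.g. PBKDF2); prefer bcrypt/Argon2 |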
The MD5 Collision Problem
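In 2004, Xiaoyun Wang and her co-authors demonstrated a practical collision attack on MD5. In 2008, researchers used a chosen-prefix MD5 collision to create a rogue CA certificate: they crafted it to have the same MD5 hash as an ordinary certificate that a trusted commercial CA had signed for them, so the CA's signature was equally valid on the rogue one. With it, a man-in-the-middle attacker could present forged certificates that browsers would trust.
The lesson is twofold. First, a 128-bit output caps generic collision resistance at about 2^64 operations (the birthday bound), which is thin by modern standards. Second, and far worse, MD5's internal structure is flawed: cryptanalytic advances reduced the cost of an identical-prefix collision to roughly 2^18 compression-function calls, achievable in seconds on a modern laptop. Chosen-prefix collisions, the kind used against certificates, cost more but are still practical.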
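The birthday bound is easy to watch in action on a shrunken hash. The sketch below is not the cryptanalytic attack on MD5; it brute-forces a collision on only the first 24 bits of an MD5 digest, which should take a few thousand hashes rather than the 2^24 you might naively expect. The function names and the 24-bit truncation are assumptions chosen to keep the demo fast.

```python
import hashlib
import math


def expected_birthday_work(bits: int) -> float:
    """Approximate number of random hashes for a 50% chance of a collision."""
    return math.sqrt(2 * math.log(2) * 2 ** bits)


def find_truncated_collision(bits: int = 24):
    """Find two inputs whose MD5 digests share the same first `bits` bits."""
    seen = {}  # truncated digest -> message that produced it
    counter = 0
    while True:
        msg = str(counter).encode()
        full = int.from_bytes(hashlib.md5(msg).digest(), "big")
        prefix = full >> (128 - bits)  # keep only the top `bits` bits
        if prefix in seen:
            return seen[prefix], msg, counter + 1
        seen[prefix] = msg
        counter += 1


m1, m2, tries = find_truncated_collision(24)
print(f"24-bit collision after {tries} hashes: {m1!r} vs {m2!r}")
print(f"birthday estimate: ~{expected_birthday_work(24):.0f} hashes")
```

Scale the same square-root relationship up to the full 128 bits and you get the 2^64 figure above.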
What "Broken" Actually Means
"Broken" doesn't mean you can reverse an MD5 hash to find the original input: no practical preimage attack on MD5 is known, although short or guessable inputs can always be found by brute force. It means an attacker can craft two files with the same hash. For a file download integrity check where the attacker controls neither the file nor the published hash, MD5 is technically still fine. But why take the risk when SHA-256 is everywhere?
When MD5 Is Still Fine
Despite its cryptographic weaknesses, MD5 remains widely used in non-security contexts:
- Cache keys: Applications commonly MD5 long or awkward keys before storing them in Redis or Memcached, and some HTTP caches name entries by an MD5 of the request key. The key space just needs to be large enough to avoid accidental collisions, not resist adversarial ones.
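- Content deduplication: Storing files indexed by MD5 hash to avoid duplicates. An accidental collision is astronomically unlikely, but know the failure mode: two different files would be treated as one, and the second would be silently dropped. That is acceptable only when the files come from trusted sources; if users can upload content, they can craft collisions, so use SHA-256.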
- ETag generation: HTTP ETags for conditional requests. Browsers don't attack your ETags.
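- Quick fingerprinting in logs: Hashing user IDs or IP addresses so raw values stay out of logs, where the hash isn't used for authentication. Treat this as pseudonymization, not true anonymization: a small input space such as IPv4 addresses can be enumerated and matched back, whichever hash you pick.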
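For the cache-key and ETag cases, a rough sketch could look like the following. The helper names `cache_key` and `make_etag` are hypothetical, and the `usedforsecurity=False` flag assumes Python 3.9 or newer; it signals that the digest is a plain fingerprint, which also keeps the call working on builds that restrict MD5.

```python
import hashlib


def cache_key(namespace: str, raw_key: str) -> str:
    """Turn an arbitrarily long key into a short, fixed-length cache key."""
    digest = hashlib.md5(raw_key.encode("utf-8"), usedforsecurity=False).hexdigest()
    return f"{namespace}:{digest}"


def make_etag(body: bytes) -> str:
    """Build a quoted ETag value from a response body."""
    return '"' + hashlib.md5(body, usedforsecurity=False).hexdigest() + '"'


print(cache_key("search", "q=hash+functions&page=2&sort=date"))
print(make_etag(b"<html>hello</html>"))
```

Neither value protects anything; they only need to change when the input changes.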
When You Must Use SHA-256
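- Password storage: Use a dedicated password-hashing algorithm such as bcrypt or Argon2 with a unique salt per password. If you are limited to SHA-256, use it only inside a key-stretching construction such as PBKDF2-HMAC-SHA256. Never store a single round of MD5 or SHA-256, salted or not.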
- Digital signatures: Code signing, TLS certificates, software distribution. Their signature algorithms typically use SHA-256 (or another SHA-2 variant) as the default hash.
- Blockchain: Bitcoin uses double SHA-256 as its core hashing primitive. Not every chain does: Ethereum, for example, uses Keccak-256 instead.
- File integrity for downloads: When you publish a hash for users to verify, use SHA-256. It's the de facto standard, and every OS ships tools that generate it.
- API authentication: HMAC-SHA256 is the standard for signing API requests (AWS, Stripe, most SaaS platforms).
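As an illustration of the API-authentication case, here is a minimal HMAC-SHA256 sketch using Python's standard library. The `sign` and `verify` helpers, the secret, and the message are invented for the example; real providers such as AWS and Stripe each define their own message format and headers, which this does not reproduce.

```python
import hashlib
import hmac


def sign(secret: bytes, message: bytes) -> str:
    """Return the hex HMAC-SHA256 signature of a message."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, message: bytes, signature: str) -> bool:
    """Check a signature using a constant-time comparison."""
    expected = sign(secret, message)
    return hmac.compare_digest(expected, signature)


secret = b"demo-shared-secret"  # placeholder; load real secrets from config
body = b'{"amount": 1000, "currency": "usd"}'

signature = sign(secret, body)
print(verify(secret, body, signature))         # unchanged body
print(verify(secret, body + b" ", signature))  # tampered body
```

The constant-time comparison matters: comparing signatures with `==` can leak how many leading characters matched.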
Generating Hashes in Practice
In the terminal:
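```shell
# MD5 (on macOS: md5 myfile.zip)
md5sum myfile.zip
# Example output (this particular digest is the MD5 of an empty file):
# d41d8cd98f00b204e9800998ecf8427e  myfile.zip

# SHA-256 (on macOS: shasum -a 256 myfile.zip)
sha256sum myfile.zip
# Example output (the SHA-256 of an empty file, truncated):
# e3b0c44298fc1c149afbf4c8996fb924...  myfile.zip
```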
In Python:
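```python
import hashlib

# hashlib works on bytes, so encode any text before hashing it
data = b"Hello, World!"

md5_hash = hashlib.md5(data).hexdigest()        # 32 hex characters
sha256_hash = hashlib.sha256(data).hexdigest()  # 64 hex characters

print(f"MD5: {md5_hash}")
print(f"SHA-256: {sha256_hash}")
```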
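For real files you usually don't want to load everything into memory first. One possible approach, sketched here with hypothetical helper names and an arbitrary 1 MiB chunk size, is to feed the file to SHA-256 in chunks and compare the result with the digest the publisher lists:

```python
import hashlib


def sha256_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size chunks so large files never sit fully in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path: str, published_hex: str) -> bool:
    """Compare a file's SHA-256 with a published hex digest."""
    return sha256_file(path) == published_hex.strip().lower()
```

Calling `matches_published("myfile.zip", digest_from_website)` then returns `True` only if the download is byte-for-byte what the publisher hashed.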
Or skip the code entirely — paste your input into [RiseTop's hash generator](/tools/file-hash-calculator.html) and get instant MD5, SHA-1, SHA-256, and SHA-512 results. It handles both text input and file uploads.
Performance: Does It Matter?
On CPUs without dedicated SHA instructions, SHA-256 is roughly 2x slower than MD5; on processors with SHA extensions, the gap narrows or even reverses. For most applications the difference is negligible: you're talking about microseconds per hash for small inputs, and hashing a 1 GB file takes at most a few seconds longer with SHA-256.
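Numbers like these depend heavily on the CPU and the Python build, so it is worth measuring on your own machine. A rough benchmark sketch follows; the helper name, the 8 MB payload, and the round count are arbitrary choices, and the results are only indicative.

```python
import hashlib
import time


def throughput_mb_s(name: str, payload: bytes, rounds: int = 20) -> float:
    """Measure approximate hashing throughput in megabytes per second."""
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.new(name, payload).digest()
    elapsed = time.perf_counter() - start
    return (len(payload) * rounds) / elapsed / 1_000_000


payload = b"\x00" * (8 * 1024 * 1024)  # 8 MB of zero bytes
for name in ("md5", "sha256"):
    print(f"{name}: {throughput_mb_s(name, payload):.0f} MB/s")
```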
The one scenario where speed matters is password hashing, but in that case you want it to be slow. That's why bcrypt and Argon2 exist: they're intentionally designed to be computationally expensive, making brute-force attacks impractical. Raw SHA-256, even salted, is barely an improvement over MD5 here, because both can be computed billions of times per second on a GPU.
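bcrypt and Argon2 need third-party packages in Python, so the sketch below swaps in PBKDF2-HMAC-SHA256 from the standard library to show the same idea: a per-user salt plus deliberate slowness. The function names are hypothetical, and the iteration count is an assumption; tune it to your hardware and current guidance.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; raise it as hardware gets faster


def hash_password(password: str, iterations: int = ITERATIONS) -> tuple[bytes, bytes]:
    """Derive a slow, salted key from a password. Returns (salt, key)."""
    salt = os.urandom(16)  # unique random salt per password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes,
                    iterations: int = ITERATIONS) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(key, expected_key)
```

Store the salt and iteration count next to the derived key so you can verify later and raise the work factor over time.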
Key Takeaways
- MD5 is cryptographically broken for collision resistance. Don't use it for security.
- SHA-256 is the default choice for any security-related hashing in 2026.
- MD5 is still useful for non-adversarial contexts like caching, deduplication, and logging.
- For passwords, skip both and use bcrypt or Argon2 with per-user salts.