Reusing a password that has appeared in a data breach is one of the most reliable ways to get an account taken over. Attackers collect billions of leaked username-password pairs and replay them across other services in what's called credential stuffing. If your breached password from one site is also your password somewhere else, the attacker walks in. Checking your passwords against known breaches is therefore genuinely useful — but the check itself looks like a privacy catastrophe waiting to happen.
The breakthrough was figuring out how to answer "is this specific password in the breach corpus?" without the asking party ever transmitting the password, and without the answering server learning which password was being queried. The mechanism behind services like Have I Been Pwned's Pwned Passwords is worth understanding in detail, because it is a clean example of designing privacy into a protocol instead of promising it in a policy.
Step One: Never Send the Password
The client never sends the password. It hashes it locally first. In the original Pwned Passwords design, the password is run through SHA-1 to produce a 40-character hexadecimal hash. (SHA-1 is broken for collision resistance, but here it functions only as a fast, consistent fingerprint, not as a security boundary — the design does not rely on SHA-1 being collision-proof.)
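As a minimal sketch of this local hashing step, assuming Python's standard `hashlib` and UTF-8 encoding of the password: the helper name `sha1_hex` is hypothetical, and the uppercase output is a convention chosen here for readability, not something the text above specifies.

```python
import hashlib

def sha1_hex(password: str) -> str:
    """Return the 40-character SHA-1 hex digest of a password, in uppercase."""
    # The password is encoded and hashed entirely on the client; nothing is sent anywhere.
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()

digest = sha1_hex("password")
print(len(digest))  # 40
```

The digest is used here only as a fingerprint for lookup. It is not a way to store passwords.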
A hash alone, though, isn't enough. If the client sent the full hash, the server would learn exactly which password was being checked, because an unsalted SHA-1 hash of any known password can be reversed with a simple table lookup. So the protocol sends only a fragment of it.
The client splits the hash into a short prefix and a longer suffix, and sends only the prefix. The server can't tell which of the many passwords sharing that prefix you're actually asking about. Your query hides in a crowd.
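The split can be sketched as follows. This is an illustration, not a reference implementation: `split_hash` and `PREFIX_LEN` are hypothetical names, and the prefix length of five hex characters is taken from the Pwned Passwords design described in this article.

```python
import hashlib

PREFIX_LEN = 5  # assumption: five hex characters, as in the Pwned Passwords design

def split_hash(password: str) -> tuple[str, str]:
    """Hash locally, then split into (prefix to send, suffix to keep)."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    # Only the first element should ever leave the device.
    return digest[:PREFIX_LEN], digest[PREFIX_LEN:]

prefix, suffix = split_hash("password")
print(prefix)       # 5BAA6
print(len(suffix))  # 35
```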
Step Two: The Range Query
The client takes the first 5 hexadecimal characters of the hash — the prefix — and sends only those to the server. Five hex characters define a "range," and the server responds with the suffixes of every breached hash that begins with that prefix, along with how many times each was seen in breaches.
A 5-character hex prefix has 16^5 = 1,048,576 possible values. With hundreds of millions of breached hashes spread across those prefixes, any given prefix matches a sizable bucket of candidate hashes, typically hundreds. The server returns the whole bucket. It has no way to know which entry, if any, you care about.
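The bucket-size arithmetic can be checked in a few lines. The corpus size below is a hypothetical round number chosen only for illustration, and the calculation assumes hashes are spread evenly across prefixes, which is a reasonable approximation for SHA-1 output.

```python
PREFIX_LEN = 5
ASSUMED_CORPUS_SIZE = 500_000_000  # hypothetical figure, for illustration only

num_prefixes = 16 ** PREFIX_LEN                # each hex character has 16 values
avg_bucket = ASSUMED_CORPUS_SIZE / num_prefixes

print(num_prefixes)       # 1048576
print(round(avg_bucket))  # 477
```

A larger corpus grows the average bucket proportionally, which enlarges the crowd each query hides in at the cost of a bigger response.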
- Client computes `SHA1(password)`, e.g. `5BAA61E4...`
- Client sends the first 5 chars: `5BAA6`
- Server returns all suffixes (chars 6–40) for hashes starting `5BAA6`, each with a breach count
- Client scans that list locally for its own suffix
- A match means the password is breached, and the count shows how exposed it is; all of this is determined on the client
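The steps above can be sketched end to end. To keep the example runnable offline, the network call is replaced by a hypothetical stand-in, `fake_fetch_range`, backed by a made-up two-entry corpus with invented counts. The `SUFFIX:COUNT` line format is modeled on the Pwned Passwords range response; all function names here are assumptions for illustration.

```python
import hashlib
from typing import Callable, Dict

def sha1_upper(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest().upper()

def parse_range_response(body: str) -> Dict[str, int]:
    """Parse lines of the form SUFFIX:COUNT into {suffix: count}."""
    counts = {}
    for line in body.splitlines():
        suffix, sep, count = line.strip().partition(":")
        if sep:  # skip blank lines
            counts[suffix.upper()] = int(count)
    return counts

def breach_count(password: str, fetch_range: Callable[[str], str]) -> int:
    """Return how often the password appears in the corpus (0 if absent)."""
    digest = sha1_upper(password)
    prefix, suffix = digest[:5], digest[5:]
    bucket = parse_range_response(fetch_range(prefix))  # only the prefix is sent
    return bucket.get(suffix, 0)                        # comparison stays local

# Hypothetical stand-in for the server: a tiny corpus with made-up counts.
FAKE_CORPUS = {"password": 3, "123456": 7}

def fake_fetch_range(prefix: str) -> str:
    lines = []
    for entry, count in FAKE_CORPUS.items():
        digest = sha1_upper(entry)
        if digest.startswith(prefix):
            lines.append(f"{digest[5:]}:{count}")
    return "\n".join(lines)

print(breach_count("password", fake_fetch_range))                      # 3
print(breach_count("correct horse battery staple", fake_fetch_range))  # 0
```

Passing the fetch function as a parameter is a design choice for this sketch: it makes it easy to confirm that the only value handed to the server side is the 5-character prefix.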
Why This Counts as k-Anonymity
The "k" in k-anonymity refers to crowd size: your real query is indistinguishable from at least k−1 others. Here, the server sees only a prefix shared by a large set of possible passwords, so from its perspective your request is identical to anyone else querying any password in that bucket. It learns the bucket, never the item.
| What the server learns | What the server never learns |
|---|---|
| A 5-character hash prefix shared by hundreds of candidate hashes | Your password |
| That someone queried that prefix range | Which specific hash in the range you were checking |
| Approximate timing of the request | Whether you got a match — the comparison happens on your device |
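To make the server's side of that table concrete, here is a toy simulation of what the server observes. Everything in it is an assumption for illustration: the corpus is 2,000 synthetic strings, and the prefix is shortened to 2 hex characters so that such a small corpus still fills its buckets. The real design uses 5.

```python
import hashlib
from collections import defaultdict

DEMO_PREFIX_LEN = 2  # assumption: shortened from 5 so a tiny corpus fills buckets

def sha1_upper(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest().upper()

# Hypothetical "breach corpus" of synthetic strings, for illustration only.
corpus = [f"pw{i}" for i in range(2000)]

# Group suffixes by prefix, the way a range-query server would index them.
buckets = defaultdict(list)
for entry in corpus:
    digest = sha1_upper(entry)
    buckets[digest[:DEMO_PREFIX_LEN]].append(digest[DEMO_PREFIX_LEN:])

def server_view(password: str):
    """Return all the server can observe: the prefix and the bucket size k."""
    prefix = sha1_upper(password)[:DEMO_PREFIX_LEN]
    return prefix, len(buckets[prefix])

prefix, k = server_view("pw42")
print(f"server saw prefix {prefix}; {k} breached hashes share it")
```

The value `k` is the size of the crowd: every client whose hash falls in that bucket sends the same prefix, so their requests look identical to the server.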
Why this is a protocol, not a promise: the protocol's elegance is that the privacy doesn't depend on trusting the server. Even a fully malicious server logging every request cannot recover the password from a prefix shared by hundreds of candidate hashes.
This is the same family of reasoning behind private information retrieval and differential privacy: structure the interaction so that the information you want to protect is mathematically unavailable to the other party, rather than merely promised to be unused.
The Limits to Keep in Mind
It's a strong design, but it answers exactly one question and no more:
- It tells you a password has appeared in breaches, not that your account leaked. A match means the string has been exposed somewhere, by anyone; it isn't proof your specific account was compromised.
- A "not found" is not a clean bill of health. It only means the password isn't in this corpus. It could still be weak, guessable, or breached in a dataset the service doesn't have.
- You must trust the client. The privacy guarantee holds only if your password manager or browser implements the range query correctly and doesn't send the full hash. Reputable implementations do; verify before trusting a random website.
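One way to approach the last point when auditing a client is to check what it actually puts on the wire. The sketch below is a hypothetical helper, not part of any particular tool. It accepts a value only if it is exactly five hex characters, so a full 40-character hash or a raw password would be rejected.

```python
import re

# Exactly five hexadecimal characters, nothing more.
_PREFIX_RE = re.compile(r"[0-9A-Fa-f]{5}")

def is_safe_range_query(value: str) -> bool:
    """Return True only if the outgoing value looks like a 5-character hash prefix."""
    return _PREFIX_RE.fullmatch(value) is not None

print(is_safe_range_query("5BAA6"))     # True
print(is_safe_range_query("password"))  # False
```

A check like this only confirms the shape of the query. It does not prove the client avoids sending the password through some other channel.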
The right response to a match is not panic but rotation: change that password everywhere it's used, and stop reusing it. Better still, never reuse passwords in the first place, so a single breach can never cascade. This is the core defense against account takeover attacks, and a good password manager makes it effortless by generating a unique random password per site.
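As a rough illustration of per-site random generation, the sketch below uses Python's standard `secrets` module, which draws from the operating system's cryptographic random source. The function name, alphabet, and default length are assumptions made for this example; real password managers apply their own policies.

```python
import secrets
import string

# Assumed alphabet: letters, digits, and ASCII punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def generate_password(length: int = 20) -> str:
    """Return a random password drawn uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

# A fresh, independent password for each site.
site_passwords = {site: generate_password() for site in ("example.com", "example.org")}
```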
Where Haven Fits
We like the k-anonymity range query because it embodies a principle Haven is built around: don't ask users to trust that a service won't misuse their secrets — design the system so the service never holds the secret to begin with. A breach checker that can't learn your password is the same instinct as an encrypted messenger that can't read your messages.
Haven derives your encryption keys on your own device, so your passphrase never travels to our servers — not even as a hash of itself. Whether it's checking a leaked password or sending a private message, the strongest guarantee is the one that doesn't depend on anyone's good behavior.