CRIME and BREACH: How Compression Leaks Encrypted Secrets

Encryption hides what your data says. It does not hide how long it is. CRIME and BREACH are two closely related attacks that turn that one unhidden fact — length — into a key that unlocks the secret. By inducing a victim's browser to send guesses and watching the compressed size of the encrypted response wobble, an attacker can pull a session cookie or a CSRF token out of an HTTPS connection one byte at a time, without ever breaking the cipher.

Most attacks on encryption try to break the math: factor the key, exploit a flawed cipher, find a padding oracle. CRIME and BREACH do something more unsettling. They leave the cryptography completely intact and attack a property the cryptography was never designed to hide: the length of the ciphertext. The cipher is perfect. The length is the leak.

To see how that's possible, you need one fact about compression. Compression algorithms like DEFLATE — the engine behind gzip — shrink data by replacing repeated strings with short back-references. The more a chunk of text repeats something that appeared earlier, the smaller the compressed output. That is the whole vulnerability. Redundancy makes things smaller, and an attacker can manufacture redundancy on purpose.
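To make that concrete, here is a minimal Python sketch using the standard zlib module, which implements DEFLATE. The sample strings are invented for illustration. The only point is the gap in compressed size between redundant and non-redundant input of the same length.

```python
import os
import zlib

# Hypothetical sample data: one highly repetitive input, one with no structure.
repetitive = b"session=abc123; " * 16         # 256 bytes, the same 16 bytes repeated
unstructured = os.urandom(len(repetitive))    # 256 random bytes, nothing to back-reference

len_repetitive = len(zlib.compress(repetitive))
len_unstructured = len(zlib.compress(unstructured))

# The repetitive input collapses to a small fraction of its size;
# the random input does not shrink at all.
print(len(repetitive), len_repetitive, len_unstructured)
```

Both inputs are the same length going in. Only the amount of repetition differs, and that alone decides the output size.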

The Core Trick: Inject a Guess, Measure the Shrink

Imagine an encrypted request that contains a secret you cannot see — say a session cookie, secret=A7F2… — and also contains some text the attacker controls. Now suppose the attacker can make the victim's browser include a guess in that controlled text. If the guess matches the start of the secret, the compressor notices the repetition and the compressed output gets a little smaller. If the guess is wrong, no extra repetition, no extra shrink.

The attacker cannot read the encrypted bytes, but they can measure their length on the wire. Guess secret=A: does the message shrink? Try secret=B, secret=C, and so on. The guess that produces the smallest output is the likely match. In practice the signal is noisy, because DEFLATE's Huffman coding can hide or mimic a one-byte difference, so real attacks repeat and cross-check measurements before accepting a character. Recover one character, then move to the next. Byte by byte, a secret hidden inside a perfectly encrypted channel falls out.
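The guessing loop can be sketched as a small simulation. This is a toy under stated assumptions, not a working exploit: the cookie name sid, the secret value, the hex alphabet, and the request layout are all hypothetical, and the "victim" is just a local function that returns a compressed length. The compressor is also forced to fixed Huffman codes (zlib.Z_FIXED) and the secret is kept short so that each correct guess saves exactly one byte. Real traffic uses dynamic Huffman codes and is considerably noisier.

```python
import zlib

SECRET = b"7f3a9c"               # hypothetical cookie value; the attacker never reads it
ALPHABET = b"0123456789abcdef"   # assumption: the secret is lowercase hex

def observed_length(injected: bytes) -> int:
    """Stand-in for the victim: compress attacker text together with the
    secret cookie and reveal only the resulting length."""
    request = b"GET /" + injected + b" HTTP/1.1\r\nCookie: sid=" + SECRET + b"\r\n"
    # Z_FIXED keeps this toy deterministic; it is a simplification.
    comp = zlib.compressobj(level=9, strategy=zlib.Z_FIXED)
    return len(comp.compress(request) + comp.flush())

def recover(secret_len: int) -> bytes:
    """Recover the secret one character at a time using only lengths."""
    known = b""
    for _ in range(secret_len):
        # Try every candidate for the next character and record the size.
        sizes = {c: observed_length(b"sid=" + known + bytes([c])) for c in ALPHABET}
        # The correct candidate extends the back-reference, so it is smallest.
        best = min(sizes, key=sizes.get)
        known += bytes([best])
    return known

recovered = recover(len(SECRET))
print(recovered)
```

Note that recover never touches SECRET directly. Everything it learns comes from the integer returned by observed_length, which is the same information an eavesdropper gets from record sizes.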

The one-sentence version

When you compress attacker-influenced data together with a secret in the same context, the compressed length depends on how well the attacker's guess matches the secret — and length is visible even when content is encrypted.

CRIME: Attacking TLS Compression

CRIME ("Compression Ratio Info-leak Made Easy") was demonstrated by Juliano Rizzo and Thai Duong in 2012. It targeted compression applied at the TLS layer itself. When TLS compression was enabled, the entire request, including secret cookies, was compressed before encryption. The attacker needs two capabilities: a way to make the victim's browser send requests with chosen content (for example, JavaScript on a page the victim visits) and a position on the network from which to observe the size of the encrypted records. With both, the byte-by-byte guessing attack runs against the victim's cookies.

CRIME's mitigation was decisive and total: disable TLS-level compression. Browsers and servers turned it off across the board, and TLS 1.3 removed compression from the protocol entirely. As a transport-layer problem, CRIME is effectively closed.

BREACH: The Same Idea, One Layer Up

BREACH, presented by Yoel Gluck, Neal Harris, and Angelo Prado in 2013, is the harder sibling. Instead of TLS compression, it exploits HTTP response compression — the gzip that web servers apply to HTML responses for performance. This compression happens at the application layer, above TLS, and turning it off entirely would cripple web performance, so the easy CRIME fix does not apply.

BREACH works when two conditions hold in a single HTTP response: the response reflects some attacker-supplied input (for example, a search term echoed back in the page) and the response contains a secret (for example, a CSRF token embedded in the HTML). Because both sit in the same gzip-compressed body, the attacker can inject guesses through the reflected input and watch the compressed response size to extract the secret token, character by character, using the identical principle as CRIME.
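A single step of that measurement can be sketched in Python. The page template, the id=csrf field, and the token value are hypothetical, and the compressor is again pinned to fixed Huffman codes (zlib.Z_FIXED) so the one-byte difference is reproducible. Setting wbits=31 asks zlib for a gzip container, standing in for what a web server would send.

```python
import zlib

CSRF_TOKEN = b"9d2e41"  # hypothetical token embedded in every page

def gzip_response_size(search_term: bytes) -> int:
    """Build a page that reflects the search term and embeds the token,
    then return only the size of the gzip-compressed body."""
    body = (b"<p>Search: " + search_term + b"</p>"
            b"<input id=csrf value=" + CSRF_TOKEN + b">")
    # wbits=31 selects gzip framing; Z_FIXED is a simplification for the demo.
    gz = zlib.compressobj(level=9, wbits=31, strategy=zlib.Z_FIXED)
    return len(gz.compress(body) + gz.flush())

# The attacker reflects "value=" plus a guess for the first token character.
right = gzip_response_size(b"value=9")   # guess matches the token's first byte
wrong = gzip_response_size(b"value=x")   # guess does not match

print(right, wrong)
```

The correct guess lets the compressor extend its back-reference by one byte, so that response comes out smaller than the one carrying the wrong guess.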

CRIME and BREACH share one DNA strand: secret and attacker-controlled data, compressed together, in a channel where only length is observable. Change the layer and the same attack reappears.

Why BREACH Is Hard to Kill

You cannot simply disable HTTP compression without paying a real performance tax, so defending against BREACH means breaking one of the conditions it depends on, usually in combination:

Separate secrets from reflected input: don't compress attacker-controllable content in the same response as a secret token.
Mask the token per request: XOR the CSRF token with a fresh random value on every page load and send that random value alongside the result, so the server can recover the token but the bytes on the page never repeat in a guessable way. This is the most common practical defense.
Add length randomization: pad responses with a random amount of data so size measurements become noisy. This raises the attacker's cost rather than eliminating the leak, because averaging over many requests filters the noise back out.
Rate-limit and monitor: the attack needs many requests to converge; throttling and anomaly detection make it slower and louder.
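Per-request token masking, the second item above, can be sketched as follows. This is an illustration under assumptions rather than any particular framework's implementation: the function names and the wire format (random pad followed by the XOR result, base64-encoded) are made up for the example.

```python
import base64
import hmac
import secrets

def mask_token(token: bytes) -> str:
    """Return a fresh, random-looking encoding of the same token."""
    pad = secrets.token_bytes(len(token))               # new random pad each call
    masked = bytes(t ^ p for t, p in zip(token, pad))   # XOR token with the pad
    # Ship the pad with the masked value so the server can undo the XOR.
    return base64.urlsafe_b64encode(pad + masked).decode("ascii")

def unmask_token(wire: str) -> bytes:
    """Recover the original token from a masked value."""
    raw = base64.urlsafe_b64decode(wire)
    half = len(raw) // 2
    pad, masked = raw[:half], raw[half:]
    return bytes(m ^ p for m, p in zip(masked, pad))

def verify(wire: str, expected: bytes) -> bool:
    """Constant-time comparison of the unmasked value against the real token."""
    return hmac.compare_digest(unmask_token(wire), expected)

token = secrets.token_bytes(16)   # hypothetical per-session CSRF token
first = mask_token(token)
second = mask_token(token)
print(first, second, verify(first, token))
```

Because every page load carries different bytes, a guess that matched part of one response tells the attacker nothing about the next, which removes the stable target the length oracle needs.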

None of these is a clean, universal "turn it off" fix like CRIME's. BREACH is mitigated in layers, and a careless application can reintroduce it.

How the Two Compare

Aspect	CRIME (2012)	BREACH (2013)
Compression exploited	TLS-layer compression	HTTP response (gzip) compression
Target secret	Request data, e.g. session cookies	Response data, e.g. CSRF tokens
Clean fix exists?	Yes — disable TLS compression	No single fix; mitigated in layers
Status today	Effectively closed	Still requires app-level care

The Lesson: Length Is Metadata

The enduring lesson of CRIME and BREACH is that encryption protects content, not shape. The size, timing, and frequency of encrypted messages are metadata, and metadata leaks. This is the same family of concern behind traffic-analysis attacks: an adversary who cannot read your messages can still learn a great deal from their dimensions. Compression makes those dimensions depend on content, which is exactly the bridge these attacks walk across.

It is also why modern protocol design treats compression of mixed secret-and-attacker data as a hazard to be reasoned about explicitly. HTTP/2's HPACK and HTTP/3's QPACK header compression, for instance, were designed with these attacks in mind: they avoid general-purpose DEFLATE in favor of table-based indexing with a static Huffman code, and they let a sender mark sensitive fields as never-indexed, so the CRIME pattern cannot trivially recur in header fields.

What This Means for Private Communication

For end-to-end encrypted messaging, the takeaway is that protecting message content is necessary but not sufficient — a serious threat model also has to account for what message sizes reveal. This is why metadata-minimizing design, padding strategies, and careful handling of anything that mixes user-controlled input with secrets matter alongside the core encryption.
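One simple padding strategy is to round every message up to a fixed bucket size before encrypting it, so that an observer sees only a handful of distinct lengths. The sketch below is a generic illustration with a hypothetical 256-byte bucket and a made-up framing (a 4-byte length prefix followed by zero fill). It does not describe any specific product's wire format.

```python
BUCKET = 256  # hypothetical bucket size in bytes

def pad_to_bucket(message: bytes, bucket: int = BUCKET) -> bytes:
    """Prefix the true length, then zero-fill to the next multiple of bucket.
    Apply this before encryption so the ciphertext length reveals only the bucket."""
    framed = len(message).to_bytes(4, "big") + message
    padded_len = -(-len(framed) // bucket) * bucket   # ceiling to a bucket multiple
    return framed.ljust(padded_len, b"\x00")

def unpad(padded: bytes) -> bytes:
    """Read the length prefix and return the original message."""
    n = int.from_bytes(padded[:4], "big")
    return padded[4:4 + n]

short = pad_to_bucket(b"hi")
longer = pad_to_bucket(b"a somewhat longer message than the first one")
print(len(short), len(longer))
```

The trade-off is bandwidth: larger buckets hide more but waste more, and the bucket itself still leaks a coarse size class.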

Haven's posture is to keep message content encrypted under keys that never leave your device, and to treat metadata exposure as a first-class design concern rather than an afterthought. CRIME and BREACH are a permanent reminder that an attacker who can only see the outline of your traffic is still an attacker — and that the gap between "can't read it" and "can't learn from it" is where a lot of real security work lives.