Traffic Analysis: The Threat That Encryption Can't Stop

Strong encryption protects the content of your communications. But the patterns of those communications — who you contact, how often, when, and for how long — remain visible to anyone who can observe the network. Traffic analysis extracts meaning from those patterns without breaking a single cipher.

There's a persistent and understandable misconception that encrypted traffic is private traffic. It isn't — not fully. Encryption is a powerful tool that makes the contents of communication unreadable to anyone without the appropriate keys. But the envelope remains visible even when the letter inside is sealed. Traffic analysis is the discipline of extracting information from the envelope.

What an Observer Can See

Consider what an adversary positioned on your network — your ISP, a surveillance system monitoring backbone traffic, or someone on the same Wi-Fi network — can observe when you use an end-to-end encrypted messaging app:

IP addresses of both endpoints (your device and the server)
Timing of every packet — when messages are sent and received, how quickly responses arrive
Packet sizes — a short message and a long message produce different-sized packets even if both are encrypted
Traffic volume — a voice call, a video call, a file transfer, and a text message all have distinct bandwidth fingerprints
Connection frequency — how often your device contacts the messaging server

The metadata problem

Former NSA and CIA Director Michael Hayden has publicly stated: "We kill people based on metadata." Traffic metadata (who contacts whom, and when) has often been given weaker legal protection than content, yet it can reveal just as much about social networks, behaviors, and relationships.

Even when the server IP does not identify the app (for example, because the app uses a CDN or proxies through a third party), timing and packet-size correlations can often identify the application from the traffic pattern alone. This is called traffic classification, and it's well-studied in the academic literature.
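
As a rough illustration of traffic classification, the sketch below summarizes a flow using only packet timestamps and sizes, then applies a toy rule-based classifier. The `Packet` record, the feature names, and every threshold are assumptions invented for this example; real classifiers are trained on captured traffic and use far richer features.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Packet:
    timestamp: float  # seconds since the start of the capture
    size: int         # bytes on the wire (payload is encrypted, size is not)


def flow_features(packets):
    """Summarize a flow using only metadata visible to a passive observer."""
    duration = packets[-1].timestamp - packets[0].timestamp
    total_bytes = sum(p.size for p in packets)
    return {
        "mean_size": mean(p.size for p in packets),
        "packets_per_sec": len(packets) / duration if duration > 0 else 0.0,
        "bytes_per_sec": total_bytes / duration if duration > 0 else 0.0,
    }


def classify(features):
    """Toy classifier; the thresholds are illustrative, not measured."""
    if features["packets_per_sec"] < 5:
        return "text messaging"
    if features["bytes_per_sec"] > 100_000:
        return "video call"
    return "voice call"


# Synthetic flows: a steady stream of small packets, and a few sparse packets.
voice = [Packet(i * 0.02, 160) for i in range(500)]
text = [Packet(t, 300) for t in (0.0, 4.2, 9.7, 31.0)]

print(classify(flow_features(voice)))
print(classify(flow_features(text)))
```

Nothing in this sketch reads packet contents; the decision rests entirely on rate and size.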

Timing Correlation Attacks

Timing correlation is a specific class of traffic analysis that requires an adversary who can observe traffic at two different points simultaneously — both near the sender and near the receiver. The attack works by correlating the timing of packets entering a network anonymization system with packets exiting it, matching flows based on timing patterns.

This is one of the best-known attacks against Tor, the most widely deployed anonymity network. Tor routes traffic through multiple relays so that no single relay knows both the source and the destination. But an adversary who controls the entry relay (or observes the user's ISP) and the exit relay (or observes the destination server) can correlate incoming and outgoing traffic by timing alone, without breaking any encryption.
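
The core idea can be sketched in a few lines, assuming the adversary already holds packet timestamps for one flow entering the network and for several candidate flows leaving it. The approach here (count packets per one-second window, then compare the count series with Pearson correlation) is one simple choice among many; the function names, the window size, and the synthetic flows are all assumptions for illustration.

```python
import random
from math import sqrt


def bin_counts(timestamps, window, n_bins):
    """Count packets falling in each fixed-width time window."""
    counts = [0] * n_bins
    for t in timestamps:
        idx = int(t / window)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def best_match(entry_ts, exit_flows, window=1.0, n_bins=60):
    """Return the exit flow whose timing pattern best matches the entry flow."""
    entry = bin_counts(entry_ts, window, n_bins)
    scores = {
        name: pearson(entry, bin_counts(ts, window, n_bins))
        for name, ts in exit_flows.items()
    }
    return max(scores, key=scores.get), scores


rng = random.Random(7)


def bursty_flow():
    """Synthetic flow: 12 bursts of 20 packets scattered over about a minute."""
    ts = []
    for _ in range(12):
        start = rng.uniform(0, 58)
        ts.extend(start + rng.uniform(0, 1.0) for _ in range(20))
    return sorted(ts)


entry = bursty_flow()
exit_flows = {
    "exit_a": bursty_flow(),
    # The same flow as `entry`, delayed slightly as if by relay hops.
    "exit_b": [t + 0.15 + rng.uniform(0, 0.05) for t in entry],
    "exit_c": bursty_flow(),
}

match, scores = best_match(entry, exit_flows)
print(match, scores)
```

The small random delay added to `exit_b` stands in for relay latency; because it is much shorter than the window, the burst pattern survives the trip and the flows remain linkable.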

Academic research from 2014 through the present has demonstrated that timing correlation can de-anonymize Tor users at meaningful success rates under realistic adversary models — particularly when the adversary is a large national surveillance system that can observe significant portions of internet backbone traffic. This doesn't make Tor useless; it makes Tor's anonymity guarantees conditional on the adversary's capabilities.

Packet Size and Website Fingerprinting

Even a single HTTPS connection — which encrypts all content — leaks information through packet sizes and the sequence of packets over time. Web pages load resources in a distinctive order and pattern. The combination of total bytes transferred, number of requests, timing of resource loads, and packet size distribution creates a fingerprint that identifies which page was loaded.

This is called website fingerprinting. Researchers have demonstrated that classifiers trained on traffic traces can identify which of a set of candidate websites was visited with high accuracy, even through a VPN or Tor. The attack requires the adversary to know which websites you might visit (or to catalog many possibilities), but for common sites the attack is practical.
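
A minimal sketch of the idea, assuming the adversary has already recorded labeled traces for a small set of candidate sites. Each trace is a list of signed packet sizes (positive for client to server, negative for server to client), and the classifier is a plain nearest-neighbor match on three coarse features. The site labels, traces, and feature choice are invented for illustration; published attacks use many more features and much larger training sets.

```python
from math import dist


def features(trace):
    """Reduce a trace of signed packet sizes to a small feature vector."""
    out_bytes = sum(s for s in trace if s > 0)
    in_bytes = -sum(s for s in trace if s < 0)
    # Scale bytes to kilobytes so no single feature dominates the distance.
    return (out_bytes / 1000, in_bytes / 1000, len(trace))


def identify(trace, labeled_traces):
    """Return the label of the closest training trace (1-nearest-neighbor)."""
    target = features(trace)
    label, _ = min(labeled_traces, key=lambda item: dist(target, features(item[1])))
    return label


# Hypothetical training traces collected by the adversary in advance.
training = [
    ("news-site", [500, -1400, -1400, -1400, 300, -1400, -900] * 40),
    ("search-page", [400, -1400, -600] * 10),
    ("video-site", [600, -1400, -1400, -1400, -1400] * 300),
]

# An observed encrypted page load that resembles, but does not equal, a training trace.
observed = [500, -1400, -1400, -1400, 300, -1400, -850] * 38

print(identify(observed, training))
```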

Defenses exist — padding packets to uniform sizes, introducing artificial timing delays to obscure real timing patterns, and multiplexing multiple requests over a single connection (HTTP/2 and HTTP/3 both help here). But these defenses impose performance costs, and few commercial applications implement them.
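
Padding is the simplest of these defenses to illustrate. The sketch below pads each message up to the next size in a fixed set of buckets before it would be encrypted, so an observer sees only the bucket, not the true length. The bucket sizes and the 4-byte length prefix are assumptions chosen for the example, not any particular application's wire format.

```python
BUCKETS = (256, 512, 1024, 2048, 4096)  # hypothetical bucket sizes in bytes


def padded_length(n):
    """Smallest bucket that fits n bytes; beyond the largest, round up to a multiple of it."""
    for bucket in BUCKETS:
        if n <= bucket:
            return bucket
    largest = BUCKETS[-1]
    return -(-n // largest) * largest  # ceiling division


def pad(message: bytes) -> bytes:
    """Prefix the true length, then zero-fill to the bucket size."""
    body = len(message).to_bytes(4, "big") + message
    return body + b"\x00" * (padded_length(len(body)) - len(body))


def unpad(padded: bytes) -> bytes:
    """Recover the original message using the length prefix."""
    n = int.from_bytes(padded[:4], "big")
    return padded[4:4 + n]


short = pad(b"ok")
longer = pad(b"see you at the usual place at nine")
print(len(short), len(longer))
```

Both messages come out at 256 bytes, so their sizes no longer distinguish them. The cost is visible too: a 2-byte message now occupies 256 bytes on the wire.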

Graph Analysis: What Patterns Reveal Over Time

Even if individual communications are anonymous, patterns over time can reconstruct social graphs with high accuracy. If an unknown phone communicates with Alice at 9pm, then with Bob at 9:05pm, then with Carol at 9:10pm — and Alice, Bob, and Carol are all known to be part of an organizing committee for a particular group — the unknown phone is likely associated with that group regardless of what was actually said.
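
The inference in that example can be written down directly. This sketch takes contact records of the form (party A, party B, time) and scores each unidentified party by the fraction of known group members it has been in contact with. The record format, the identifiers, and the scoring rule are assumptions for illustration; real graph analysis would also weigh timing, frequency, and indirect links.

```python
from collections import defaultdict


def group_affinity(records, known_members):
    """Score each non-member by the share of known members it has contacted."""
    contacts = defaultdict(set)
    for a, b, _timestamp in records:
        contacts[a].add(b)
        contacts[b].add(a)

    scores = {}
    for node, peers in contacts.items():
        if node in known_members:
            continue
        scores[node] = len(peers & known_members) / len(known_members)
    return scores


# Metadata only: who contacted whom, and when. No message content is involved.
records = [
    ("unknown-1", "alice", "21:00"),
    ("unknown-1", "bob", "21:05"),
    ("unknown-1", "carol", "21:10"),
    ("unknown-2", "alice", "14:30"),
    ("unknown-2", "dave", "15:00"),
]
committee = {"alice", "bob", "carol"}

scores = group_affinity(records, committee)
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 2))
```

Here `unknown-1` has contacted every committee member and scores 1.0, while `unknown-2` has contacted only one and scores about 0.33.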

This is metadata surveillance at scale, and it's operationally useful to adversaries even with strong content encryption in place. Call detail records (who called whom, when, and for how long) have been collected in bulk under NSA programs, in part because call content is too voluminous to analyze at that scale while the graph structure of the call metadata is highly informative on its own.

Attack Type	What It Reveals	Required Adversary Position
Timing correlation	Source-destination linkage through anonymizers	Observation at both endpoints of anonymization path
Website fingerprinting	Which websites were visited over HTTPS, a VPN, or Tor	Observation of encrypted traffic stream
Traffic classification	Which application is running (call vs. text vs. video)	Any network observer
Graph analysis	Social network structure from communication patterns	Aggregated connection metadata over time
Volume analysis	Whether large files are being transferred; the likely type of content	Any network observer

How Messaging Apps Address (or Don't Address) Traffic Analysis

Signal has invested more in traffic analysis resistance than most of its competitors. Sealed Sender, introduced in 2018, keeps the sender's identity out of the metadata attached to delivered messages: the server receives a message addressed to a recipient but cannot determine from the message who sent it without the recipient's cooperation. Signal also pads message sizes to fixed-size buckets to reduce packet-size leakage.

Tor provides strong anonymization but introduces latency that makes real-time communication (voice calls, live chat) awkward. Mix networks, which deliberately delay and reorder messages to defeat timing correlation, are theoretically stronger against traffic analysis than Tor, but their latency is too high for interactive communication. There's a fundamental tension between real-time communication and traffic analysis resistance that no current consumer product has fully resolved.
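
To make the latency trade-off concrete, here is a sketch of a threshold mix, one classic mix-network design: the node holds messages until a batch has accumulated, then releases the whole batch in random order. The class name and the threshold value are assumptions for illustration, and a real mix would also strip a layer of encryption from each message so that inputs and outputs cannot be matched by content.

```python
import random


class ThresholdMix:
    """Buffer messages and release them in shuffled batches."""

    def __init__(self, threshold, rng=None):
        self.threshold = threshold
        self.pool = []
        self.rng = rng or random.Random()

    def receive(self, message):
        """Accept one message; return a shuffled batch once the threshold is reached."""
        self.pool.append(message)
        if len(self.pool) < self.threshold:
            return []  # nothing leaves yet: this wait is the latency cost
        batch, self.pool = self.pool, []
        self.rng.shuffle(batch)
        return batch


mix = ThresholdMix(threshold=4, rng=random.Random(1))
outputs = [mix.receive(m) for m in ["m1", "m2", "m3", "m4"]]
print(outputs)
```

The first three calls return nothing; the fourth releases all four messages at once in shuffled order, so arrival time and arrival order no longer link an output to its input. The first message waited for three others, which is why this design suits email-like traffic far better than a voice call.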

The honest assessment: traffic analysis resistance is an active research area, not a solved problem. Any messaging app that claims complete protection from traffic analysis is overstating its guarantees. The question is how much resistance an app provides, and under which adversary models.

Practical Implications for Users

For most users, traffic analysis is not a practical threat, because the adversaries they face do not mount it. Some ISPs sell browsing data to data brokers; that's a legitimate privacy concern, but it's a commercial data aggregation problem, not a targeted traffic analysis attack. The adversaries capable of mounting serious traffic analysis attacks at scale are national intelligence agencies, a relevant threat model for journalists, activists, and whistleblowers, but not for the typical private user.

The relevant takeaways:

A VPN hides the sites you visit from your ISP but does not provide anonymity; the VPN provider sees the same metadata your ISP would have seen, and your ISP still sees the timing and volume of your traffic to the VPN.
Tor provides meaningful anonymity against non-state adversaries, but is not robust against nation-state timing correlation attacks.
End-to-end encryption remains essential — traffic analysis is harder and less informative when content is protected, even if patterns still leak.
Messaging apps that route all traffic through central servers create a single point where metadata is concentrated, regardless of content encryption.

Traffic analysis is not a reason to abandon encryption — it's a reason to be clear-eyed about what encryption does and doesn't protect.