There's a persistent and understandable misconception that encrypted traffic is private traffic. It isn't — not fully. Encryption is a powerful tool that makes the contents of communication unreadable to anyone without the appropriate keys. But the envelope remains visible even when the letter inside is sealed. Traffic analysis is the discipline of extracting information from the envelope.
What an Observer Can See
Consider what an adversary positioned on your network — your ISP, a surveillance system monitoring backbone traffic, or someone on the same Wi-Fi network — can observe when you use an end-to-end encrypted messaging app:
- IP addresses of both endpoints (your device and the server)
- Timing of every packet — when messages are sent and received, how quickly responses arrive
- Packet sizes — a short message and a long message produce different-sized packets even if both are encrypted
- Traffic volume — a voice call, a video call, a file transfer, and a text message all have distinct bandwidth fingerprints
- Connection frequency — how often your device contacts the messaging server
Former NSA Director Michael Hayden has publicly stated: "We kill people based on metadata." Traffic metadata (who contacts whom, and when) has generally received weaker legal protection than content, yet it can reveal comparable information about social networks, behaviors, and relationships.
Even when the server IP is uninformative (because the app uses a CDN or proxies through a third party), timing and packet size correlations can often identify the application from the traffic pattern alone. This is called traffic classification, and it is well studied in the academic literature.
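To make traffic classification concrete, here is a minimal sketch of a rule-based classifier that uses only flow metadata. The `Flow` fields, the category names, and every threshold are illustrative assumptions, not measured values; classifiers in the literature are trained on real traces rather than hand-tuned like this.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    """Metadata an on-path observer can record without decrypting anything."""
    duration_s: float   # seconds between first and last packet
    total_bytes: int    # sum of encrypted packet sizes
    packet_count: int   # number of packets in the flow


def classify(flow: Flow) -> str:
    """Guess the activity from bandwidth and packet size alone.

    All thresholds below are hypothetical, chosen only for illustration.
    """
    kbps = flow.total_bytes * 8 / 1000 / max(flow.duration_s, 0.001)
    mean_size = flow.total_bytes / max(flow.packet_count, 1)

    if flow.duration_s < 2 and flow.total_bytes < 5_000:
        return "text message"      # one short burst
    if kbps > 300:
        # Bulk transfers tend to fill packets; this split is an assumption.
        return "video call" if mean_size < 1_200 else "file transfer"
    if kbps > 15:
        return "voice call"        # steady, low bitrate
    return "idle / keepalive"


print(classify(Flow(duration_s=0.5, total_bytes=1_200, packet_count=4)))
print(classify(Flow(duration_s=60, total_bytes=300_000, packet_count=3_000)))
```

Nothing in `classify` touches packet contents: duration, byte totals, and packet counts are enough to separate the activities.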
Timing Correlation Attacks
Timing correlation is a specific class of traffic analysis that requires an adversary who can observe traffic at two different points simultaneously — both near the sender and near the receiver. The attack works by correlating the timing of packets entering a network anonymization system with packets exiting it, matching flows based on timing patterns.
This is the primary known attack vector against Tor, the most widely deployed anonymity network. Tor routes traffic through multiple relays to prevent any single relay from knowing both the source and destination. But an adversary who controls the entry relay (or observes the user's ISP) and the exit relay (or observes the destination server) can correlate incoming and outgoing traffic by timing alone, without breaking any encryption.
Academic research from 2014 through the present has demonstrated that timing correlation can de-anonymize Tor users at meaningful success rates under realistic adversary models — particularly when the adversary is a large national surveillance system that can observe significant portions of internet backbone traffic. This doesn't make Tor useless; it makes Tor's anonymity guarantees conditional on the adversary's capabilities.
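The core of a timing correlation attack fits in a few lines. The sketch below is a toy model on synthetic data, not a reproduction of any published attack: it bins packet timestamps into one-second windows and uses Pearson correlation to match a flow seen entering the network against candidate flows seen leaving it. The flow names, the burst model, and the delay and jitter values are all assumptions.

```python
import math
import random


def bin_counts(timestamps, window=1.0, duration=60.0):
    """Count packets per time window; the resulting vector is the timing pattern."""
    bins = [0] * int(duration / window)
    for t in timestamps:
        idx = int(t / window)
        if 0 <= idx < len(bins):
            bins[idx] += 1
    return bins


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def best_match(entry_flow, exit_flows):
    """Return the exit flow whose timing pattern best matches the entry flow."""
    entry = bin_counts(entry_flow)
    scores = {name: pearson(entry, bin_counts(ts)) for name, ts in exit_flows.items()}
    return max(scores, key=scores.get), scores


def synthetic_flow(rng, bursts=12, duration=60.0):
    """Bursty traffic: a hypothetical stand-in for page loads or chat activity."""
    ts = []
    for _ in range(bursts):
        start = rng.uniform(0, duration - 2)
        ts.extend(start + rng.uniform(0, 1.0) for _ in range(rng.randint(20, 60)))
    return sorted(ts)


rng = random.Random(7)
user_entry = synthetic_flow(rng)
# The same packets seen at the exit: delayed and slightly jittered, still encrypted.
user_exit = [t + 0.2 + rng.uniform(0, 0.05) for t in user_entry]
exit_flows = {
    "exit-A": synthetic_flow(rng),
    "exit-B": user_exit,
    "exit-C": synthetic_flow(rng),
}
winner, scores = best_match(user_entry, exit_flows)
print(winner, {name: round(score, 2) for name, score in scores.items()})
```

The adversary never decrypts a packet. A fixed delay barely changes which window each packet falls into, which is why low-latency designs are exposed to this attack.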
Packet Size and Website Fingerprinting
Even a single HTTPS connection — which encrypts all content — leaks information through packet sizes and the sequence of packets over time. Web pages load resources in a distinctive order and pattern. The combination of total bytes transferred, number of requests, timing of resource loads, and packet size distribution creates a fingerprint that identifies which page was loaded.
This is called website fingerprinting. Researchers have demonstrated that classifiers trained on traffic traces can identify which of a set of candidate websites was visited with high accuracy, even through a VPN or Tor. The attack requires the adversary to know which websites you might visit (or to catalog many possibilities), but for common sites the attack is practical.
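A minimal sketch of the idea, assuming traces are recorded as lists of signed packet sizes (positive for client-to-server, negative for server-to-client). The site names and traces are made up, and the three features and nearest-neighbor matching are a deliberate simplification of the richer features and classifiers used in published work.

```python
import math


def features(trace):
    """Summarize a trace as (bytes sent, bytes received, packet count)."""
    up = sum(size for size in trace if size > 0)
    down = -sum(size for size in trace if size < 0)
    return (up, down, len(trace))


def distance(a, b):
    """Relative difference per feature, so byte totals don't drown out packet counts."""
    return math.sqrt(sum(((x - y) / max(x, y, 1)) ** 2 for x, y in zip(a, b)))


def identify(unknown_trace, labelled_traces):
    """Return the label of the training trace closest to the observed one."""
    target = features(unknown_trace)
    label, _ = min(labelled_traces, key=lambda item: distance(target, features(item[1])))
    return label


# Hypothetical training traces the adversary collected by visiting each site.
training = [
    ("news-site.example", [+500] * 40 + [-1400] * 900),
    ("search.example", [+400] * 10 + [-1400] * 60),
    ("webmail.example", [+600] * 25 + [-1400] * 300),
]

# A later visit by the target: similar to one training trace, but not identical.
observed = [+500] * 38 + [-1400] * 870
print(identify(observed, training))
```

The adversary has to collect `training` in advance, which is the cataloging cost described above.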
Defenses exist: padding packets to uniform sizes, introducing artificial delays to obscure real timing patterns, and multiplexing requests over a single connection (HTTP/2 and HTTP/3 both help here). But padding and added delay cost bandwidth and latency, and few commercial applications deploy them deliberately as traffic analysis defenses.
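Size padding is the simplest of these defenses to illustrate. The sketch below pads each message up to the next bucket size before it is encrypted, so the ciphertext length reveals only the bucket. The bucket sizes and the length-prefix framing are assumptions made for this example; they are not the scheme used by any particular application.

```python
BUCKETS = (256, 1024, 4096, 16384)  # hypothetical bucket sizes, in bytes
LENGTH_PREFIX = 4                   # bytes used to record the true length


def pad_to_bucket(plaintext: bytes) -> bytes:
    """Pad a message to the smallest bucket that fits it (apply before encrypting)."""
    needed = LENGTH_PREFIX + len(plaintext)
    for size in BUCKETS:
        if needed <= size:
            header = len(plaintext).to_bytes(LENGTH_PREFIX, "big")
            return header + plaintext + b"\x00" * (size - needed)
    raise ValueError("message exceeds the largest bucket; split it first")


def unpad(padded: bytes) -> bytes:
    """Recover the original message using the length prefix (apply after decrypting)."""
    length = int.from_bytes(padded[:LENGTH_PREFIX], "big")
    return padded[LENGTH_PREFIX:LENGTH_PREFIX + length]


short = pad_to_bucket(b"ok")
longer = pad_to_bucket(b"x" * 200)
print(len(short), len(longer))  # both land in the same bucket
```

A two-byte reply and a 200-byte message become indistinguishable by size, at the price of sending 256 bytes for each. Fewer, larger buckets leak less and waste more bandwidth.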
Graph Analysis: What Patterns Reveal Over Time
Even if individual communications are anonymous, patterns over time can reconstruct social graphs with high accuracy. If an unknown phone communicates with Alice at 9pm, then with Bob at 9:05pm, then with Carol at 9:10pm — and Alice, Bob, and Carol are all known to be part of an organizing committee for a particular group — the unknown phone is likely associated with that group regardless of what was actually said.
This is metadata surveillance at scale, and it's operationally useful to adversaries even with strong content encryption in place. Call detail records (who called whom, when, and for how long) have been collected in bulk under NSA programs; the graph structure of call metadata is highly informative and far easier to analyze at scale than call content.
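The committee example can be sketched directly. The records, device names, and the overlap score below are hypothetical and chosen only for illustration; real graph analysis works on far larger datasets with more careful statistics.

```python
from collections import defaultdict

# Hypothetical connection records: (party A, party B, time). No content is included.
records = [
    ("unknown-phone", "alice", "21:00"),
    ("unknown-phone", "bob", "21:05"),
    ("unknown-phone", "carol", "21:10"),
    ("other-phone", "alice", "14:30"),
    ("other-phone", "pizza-shop", "19:00"),
]


def build_contact_graph(rows):
    """Map each party to the set of parties it has communicated with."""
    graph = defaultdict(set)
    for a, b, _time in rows:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def group_overlap(device, group, graph):
    """Fraction of a known group's members that the device has contacted."""
    return len(graph[device] & group) / len(group)


committee = {"alice", "bob", "carol"}
graph = build_contact_graph(records)
for device in ("unknown-phone", "other-phone"):
    print(device, round(group_overlap(device, committee, graph), 2))
```

The unknown phone scores 1.0 against the committee and the other phone scores about 0.33, a distinction drawn entirely from who contacted whom.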
| Attack Type | What It Reveals | Required Adversary Position |
|---|---|---|
| Timing correlation | Source-destination linkage through anonymizers | Observation at both ends of the anonymization path |
| Website fingerprinting | Which websites were visited over HTTPS, a VPN, or Tor | Observation of the encrypted traffic stream |
| Traffic classification | Which application is running (call vs. text vs. video) | Any network observer |
| Graph analysis | Social network structure from communication patterns | Aggregated connection metadata over time |
| Volume analysis | Whether large files are being transferred; likely content type | Any network observer |
How Messaging Apps Address (or Don't Address) Traffic Analysis
Signal has invested more in traffic analysis resistance than most of its competitors. Sealed Sender, introduced in 2018, hides the sender's identity in the metadata of delivered messages: the server receives a message addressed to a recipient but cannot determine who sent it without the recipient's cooperation. Signal also pads message sizes to fixed-size buckets to reduce packet-size leakage.
Tor provides strong anonymization but introduces latency that makes real-time communication (voice calls, live chat) awkward. Anonymizing networks that introduce sufficient latency to defeat timing correlation — called mix networks — are theoretically stronger against traffic analysis than Tor, but their latency is too high for interactive communication. There's a fundamental tension between real-time communication and traffic analysis resistance that no current consumer product has fully resolved.
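The latency trade-off is easiest to see in a toy threshold mix, one classic mix design. This is a minimal sketch under simplifying assumptions: the class name and interface are hypothetical, and real mix networks add layered encryption, cover traffic, and chains of several mixes.

```python
import random


class ThresholdMix:
    """Toy mix node: hold messages until `threshold` arrive, then release them shuffled."""

    def __init__(self, threshold, rng=None):
        self.threshold = threshold
        self.pool = []
        # A real mix would use a cryptographically secure source of randomness.
        self.rng = rng or random.Random()

    def receive(self, message):
        """Accept one message; return the flushed batch, or [] while still collecting."""
        self.pool.append(message)
        if len(self.pool) < self.threshold:
            return []
        batch, self.pool = self.pool, []
        self.rng.shuffle(batch)  # output order no longer reflects arrival order
        return batch


mix = ThresholdMix(threshold=3)
print(mix.receive("from-alice"))  # [] : held back
print(mix.receive("from-bob"))    # [] : held back
print(mix.receive("from-carol"))  # all three released together, in random order
```

The first message waits until the third arrives. That waiting is what destroys the timing pattern a correlation attack relies on, and it is also why this design is a poor fit for voice calls or live chat.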
The honest assessment: traffic analysis resistance is an active research area, not a solved problem. Any messaging app that claims complete protection from traffic analysis is overstating its guarantees. The question is how much resistance a given app provides, and under which adversary models.
Practical Implications for Users
For most users, traffic analysis is not a practical threat from the adversaries they face. Some ISPs sell browsing data to data brokers; that's a legitimate privacy concern, but it's a commercial data aggregation problem, not a targeted traffic analysis attack. The adversaries capable of mounting serious traffic analysis attacks at scale are national intelligence agencies: a relevant threat model for journalists, activists, and whistleblowers, but not for the typical private user.
The relevant takeaways:
- A VPN hides your destinations from your ISP but does not provide anonymity; the VPN provider sees the metadata your ISP would otherwise have seen, and the ISP can still observe the timing and volume of your VPN traffic.
- Tor provides meaningful anonymity against non-state adversaries, but is not robust against nation-state timing correlation attacks.
- End-to-end encryption remains essential — traffic analysis is harder and less informative when content is protected, even if patterns still leak.
- Messaging apps that route all traffic through central servers create a single point where metadata is concentrated, regardless of content encryption.
Traffic analysis is not a reason to abandon encryption — it's a reason to be clear-eyed about what encryption does and doesn't protect.