OMEMO: How XMPP Got Modern End-to-End Encryption

For years, encrypting an XMPP chat meant choosing between OpenPGP (no forward secrecy, painful key management) and OTR (forward-secret, but single-device and no offline delivery). OMEMO ended that compromise by porting Signal's Double Ratchet onto a federated, multi-device protocol. Here's how it works — and where it still falls short.

XMPP — the open instant-messaging standard once branded Jabber — has carried messages since 1999. It is federated like email: anyone can run a server, and accounts on different servers talk to each other. That openness is its strength and, for encryption, its historical headache. Federation means there is no single vendor to mandate a crypto scheme, no central key directory, and no guarantee that the person you are messaging runs the same client you do.

OMEMO, a recursive-ish acronym for "OMEMO Multi-End Message and Object Encryption" specified as XEP-0384 by the XMPP Standards Foundation, was the proposal that finally gave XMPP the kind of encryption people actually wanted to use. It was introduced in 2015 by a then-teenage developer, Andreas Straub, as a Google Summer of Code project, and it works by reusing the cryptographic core of the Signal Protocol.

The Problem OMEMO Solved

Before OMEMO, the two real options for encrypted XMPP both had structural flaws:

OpenPGP (XEP-0027) encrypts each message to a long-lived public key. If that key is ever compromised, every past message it protected can be decrypted. There is no forward secrecy, and key discovery is manual.
OTR (Off-the-Record Messaging) added forward secrecy and deniability, but it was designed for a single live session between two devices. It could not deliver to an offline contact, and it broke down the moment you logged in from a second device, even though multi-device use was a near-universal expectation by the mid-2010s.

OMEMO targeted exactly those gaps: forward secrecy, offline delivery, and multi-device support, all on a federated network, with a usability bar low enough for mainstream clients.

The Double Ratchet, Borrowed Wholesale

OMEMO's confidentiality guarantees come from the same machinery that protects Signal and WhatsApp messages: the Double Ratchet algorithm, paired with an X3DH-style key agreement (the original OMEMO used the older Axolotl naming).

The short version: each pair of devices establishes a shared secret using a mix of long-term identity keys, medium-term signed prekeys, and one-time prekeys. From that root secret, the Double Ratchet derives a fresh message key for every single message, advancing — "ratcheting" — the key material forward with each send and each Diffie-Hellman step. The properties that buys you:

Forward secrecy: compromising today's keys does not expose yesterday's messages.
Post-compromise security (self-healing): if an attacker captures session keys but then loses access to the device, the next DH ratchet step mixes in fresh key material and locks them back out.
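
To make the "fresh key for every message" idea concrete, here is a minimal sketch of only the symmetric half of the ratchet, written as a KDF chain. It assumes HMAC-SHA-256 with the single-byte constants 0x01 and 0x02, which is the construction Signal's Double Ratchet specification recommends; the function names are hypothetical, and the Diffie-Hellman ratchet that provides post-compromise security is left out because it needs an elliptic-curve library.

```python
import hashlib
import hmac


def kdf_chain_step(chain_key: bytes) -> tuple[bytes, bytes]:
    """Advance the chain one step and return (next_chain_key, message_key)."""
    # Two different constants give two independent outputs from one chain key.
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


def derive_message_keys(initial_chain_key: bytes, count: int) -> list[bytes]:
    """Derive `count` consecutive message keys, ratcheting after each one."""
    keys = []
    chain_key = initial_chain_key
    for _ in range(count):
        chain_key, message_key = kdf_chain_step(chain_key)
        keys.append(message_key)
    return keys


if __name__ == "__main__":
    # Placeholder secret; in a real session this comes from the key agreement.
    start = hashlib.sha256(b"example shared secret").digest()
    for index, key in enumerate(derive_message_keys(start, 3)):
        print(index, key.hex()[:16])
```

Because HMAC is one-way, a device that discards old chain keys cannot recompute earlier message keys, which is the forward-secrecy property in miniature.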

Why reuse, not reinvent

OMEMO deliberately did not design new cryptography. By adopting the Double Ratchet — already analyzed, deployed at scale, and well understood — it inherited a vetted security argument instead of asking the world to trust a brand-new construction. This is sound engineering: novel protocols are where subtle breaks hide.

Multi-Device Without a Central Server

The cleverest part of OMEMO is how it handles multiple devices on a federated network. Every device a user logs in from generates its own identity key and publishes a device bundle — its public identity key, a signed prekey, and a batch of one-time prekeys — to an XMPP PEP node (Personal Eventing Protocol, a per-user pub/sub store on their server).

When you send a message, your client fetches the device list of every participant, including your own other devices. It encrypts the message payload once with a fresh symmetric key, then encrypts that key separately for each recipient device using the OMEMO session it shares with that device. So if you have two devices and message a contact with a phone and a laptop, the body is encrypted once and the key is wrapped three times: once for each of the contact's devices and once for your other device. This is the standard "encrypt-to-many" envelope pattern, and it is why adding a device is cheap.
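
The envelope structure can be sketched in a few lines. This is an illustration of the pattern only, not of OMEMO's wire format: `toy_xor_cipher` is a deliberately insecure stand-in for both the payload cipher and the per-device ratchet session, the static session keys stand in for keys that a real session would change on every message, and all names and device IDs are hypothetical.

```python
import hashlib
import secrets


def toy_xor_cipher(key: bytes, data: bytes) -> bytes:
    """Stand-in cipher (NOT secure): XOR with a SHA-256 counter keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    # zip stops at len(data), so any surplus keystream is ignored.
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_to_many(body: bytes, device_session_keys: dict[int, bytes]) -> dict:
    """Encrypt the body once, then wrap the payload key once per device."""
    payload_key = secrets.token_bytes(32)
    return {
        "payload": toy_xor_cipher(payload_key, body),
        "keys": {
            device_id: toy_xor_cipher(session_key, payload_key)
            for device_id, session_key in device_session_keys.items()
        },
    }


def decrypt_on_device(envelope: dict, device_id: int, session_key: bytes) -> bytes:
    """Unwrap this device's copy of the payload key, then decrypt the body."""
    payload_key = toy_xor_cipher(session_key, envelope["keys"][device_id])
    return toy_xor_cipher(payload_key, envelope["payload"])


if __name__ == "__main__":
    # Hypothetical IDs: the contact's phone and laptop, plus your own second device.
    sessions = {
        1001: secrets.token_bytes(32),
        1002: secrets.token_bytes(32),
        2002: secrets.token_bytes(32),
    }
    envelope = encrypt_to_many(b"hello", sessions)
    print(len(envelope["keys"]), "key wraps for one encrypted body")
    print(decrypt_on_device(envelope, 1002, sessions[1002]))
```

Adding a device adds one small wrapped key to the envelope; the encrypted body itself is never duplicated.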

The catch is the flip side of federation: there is no authority asserting which devices legitimately belong to a user. If an attacker who controls a server (or compels its operator) silently adds a rogue device to your published device list, your client may helpfully encrypt to it too — unless you verify fingerprints.

Trust, Verification, and the Blind-Trust Trade-off

Like every public-key system, OMEMO's security rests on verifying you have the right keys. Each device key has a fingerprint users can compare out-of-band — in person, over a verified channel, or via QR code in better clients.

In practice, most clients default to Blind Trust Before Verification (BTBV): they automatically trust all of a contact's keys, including keys for newly added devices, until you manually verify one, after which new unverified keys are flagged. This is a variant of the trust-on-first-use compromise that pervades usable cryptography: it trades a window of vulnerability for adoption, on the bet that most users are not under active attack. Unlike strict trust-on-first-use, that window stays open for every new device until you verify. It is a reasonable default, but it is only a default, and high-risk users should verify fingerprints explicitly.
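
The BTBV rule described above is small enough to write down. The sketch below is one possible reading of it, not any client's actual code: the `Trust` states and both function names are assumptions made for illustration.

```python
from enum import Enum


class Trust(Enum):
    BLIND = "blind"          # trusted automatically, never verified
    VERIFIED = "verified"    # fingerprint compared out-of-band
    UNTRUSTED = "untrusted"  # flagged; needs a manual decision


def trust_for_new_key(existing: dict[str, Trust]) -> Trust:
    """Decide how to treat a newly seen device key for one contact.

    `existing` maps that contact's known fingerprints to their trust state.
    """
    if any(state is Trust.VERIFIED for state in existing.values()):
        # The user has verified this contact before: stop trusting blindly.
        return Trust.UNTRUSTED
    return Trust.BLIND


def should_encrypt_to(state: Trust) -> bool:
    """Only blindly trusted or verified devices receive a wrapped key."""
    return state in (Trust.BLIND, Trust.VERIFIED)


if __name__ == "__main__":
    never_verified = {"fingerprint-a": Trust.BLIND}
    print(trust_for_new_key(never_verified))   # Trust.BLIND
    once_verified = {"fingerprint-a": Trust.VERIFIED}
    print(trust_for_new_key(once_verified))    # Trust.UNTRUSTED
```

The first branch is where verification pays off: once a single fingerprint for a contact is verified, a rogue device added to their list no longer receives messages silently.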

What OMEMO Does Not Encrypt

OMEMO protects message bodies and, in its current versions, file transfers and other stanza payloads. It does not hide metadata. Your XMPP server still sees who you talk to, when, how often, and from which devices. Federation makes this worse in one sense — your conversation's metadata is visible to both participants' servers.

Versions matter here, too. The original OMEMO encrypted only the message body, using AES-128 in GCM mode; the newer revision (the "OMEMO 2" line, XEP-0384 version 0.4+ with Stanza Content Encryption) moved to AES-256-CBC with an HMAC-SHA-256 authentication tag and can encrypt arbitrary stanza content rather than just the body. Two clients only get OMEMO's full guarantees if they implement compatible versions, an interoperability seam that federation makes unavoidable.

Property          OpenPGP/XMPP  OTR  OMEMO
Forward secrecy   No            Yes  Yes
Offline delivery  Yes           No   Yes
Multi-device      Manual        No   Yes
Hides metadata    No            No   No

OMEMO Versus the Group Problem

OMEMO handles groups by treating them as encrypt-to-many: a group message is encrypted to every member's every device. This works, but the per-message cost grows linearly with the total number of devices in the group, and it lacks the cryptographic group-state machinery of a protocol designed for groups from the start. The newer MLS standard (RFC 9420) uses a tree-based key schedule that makes membership changes efficient and gives the whole group consistent, forward-secret state, which is a structurally cleaner answer to large groups than fan-out encryption.
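
A rough back-of-the-envelope comparison shows why the two designs diverge as groups grow. This is a simplified cost model under stated assumptions, not a benchmark: it counts one key wrap per destination device for fan-out, and it uses the depth of a balanced binary tree with one leaf per device as a stand-in for the path a tree-based scheme touches on a membership change. Real MLS costs depend on the state of the tree, and the function names are hypothetical.

```python
def omemo_key_wraps(devices_per_member: list[int]) -> int:
    """Key wraps per message: every device in the group except the sending one."""
    return sum(devices_per_member) - 1


def balanced_tree_depth(leaf_count: int) -> int:
    """Depth of a balanced binary tree with `leaf_count` leaves (ceil(log2 n))."""
    return (leaf_count - 1).bit_length()


if __name__ == "__main__":
    for members in (2, 10, 100, 1000):
        devices = [2] * members  # assume two devices per member
        total = sum(devices)
        print(
            f"{members:>5} members: "
            f"{omemo_key_wraps(devices):>5} wraps per message, "
            f"tree depth {balanced_tree_depth(total)}"
        )
```

With two people and two devices each, the fan-out count is three, matching the one-to-one case; at a thousand members it approaches two thousand wraps per message, while the tree depth is still in the low double digits.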

OMEMO's achievement was never novelty — it was bringing a proven ratchet to an open, federated network that vendor-controlled apps had quietly left behind.

Where This Leaves You

OMEMO is a genuine success story: a volunteer-driven standard that gave a 25-year-old open protocol the same confidentiality guarantees as the big commercial messengers, without a central gatekeeper. If you value XMPP's federation and self-hostability, OMEMO is the right encryption layer to insist on, and verifying fingerprints closes its main practical gap.

But "the message body is encrypted" is only one axis of a threat model. Metadata exposure, group scaling, and version fragmentation are real. At Haven we made a different bet — a single identity spanning encrypted email and MLS-based group chat in one app — precisely so the protocol seams don't land on the user. OMEMO and Haven answer the same question with different priorities; the honest move is to match the tool to the threat model you actually have.