Format-Preserving Encryption: When Ciphertext Has to Look Like a Credit Card

Standard encryption turns a 16-digit card number into a block of random bytes that no longer fits the database column it came from. Format-preserving encryption keeps the shape: 16 digits in, 16 different digits out, still parseable, still the right length. It is a narrow tool built for a specific legacy problem, and understanding why it exists tells you a lot about how encryption meets old systems.

Imagine a decades-old payments database. The schema has a column defined as exactly 16 numeric digits for the card number, with downstream code that validates the length, runs a checksum, and prints the last four digits on receipts. Now you are told to encrypt that column. Reach for AES in the usual way and the output is at least 16 bytes of arbitrary binary, and more once padding, an IV, or a nonce and tag are stored with it. That does not fit a numeric-only 16-character field, breaks every validation rule, and forces a schema migration across systems nobody fully understands anymore.

Format-preserving encryption (FPE) exists for exactly this. It encrypts a value so the ciphertext stays in the same format as the plaintext. A 16-digit number becomes another 16-digit number. A US Social Security number becomes another nine-digit string. The encrypted value drops into the existing column without touching the schema, the validators, or the surrounding code.

What "Format" Means Here

"Format" really means an alphabet and a length. The input is a string over some alphabet, the digits 0 through 9, or the letters A through Z, or any defined set, and FPE produces an output of the same length over that same alphabet. Mathematically, FPE is a pseudorandom permutation over a finite set. If the plaintext is one of the ten billion possible 10-digit numbers, the ciphertext is some other 10-digit number, and the encryption is a reversible, key-dependent shuffle of that whole set.

The Core Idea

A normal block cipher is a permutation over all 128-bit blocks. FPE builds a permutation over a much smaller, oddly shaped set, like "all 16-digit decimal strings," using a normal cipher as the engine underneath. The hard part is doing that securely when the set is small and not a clean power of two.
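To make "a key-dependent shuffle of the whole set" concrete, here is a minimal sketch that builds the full permutation table for every 3-digit string. The function name `toy_permutation` and the integer key are illustrative assumptions, and Python's `random` module is not a cryptographic generator, so this shows only the shape of the idea, not how FF1 or FF3-1 work. Real FPE never materializes the table; it computes the mapping on demand.

```python
import random
from itertools import product


def toy_permutation(key: int, alphabet: str, length: int) -> dict[str, str]:
    """Return a key-dependent shuffle of every string of `length` over `alphabet`.

    Toy illustration only: random.Random is NOT cryptographically secure.
    """
    domain = ["".join(chars) for chars in product(alphabet, repeat=length)]
    shuffled = domain[:]
    random.Random(key).shuffle(shuffled)  # same key -> same shuffle
    return dict(zip(domain, shuffled))


# Encryption is a lookup in the table; decryption is the inverse lookup.
encrypt_table = toy_permutation(key=2024, alphabet="0123456789", length=3)
decrypt_table = {cipher: plain for plain, cipher in encrypt_table.items()}

ciphertext = encrypt_table["042"]
print(ciphertext, decrypt_table[ciphertext])
```

The table approach only works because the domain here has 1,000 entries. For 16-digit strings the table would have ten quadrillion rows, which is why practical FPE needs a construction that computes the permutation instead of storing it.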

How FF1 and FF3-1 Work

The recognized constructions come from NIST Special Publication 800-38G, which specified two methods, FF1 and FF3; a later draft revision replaced FF3 with FF3-1. Both FF1 and FF3-1 are built on a Feistel network, the same general structure that underlies many classic ciphers.

A Feistel network splits the input into two halves and runs several rounds. In each round, one half is fed through a keyed round function, the result is combined with the other half, and the halves swap. After enough rounds, the output is thoroughly scrambled, and because each step is reversible, decryption is just the rounds run backward. FPE adapts this by doing the arithmetic in the right base. For decimal data the round function works modulo a power of ten rather than with the usual binary XOR, so the halves stay as valid digit strings throughout.
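The round structure above can be sketched in a few lines. This is a toy, not FF1 or FF3-1: it uses HMAC-SHA256 from the standard library as a stand-in for the AES-based round function, has no tweak, and has had no security analysis. The names `toy_fpe_encrypt`, `toy_fpe_decrypt`, and `_round_value` are hypothetical. What it does show is the decimal arithmetic: each half is updated by addition modulo a power of ten, so both halves remain valid digit strings in every round.

```python
import hashlib
import hmac

ROUNDS = 10  # an even round count; chosen for illustration only


def _round_value(key: bytes, round_no: int, half: str, modulus: int) -> int:
    """Keyed round function. HMAC-SHA256 is an assumption standing in for AES."""
    msg = bytes([round_no]) + half.encode("ascii")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def toy_fpe_encrypt(key: bytes, digits: str) -> str:
    if len(digits) < 2 or not digits.isdigit():
        raise ValueError("need a string of at least two decimal digits")
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in range(ROUNDS):
        m = u if i % 2 == 0 else v  # width of the half being replaced
        c = (int(a) + _round_value(key, i, b, 10 ** m)) % 10 ** m
        a, b = b, str(c).zfill(m)  # swap halves, keep leading zeros
    return a + b


def toy_fpe_decrypt(key: bytes, digits: str) -> str:
    if len(digits) < 2 or not digits.isdigit():
        raise ValueError("need a string of at least two decimal digits")
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in reversed(range(ROUNDS)):
        m = u if i % 2 == 0 else v
        # Undo one round: subtract instead of add, then swap back.
        c = (int(b) - _round_value(key, i, a, 10 ** m)) % 10 ** m
        a, b = str(c).zfill(m), a
    return a + b


key = b"demo key, not for production"
card = "4111111111111111"
encrypted = toy_fpe_encrypt(key, card)
print(encrypted, toy_fpe_decrypt(key, encrypted))
```

Decryption never inverts the round function itself. It recomputes the same round values and subtracts them, which is why any keyed function can serve as the engine inside a Feistel network.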

The round function itself uses AES underneath. FF1 runs ten rounds and supports a flexible tweak; FF3-1 runs eight rounds with a fixed-size tweak. Both lean on AES for their security, but wrap it in the Feistel structure so the output lands back in the original alphabet and length.

The Tweak: FPE's Most Important Input

Alongside the key, FPE takes a second input called a tweak. The tweak is not secret, but it changes the permutation. Two identical plaintexts encrypted under the same key but different tweaks produce different ciphertexts.

This matters because FPE has no room for a random nonce the way AES-GCM does. The output has to be the same length as the input, so there is nowhere to stash extra randomness. Without a tweak, encrypting the same card number always yields the same ciphertext, which leaks equality: an attacker can see that two records hold the same value even without decrypting them. Feeding a per-record value such as an account ID as the tweak breaks that pattern. Choosing the tweak well is the single most consequential design decision when deploying FPE.
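The effect of the tweak can be demonstrated with a small self-contained toy. Again, this is not FF1 or FF3-1: the round function is HMAC-SHA256 rather than AES, the function names are hypothetical, and the `account:1001` style of tweak is just one assumed way to derive a per-record value. The tweak is mixed into every round, so changing it changes the whole permutation.

```python
import hashlib
import hmac

ROUNDS = 10  # even round count, illustration only


def _round_value(key: bytes, tweak: bytes, round_no: int, half: str, modulus: int) -> int:
    """Toy round function keyed by `key` and bound to the public `tweak`."""
    msg = len(tweak).to_bytes(2, "big") + tweak + bytes([round_no]) + half.encode("ascii")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def toy_tweaked_encrypt(key: bytes, tweak: bytes, digits: str) -> str:
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in range(ROUNDS):
        m = u if i % 2 == 0 else v
        c = (int(a) + _round_value(key, tweak, i, b, 10 ** m)) % 10 ** m
        a, b = b, str(c).zfill(m)
    return a + b


def toy_tweaked_decrypt(key: bytes, tweak: bytes, digits: str) -> str:
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in reversed(range(ROUNDS)):
        m = u if i % 2 == 0 else v
        c = (int(b) - _round_value(key, tweak, i, a, 10 ** m)) % 10 ** m
        a, b = str(c).zfill(m), a
    return a + b


key = b"demo key, not for production"
card = "4111111111111111"

# Same key, same card number, different per-record tweaks.
ct_a = toy_tweaked_encrypt(key, b"account:1001", card)
ct_b = toy_tweaked_encrypt(key, b"account:1002", card)
print(ct_a, ct_b)
```

Decryption needs the same tweak that was used for encryption, so the tweak has to come from something stored alongside the record and stable over its lifetime. Encrypting the same value twice under the same tweak still yields the same ciphertext.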

Property	AES-GCM	FPE (FF1 / FF3-1)
Output size	Plaintext plus nonce and tag	Identical to plaintext
Output alphabet	Arbitrary binary	Same as input (digits, letters)
Built-in integrity check	Yes, authentication tag	No, none
Randomization per message	Random nonce	Only via the tweak

What FPE Does Not Give You

The table points at the catch, and it is worth saying plainly. FPE is confidentiality only. It carries no authentication tag, so it cannot tell you whether a ciphertext was tampered with. A flipped digit decrypts to some other valid-looking value with no error raised. If you need to detect tampering, FPE alone is the wrong tool, and you have to add integrity protection separately.
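One way to add that separate integrity protection is a keyed MAC computed over the ciphertext and its tweak, stored outside the format-constrained field. The sketch below uses HMAC-SHA256 from the Python standard library; the helper names `make_tag` and `verify_tag`, the separate MAC key, and the sample ciphertext value are all assumptions for illustration.

```python
import hashlib
import hmac


def make_tag(mac_key: bytes, tweak: bytes, ciphertext: str) -> bytes:
    """Compute an HMAC-SHA256 tag binding the FPE ciphertext to its tweak."""
    msg = len(tweak).to_bytes(2, "big") + tweak + ciphertext.encode("ascii")
    return hmac.new(mac_key, msg, hashlib.sha256).digest()


def verify_tag(mac_key: bytes, tweak: bytes, ciphertext: str, tag: bytes) -> bool:
    """Constant-time comparison of the stored tag against a fresh one."""
    return hmac.compare_digest(make_tag(mac_key, tweak, ciphertext), tag)


mac_key = b"separate MAC key, not the FPE key"
tweak = b"account:1001"
stored_ciphertext = "8302941175506623"  # hypothetical FPE output
stored_tag = make_tag(mac_key, tweak, stored_ciphertext)

tampered = "8302941175506624"  # last digit changed
print(verify_tag(mac_key, tweak, stored_ciphertext, stored_tag))
print(verify_tag(mac_key, tweak, tampered, stored_tag))
```

The tag is 32 bytes and has to live somewhere, typically an extra column or a side table. That is often the very change a legacy schema cannot absorb, so whether this is practical depends on how much of the surrounding system you are able to touch.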

There is also an inherent limit tied to the small domain. When the set of possible values is small (a five-digit ZIP code has only 100,000 possibilities), the security margin shrinks. The "-1" in FF3-1 marks a revision made after researchers found attacks against the original FF3 over very small domains. NIST sets minimum domain sizes for this reason. FPE on a field with only a few thousand possible values is fragile, and sometimes tokenization (swapping the value for an unrelated random token tracked in a secure lookup table) is the safer choice.
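The domain size is simply the alphabet size raised to the field length, which makes it easy to check a field before committing to FPE. The sketch below treats a floor of one million values as an assumption, based on our reading of NIST's draft revision of SP 800-38G; confirm the current requirement against the standard itself. The function names are hypothetical.

```python
MIN_DOMAIN = 1_000_000  # assumed floor; verify against the current NIST text


def domain_size(radix: int, length: int) -> int:
    """Number of distinct strings of `length` symbols over a `radix`-symbol alphabet."""
    return radix ** length


def is_domain_large_enough(radix: int, length: int, minimum: int = MIN_DOMAIN) -> bool:
    return domain_size(radix, length) >= minimum


fields = {
    "ZIP code (5 digits)": (10, 5),
    "SSN (9 digits)": (10, 9),
    "Card number (16 digits)": (10, 16),
}
for name, (radix, length) in fields.items():
    verdict = "ok" if is_domain_large_enough(radix, length) else "too small for FPE"
    print(f"{name}: {domain_size(radix, length):,} values, {verdict}")
```

Passing this check is a minimum, not a guarantee. If most of a field's digits are fixed or predictable, the effective domain is smaller than the arithmetic suggests.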

FPE vs. Tokenization

Tokenization replaces sensitive data with a random stand-in and stores the mapping in a vault. There is no key to crack because there is no mathematical relationship between token and value. FPE keeps a reversible cryptographic relationship and needs no vault. The trade is vault-storage-and-lookup against key-management-and-domain-size. For very small domains, tokenization usually wins.
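For contrast, here is a minimal in-memory sketch of the vault approach. The class name `ToyTokenVault` and its methods are hypothetical, and a real vault would be a hardened, access-controlled, persistent service, not a pair of dictionaries. The point is that the token comes from a random generator, not from the value, so the mapping exists only in the vault.

```python
import secrets

DIGITS = "0123456789"


class ToyTokenVault:
    """In-memory stand-in for a token vault. Illustration only."""

    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so one value maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        # Draw random same-length digit strings until one is unused.
        # Note: this loop would not terminate if the domain were exhausted.
        while True:
            token = "".join(secrets.choice(DIGITS) for _ in range(len(value)))
            if token not in self._token_to_value:
                break
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        return self._token_to_value[token]  # KeyError for unknown tokens


vault = ToyTokenVault()
token = vault.tokenize("4111111111111111")
print(token, vault.detokenize(token))
```

Every lookup goes through the vault, which is the cost side of the trade: the vault has to be stored, replicated, and protected, whereas FPE needs only the key and the tweak.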

Where It Is Actually Used

FPE earns its place in regulated, legacy-heavy environments: payment processing under PCI DSS, where card numbers must be protected but downstream systems expect the card-number shape; data warehouses that need to encrypt fields without reshaping every table; and test or analytics environments that need realistic-looking but de-identified data. In each case the value is not stronger cryptography; it is encryption that fits into a system you cannot afford to rebuild.

That is the honest framing. FPE is not a general-purpose encryption upgrade and you should not reach for it when you control the format. For new systems, store data with authenticated encryption and let ciphertext be ciphertext. FPE is a specialist that solves the problem of bringing encryption to a schema that was never designed to hold it.

The Wider Lesson

FPE is a reminder that cryptography rarely fails on the math. It fails at the seams where strong primitives meet real systems and their constraints: legacy formats, fixed column widths, validators, and code nobody wants to touch. The interesting engineering is almost always in making the right primitive fit the actual environment without weakening it.

That is the same discipline we apply at Haven. The goal is not to show off an exotic algorithm; it is to use well-understood, standardized cryptography correctly and place the trust boundaries where they belong. If you want more on that theme, our piece on cryptographic agility covers how systems plan for primitives to change, and our overview of authenticated encryption explains the default you should reach for first.