Typosquatting: How One Mistyped Package Owns Your Project

Package registries like npm and PyPI run on trust. You type a name, the tool fetches it, and your build runs whatever code is on the other end. When someone uploads a package whose name is one keystroke away from a popular library, that small typo becomes a fully privileged execution channel into every project that installs it.

In 2017, a security researcher named Hanno Böck uploaded a series of packages to PyPI with names like urllib, bzip, and setuptool — each one a near-miss of a popular library. The packages did nothing malicious; they just phoned home and counted. Over a few weeks, those typo packages were downloaded tens of thousands of times. Many of those downloads came from inside what looked like real CI systems and developer machines.

The lesson was uncomfortable: an entire class of attack was sitting open in one of the most-used software supply chains on earth, and the cost to attackers was approximately zero.

What Typosquatting Actually Is

A typosquat is a package uploaded to a registry with a name designed to be confused with a legitimate one. The variations are predictable:

Single-character omission: requets for requests, colorma for colorama.
Singular vs plural: request vs requests.
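Hyphen vs underscore: my-package vs my_package on registries that treat the two as distinct names. (PyPI normalizes hyphens, underscores, and dots, so python-dateutil and python_dateutil resolve to the same project there.)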
Scoped vs unscoped: @lodash/lodash vs lodash on npm.
Dropped separator: crossenv for cross-env.
Brand reuse: legitimate-looking names that resemble a corporate or framework brand (azure-storage-uploader when no such official package exists).
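
Because the variations are this predictable, a simple check can catch many of them before install. The sketch below is one possible approach, not a registry feature: it assumes a hypothetical allowlist (KNOWN_PACKAGES) of names the project actually intends to use, strips separators, and flags any requested name within one edit (including an adjacent transposition) of an allowlisted one.

```python
from typing import Optional, Set

# Hypothetical allowlist of packages this project intends to depend on.
KNOWN_PACKAGES: Set[str] = {
    "requests", "colorama", "python-dateutil", "cross-env", "lodash",
}


def normalize(name: str) -> str:
    """Lowercase and drop separators so cross-env and crossenv compare equal."""
    return "".join(ch for ch in name.lower() if ch not in "-_.")


def edit_distance(a: str, b: str) -> int:
    """Optimal-string-alignment distance: insert, delete, substitute, or
    swap two adjacent characters, each at cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            # Adjacent transposition, e.g. "lodahs" for "lodash".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def likely_typosquat_of(
    candidate: str, known: Set[str] = KNOWN_PACKAGES, max_distance: int = 1
) -> Optional[str]:
    """Return the allowlisted name that `candidate` resembles, or None."""
    if candidate.lower() in known:
        return None  # exact match: this is the real package name
    norm = normalize(candidate)
    for name in sorted(known):
        if edit_distance(norm, normalize(name)) <= max_distance:
            return name
    return None
```

With this allowlist, likely_typosquat_of("requets") returns "requests" and likely_typosquat_of("crossenv") returns "cross-env", while an unrelated name such as "numpy" returns None. A distance threshold of 1 is a deliberate trade-off: it misses brand-reuse names entirely and will occasionally flag legitimate packages with similar names, so it works better as a review prompt than as a hard block.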

The attack succeeds whenever a developer, a CI system, or a copy-pasted README contains the wrong name. The wrong name resolves to the attacker's package. The attacker's setup.py, package.json install hook, or post-install script executes — on a developer laptop, in a CI runner, or in production if the package is shipped into a container image.

What the Attacker Gets

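Package managers can execute package-supplied code at install time: npm runs lifecycle scripts such as postinstall by default, and pip runs setup.py when it builds a source distribution. That gives a malicious typosquat: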

Read access to whatever environment variables exist on the machine — including cloud credentials, AWS keys, GitHub tokens, Slack hooks.
Read and write access to the developer's home directory, including SSH keys, browser profiles, and password manager databases.
Network egress, often unrestricted — exfiltration to any URL.
The chance to inject malicious code into the project itself, so it persists after install.
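
To get a feel for the first item, it helps to look at your own environment the way an install script would. The following is a minimal defensive sketch, assuming a crude name-based heuristic (SECRET_PATTERN is an illustrative guess, not a standard): it lists the names of environment variables that look like credentials, without printing their values.

```python
import os
import re
from typing import List, Mapping

# Illustrative heuristic only: variable names that often hold credentials.
SECRET_PATTERN = re.compile(r"(TOKEN|SECRET|PASSWORD|KEY|CREDENTIAL)", re.IGNORECASE)


def exposed_secret_names(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names (never the values) of variables that look sensitive.

    Any code that runs at install time can read the same mapping.
    """
    return sorted(name for name in environ if SECRET_PATTERN.search(name))


if __name__ == "__main__":
    for name in exposed_secret_names():
        print(name)
```

Running it inside a CI job shows which secrets are in scope during dependency installation. The heuristic produces false positives (a variable named KEYBOARD_LAYOUT would match) and misses oddly named secrets, so treat the output as a starting point for scoping credentials to the steps that need them.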

The 2018 event-stream incident, though technically a maintainer-takeover attack rather than a typosquat, illustrated what's possible. A maliciously published version of a popular npm package pulled in code targeting a specific Bitcoin wallet application, attempting to exfiltrate private keys from wallets built with the compromised dependency. The package was a transitive dependency, so most affected projects had no direct relationship with it.

Dependency Confusion: The Even Worse Variant

In 2021, researcher Alex Birsan published a write-up demonstrating dependency confusion, an attack closely related to typosquatting that doesn't require a typo at all.

Many companies use internal package names (e.g. my-corp-auth-lib) hosted on private registries. Their package managers, when faced with a package name, often search both private and public registries — and prefer the higher version number, wherever it comes from.

Birsan uploaded packages with internal-sounding names — names he'd seen in leaked manifests on GitHub or in public job postings — to the public PyPI and npm registries, with very high version numbers. Builds at Microsoft, Apple, PayPal, Tesla, Yelp, Uber, and several dozen other companies pulled his packages instead of their internal versions. He earned over $130,000 in bounties for what was essentially a name-collision attack.

The structural problem

Package managers were designed to resolve names to the newest version available. They were not designed to answer the question, "is this package from the source we expect?" The default trust model assumes name uniqueness across a single global namespace — an assumption that doesn't survive contact with private registries and corporate naming.
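
The failure mode is easy to reproduce in a toy model. The sketch below is not how any real package manager is implemented; the two dictionaries stand in for a private and a public index, and the package names and versions are hypothetical. It contrasts a resolver that picks the highest version from any source with one that routes known internal names to the private index only.

```python
from typing import Dict, List, Optional, Set, Tuple

Version = Tuple[int, ...]

# Hypothetical index contents: package name -> published versions.
PRIVATE_INDEX: Dict[str, List[str]] = {"my-corp-auth-lib": ["1.2.0", "1.3.0"]}
PUBLIC_INDEX: Dict[str, List[str]] = {
    "my-corp-auth-lib": ["99.0.0"],  # attacker's upload with an inflated version
    "requests": ["2.31.0"],
}


def parse_version(text: str) -> Version:
    """Parse a plain dotted version such as "1.3.0" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def resolve_naive(name: str) -> Optional[Tuple[str, str]]:
    """Search every index and let the highest version win, wherever it lives."""
    candidates = []
    for source, index in (("private", PRIVATE_INDEX), ("public", PUBLIC_INDEX)):
        for version in index.get(name, []):
            candidates.append((parse_version(version), source, version))
    if not candidates:
        return None
    _, source, version = max(candidates)
    return source, version


def resolve_pinned(name: str, internal_names: Set[str]) -> Optional[Tuple[str, str]]:
    """Answer "which source do we expect?" first, then pick a version."""
    if name in internal_names:
        source, index = "private", PRIVATE_INDEX
    else:
        source, index = "public", PUBLIC_INDEX
    versions = index.get(name, [])
    if not versions:
        return None
    return source, max(versions, key=parse_version)
```

Here resolve_naive("my-corp-auth-lib") returns ("public", "99.0.0"), the attacker's package, while resolve_pinned("my-corp-auth-lib", {"my-corp-auth-lib"}) returns ("private", "1.3.0"). The point of the second function is ordering: the source decision happens before any version comparison, so an inflated public version number never enters the race.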

Why It Persists

Registries have invested in detection. npm and PyPI both run automated scanners that flag suspicious packages based on names, install-time behavior, and reputation. Many obvious typosquats are taken down within days. But a few realities make complete prevention hard:

The defender's gap

For every legitimate package, there are dozens of plausibly confusable names. Defending all of them preemptively is impractical, and registries are reluctant to lock down namespace policy in ways that would frustrate legitimate developers.

The economic asymmetry

Uploading a malicious package is cheap. Detection, takedown, and remediation are expensive. The attacker only needs one or two installs to succeed; the defender needs every install to be safe.

The supply chain is deep

The average modern application transitively depends on hundreds, sometimes thousands, of packages. Even a security-conscious developer cannot review them all. A typosquat in a transitive dependency is invisible to the project's direct contributors.

What Actually Defends Against It

Defense	Approach
Lockfiles	`package-lock.json`, `poetry.lock`, `Cargo.lock`: pin every transitive dependency to exact versions and hashes. Once committed, the same names always resolve to the same content.
Hash verification	pip's `--require-hashes`, npm's lockfile `integrity` fields: reject any package whose content doesn't match the expected hash, even if the version matches. `npm audit signatures` additionally checks registry signatures on installed packages.
Private registry mirroring	Proxy a known set of dependencies through a private registry that an attacker can't publish to. Builds only resolve against the mirror.
Scope reservations	For npm, register a scope for your organization (`@your-org/`) so all internal packages live in a namespace only your team can publish to.
Dependency review	GitHub's Dependabot, Snyk, OSV-Scanner, Socket.dev — automated review of every dependency change that flags new packages, suspicious patterns, and known malicious uploads.
Disable install scripts	`npm install --ignore-scripts` stops package post-install hooks from running. Breaks some legitimate packages; closes the most direct attack channel.
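
The hash-verification row comes down to one comparison: hash the bytes you downloaded and refuse to proceed if they differ from the hash you recorded earlier. A minimal sketch of that check, using only the standard library (the function names are illustrative, and where the expected hash comes from, such as a lockfile or a `--hash=sha256:...` line in a requirements file, is left to the caller):

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 so large archives need little memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file's content matches the pinned hash."""
    return hmac.compare_digest(sha256_of(path), expected_sha256.strip().lower())
```

A build script would call verify_artifact on each downloaded archive and abort on the first False. This protects against content changing under a name and version you already pinned; it does nothing if the wrong name was pinned in the first place, which is why it pairs with reviewing lockfile changes.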

A Note on Reproducible Builds

Reproducible builds are a longer-term answer to the same family of problems. If the same source produces a byte-identical artifact every time, regardless of who builds it, then any divergence from the expected hash is a signal that something has been tampered with — including a typosquatted dependency that replaced legitimate code.

Reproducibility doesn't prevent the initial compromise, but it makes detection vastly cheaper. Combined with signing and provenance tooling like SLSA, it shifts the supply chain from "we trust the registry" to "we verify the artifact end to end."
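
As a rough illustration of the comparison step, the sketch below computes a single digest over a build output directory and compares two independent builds. It is a simplification under stated assumptions: it considers only relative file paths and file contents, ignoring permissions, timestamps, and symlinks, which real reproducible-build tooling has to account for. The function names are illustrative.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Hash every file's relative path and content, in sorted path order."""
    files = sorted(
        (p.relative_to(root).as_posix(), p) for p in root.rglob("*") if p.is_file()
    )
    digest = hashlib.sha256()
    for rel, path in files:
        rel_bytes = rel.encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each field so "ab" + "c" never hashes like "a" + "bc".
        digest.update(len(rel_bytes).to_bytes(8, "big"))
        digest.update(rel_bytes)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def builds_match(build_a: Path, build_b: Path) -> bool:
    """True if two build output trees contain identical files."""
    return tree_digest(build_a) == tree_digest(build_b)
```

If a CI build and an independent rebuild of the same commit disagree, the differing file is where to start looking, whether the cause is a swapped dependency or ordinary nondeterminism such as embedded timestamps.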

The Practical Habits

For individual developers and small teams:

Commit lockfiles, and treat lockfile diffs in code review as security-sensitive.
Pin versions exactly in production manifests; don't use unconstrained ^ ranges for security-sensitive code.
When you copy an install command from a tutorial, sanity-check the package name against its homepage or the registry's official page.
Use `npm install --ignore-scripts` when bringing in a brand-new package for the first time, and inspect what's inside.
If your IDE autocompletes a package name that doesn't quite look right, slow down. Many typosquats land precisely because an autocomplete picked the wrong entry.
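
The first habit is easier to keep if the review is partly automated. The sketch below assumes an npm `package-lock.json` in lockfileVersion 2 or 3 format, where a top-level `packages` object maps install paths such as `node_modules/lodash` to entries with `version` and `resolved` fields. It reports packages that a change introduces and packages fetched from hosts outside an allowlist you supply; the function names are illustrative.

```python
import json
from typing import Dict, Set
from urllib.parse import urlparse


def locked_packages(lockfile_text: str) -> Dict[str, dict]:
    """Map package name -> lockfile entry, skipping the root project ("" key)."""
    data = json.loads(lockfile_text)
    return {
        # Nested installs look like node_modules/a/node_modules/b; keep "b".
        path.split("node_modules/")[-1]: meta
        for path, meta in data.get("packages", {}).items()
        if path
    }


def newly_introduced(old_text: str, new_text: str) -> Set[str]:
    """Names present in the new lockfile but absent from the old one."""
    return set(locked_packages(new_text)) - set(locked_packages(old_text))


def unexpected_sources(lockfile_text: str, allowed_hosts: Set[str]) -> Dict[str, str]:
    """Map package name -> host for entries resolved outside the allowlist."""
    flagged = {}
    for name, meta in locked_packages(lockfile_text).items():
        resolved = meta.get("resolved")
        if not resolved:
            continue  # e.g. workspace links carry no download URL
        host = urlparse(resolved).hostname or ""
        if host not in allowed_hosts:
            flagged[name] = host
    return flagged
```

Run against the lockfile before and after a pull request, newly_introduced gives reviewers a short list of names to read character by character, and unexpected_sources catches entries that quietly point at a different registry. Neither replaces a dedicated scanner; they just make the lockfile diff small enough to actually read.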

The Quiet Lesson

Package registries are among the most trusted infrastructure in modern software development, and they were not designed with active adversarial publishing in mind. Each registry has accumulated mitigations, but the fundamental model (globally unique names, code that runs at install time, fetching the newest version) was set decades ago and is difficult to change without breaking everything.

Typosquatting persists because it exploits the model itself, not any specific implementation bug. Until the model evolves — toward verified provenance, signed releases, and namespaces that map cleanly to organizational identity — the burden remains on every developer and every CI system to be a little more careful than the tools demand.