The Registry Is on Fire

Three separate package-registry incidents hit AI infrastructure in one week. Anthropic shipped its Claude Code source code to the npm registry by accident. Axios, the JavaScript HTTP library that almost every Node app depends on, was hijacked by a North Korean threat group. Mercor confirmed it was breached through the LiteLLM compromise we flagged last week. The package manager is now the single most leveraged attack surface in the AI stack.

Anthropic accidentally published its own crown jewels. An npm packaging error exposed the full Claude Code source. The Hacker News, BleepingComputer, and SiliconAngle all confirmed it on March 31. Anthropic then spent the next 48 hours issuing takedowns against thousands of mirror repositories, a sweep the company later said was an accident. By April 2, the leaked code was being weaponized as bait in infostealer campaigns. The packaging mistake matters less than how quickly the leak got reverse-engineered into a working open framework. Multi-agent orchestration harnesses replicating Claude Code's design are the single most active area of development this week, with one of the dominant repositories already past 140K stars. The harness pattern was being studied before the leak. The leak just turned a research project into a copy.

The Axios npm compromise is the more dangerous story. In late March, the maintainer of Axios was social-engineered through a fake Microsoft Teams error message, and the attacker pushed a cross-platform RAT into one of the most-installed HTTP libraries on the registry. Google's threat intelligence team attributed it to UNC1069, a North Korean group. Supply chain attacks on npm had been converging for over a week before Google's public attribution. The Hacker News timeline traces the social engineering vector in detail. If you are running any agentic system that pulls Axios transitively, and almost every one does, the post-incident audit is not optional.
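
For the audit itself, the first question is which copies of Axios a project actually resolved, including the nested ones pulled in by other packages. What follows is a minimal sketch, not a complete audit tool: it reads an already-parsed package-lock.json and lists every resolved copy of a package. The function names and the SUSPECT_VERSIONS set are hypothetical, and the set is deliberately left empty because this briefing does not name the affected versions. Fill it from the official advisory.

```python
import json
from pathlib import Path

# Hypothetical placeholder: populate from the official advisory.
# This briefing does not list the affected version numbers.
SUSPECT_VERSIONS: set[str] = set()


def find_installed(lock: dict, name: str) -> list[tuple[str, str]]:
    """Return (location, version) for every copy of `name` in a parsed lockfile."""
    hits: list[tuple[str, str]] = []

    # lockfileVersion 2 and 3: a flat "packages" map keyed by install path,
    # e.g. "node_modules/foo/node_modules/axios" for a nested copy.
    for path, meta in lock.get("packages", {}).items():
        if path == f"node_modules/{name}" or path.endswith(f"/node_modules/{name}"):
            hits.append((path, meta.get("version", "unknown")))
    if hits:
        return hits

    # lockfileVersion 1: a nested "dependencies" tree.
    def walk(deps: dict, trail: str) -> None:
        for dep_name, meta in deps.items():
            here = f"{trail}/{dep_name}" if trail else dep_name
            if dep_name == name:
                hits.append((here, meta.get("version", "unknown")))
            walk(meta.get("dependencies", {}), here)

    walk(lock.get("dependencies", {}), "")
    return hits


def flag_suspect(hits: list[tuple[str, str]], suspect: set[str]) -> list[tuple[str, str]]:
    """Keep only the copies whose resolved version is in the suspect set."""
    return [(loc, ver) for loc, ver in hits if ver in suspect]


def audit(lockfile: Path, name: str = "axios") -> list[tuple[str, str]]:
    """Load a lockfile from disk and return the flagged copies of `name`."""
    lock = json.loads(lockfile.read_text(encoding="utf-8"))
    return flag_suspect(find_installed(lock, name), SUSPECT_VERSIONS)
```

Calling audit(Path("package-lock.json")) in each service directory gives a first pass. A lockfile only records what was resolved at install time, so a clean result here does not clear machines that installed a bad version and later upgraded.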

Mercor confirmed it was hit by the LiteLLM compromise. Last week we discussed the LiteLLM PyPI backdoor that scraped credentials on every Python process start. TechCrunch reported on March 31 that Mercor, the AI talent marketplace, was breached through that exact vector. Mercor is the first named downstream victim. There will be more. The class of company at risk is the one where engineering velocity ran ahead of dependency review, which is most of them. The LiteLLM compromise is exactly the kind of incident that surfaces a quarter late, when the credentials get reused.
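
One standard way for code to run on every Python process start is a .pth file in site-packages: the interpreter's site module executes any line in such a file that begins with "import". Whether the LiteLLM backdoor used this hook is an assumption, since the reporting summarized here does not name the mechanism. With that caveat, here is a rough sketch of how to list those startup lines for manual review. The function name is hypothetical.

```python
from pathlib import Path


def executable_pth_lines(site_dir: Path) -> list[tuple[str, str]]:
    """List (file name, line) for .pth lines that Python runs at startup.

    The site module executes lines in .pth files that start with
    "import" followed by a space or tab. Other lines are treated as paths.
    """
    findings: list[tuple[str, str]] = []
    for pth in sorted(site_dir.glob("*.pth")):
        text = pth.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            if line.startswith(("import ", "import\t")):
                findings.append((pth.name, line.strip()))
    return findings
```

To cover an environment, run it over each directory returned by site.getsitepackages(). A hit is not proof of compromise, because legitimate tooling such as setuptools ships .pth files with import lines, so the output is a review list and not a verdict.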

Gemma 4 was abliterated in ninety minutes

Google released Gemma 4 on April 2. Four variants under Apache 2.0, including a 26B mixture-of-experts model and a 31B dense. The official DeepMind announcement framed the release around on-device multimodal capability; HuggingFace's launch post walked through the variants; NVIDIA shipped NVFP4 quants on day one. Within ninety minutes of the official drop, aggressive uncensored variants of the dense models were live on HuggingFace. That timing is not anomalous anymore. Abliteration tooling has matured to the point where the gap between release and weight surgery is measured in minutes. The MoE angle is new this week. Standard abliteration techniques developed for dense transformers do not cleanly transfer to mixture-of-experts architectures, where refusal behavior appears to route through expert selection rather than uniform residual stream directions. So Gemma 4's 26B MoE variant is harder to uncensor than its 31B dense sibling, and the community is now publishing routing-aware abliteration methods specifically targeting it. Anyone shipping a model with a safety policy now has about ninety minutes before that policy is removed from dense weights. MoE buys you slightly more, until it doesn't.
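
The dense-model technique is usually described as finding one "refusal direction" in the residual stream and projecting it out. A toy sketch of that arithmetic follows, assuming the direction is estimated as a difference of mean activations between refused and complied prompts. That estimator is an assumption here, not something the tools named above are confirmed to use, and the function names are hypothetical. Plain lists stand in for model tensors to keep the example dependency-free.

```python
import math

Vector = list[float]


def mean(vectors: list[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def unit(v: Vector) -> Vector:
    """Scale a vector to length 1."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def refusal_direction(refused: list[Vector], complied: list[Vector]) -> Vector:
    """Unit vector from mean 'complied' activation to mean 'refused' activation."""
    mean_refused, mean_complied = mean(refused), mean(complied)
    return unit([a - b for a, b in zip(mean_refused, mean_complied)])


def ablate(x: Vector, r: Vector) -> Vector:
    """Remove the component of x along unit direction r: x - (x . r) r."""
    dot = sum(a * b for a, b in zip(x, r))
    return [a - dot * b for a, b in zip(x, r)]
```

After ablation the activation has zero component along r, which is the whole trick for a dense model. It also hints at why the MoE case is harder: if refusal depends on which experts get selected, there may be no single r to subtract.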

OpenClaw is no longer free. Anthropic confirmed that Claude Code subscribers will pay extra to use OpenClaw, the framework Anthropic released as open weights two weeks ago. Anthropic is running the familiar play of releasing an open standard, building the surrounding tooling, and then metering access to the tooling. Mistral, Meta, and others have already done this, just on a longer timeline. For the agentic security category specifically, ClawKeeper and similar runtime watchdog frameworks have emerged as the third-party answer, and Cisco shipped DefenseClaw as an open-source alternative. The agent-governance market we flagged last week now has three commercial vendors competing for the runtime layer.

Voxtral has a real competitor already. Two weeks after Mistral released Voxtral-4B-TTS with the sub-100ms time-to-first-audio claim, k2-fsa's OmniVoice shipped with support for 600+ languages via zero-shot diffusion TTS. T5Gemma-TTS appeared from a third group using an encoder-decoder approach to bypass the prefix-competition problem in codec language models. Three architecturally distinct open-weight TTS systems in two weeks, all targeting the latency-and-multilingual frontier that ElevenLabs has been monetizing. Commercial TTS pricing in 2027 looks softer than it did in 2025.

On our radar

Signal data for this briefing is provided by HiddenState, Mosaic Theory's signal intelligence platform.

— Cosmo