Xiaomi Hunter Alpha DeepSeek V4 mystery solved. For a week, the AI world speculated whether Hunter Alpha was DeepSeek’s next blockbuster. Then Xiaomi dropped a bombshell. Here is the full story of the stealth model, the former DeepSeek researcher behind it, and what it means for the AI agent wars.
The Hunter Alpha Mystery Is Solved: It’s Not DeepSeek V4—It’s Xiaomi’s Bold AI Play
For seven days, the AI developer community played detective.
A mysterious model named Hunter Alpha appeared on OpenRouter, the AI gateway platform, on March 11, 2026. It was listed as a “stealth model”: no developer attribution, no press release, no explainer. Just a set of jaw-dropping specifications and a chatbot that politely declined to identify its creators.
The specs were enough to make any AI engineer stop scrolling:
- 1 trillion total parameters (activating approximately 42 billion per token)
- A 1-million-token context window
- Reasoning capabilities that felt distinctly familiar
The chatbot told Reuters it was “a Chinese AI model primarily trained in Chinese” with a knowledge cutoff of May 2025, the same cutoff as DeepSeek’s own models. When pressed about its origin, it responded: “I only know my name, my parameter scale and my context window length.”
The internet did what the internet does best: it jumped to conclusions. DeepSeek V4, the most anticipated open-source AI release of 2026, was rumored to launch as early as April with exactly these specifications. The timing was perfect. The specs were identical. The reasoning patterns, according to some engineers, bore the fingerprints of DeepSeek’s training methodology.
Developer forums lit up. “The chain-of-thought pattern is probably the strongest signal,” Daniel Dewhurst, an AI engineer who analyzed the model, told Reuters. “Reasoning style is hard to disguise and tends to reflect how a model was trained.”
But not everyone was convinced. Umur Ozkul, who runs independent AI benchmark tests, cautioned the community: “My analysis suggests Hunter Alpha is likely not DeepSeek V4,” citing differences in token-related behavior and architectural patterns.
Then, on March 19, the mystery ended—not with a whimper, but with a bang.
Xiaomi Steps Into the Spotlight
In the early hours of March 19, Xiaomi officially claimed Hunter Alpha as its own.
The company announced its new MiMo-V2 series, revealing that Hunter Alpha had been an “early internal test build of MiMo-V2-Pro,” a model designed not merely as a chatbot, but as the “brain” for AI agents capable of executing complex, multi-step tasks with minimal human supervision.
Luo Fuli, the head of Xiaomi’s MiMo AI model team and a former DeepSeek researcher, took to social media to explain what had just happened. Her post was part technical explainer, part manifesto, and entirely fascinating.
“I call it a silent ambush,” she wrote. “Not because we planned it that way, but because the shift from conversation mode to agent mode happened so fast that even we couldn’t believe it.”
The Woman Behind the Model
If Hunter Alpha has a face, it belongs to Luo Fuli.
Her presence at the helm of Xiaomi’s MiMo team explained why the AI community had immediately suspected DeepSeek’s involvement. Before joining Xiaomi, Luo played a significant role in building DeepSeek’s R1 model. The “reasoning style” that developers detected in Hunter Alpha wasn’t a coincidence; it was a signature.
“People ask why we moved so fast,” she reflected in her post. Her answer was refreshingly honest: long cycles for infrastructure research, post-training agility driven by product intuition, and “a genuine love for the world you’re creating.”
She also shared an unconventional management tactic. After experiencing a complex agent framework for the first time, she was so convinced of its potential that she issued a directive to her team: anyone who didn’t have at least 100 conversations with the system by the next day could resign. It worked. “Once the team’s imagination was sparked,” she wrote, “it translated directly into research speed.”
What MiMo-V2-Pro Actually Does
Let’s move past the mystery and look at the machine.
MiMo-V2-Pro is not designed to be another friendly chatbot. Xiaomi is building for the agentic era: systems that don’t just answer questions but execute tasks.
The 1-trillion-parameter foundation model uses a hybrid attention mechanism that prioritizes efficiency. Its 1-million-token context window allows it to ingest entire codebases, massive document collections, or extended multi-step task histories in a single session.
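To get a feel for what a 1-million-token window means in practice, here is a rough sketch that estimates whether a set of documents would fit in a single session. It is only a back-of-the-envelope check: the 4-characters-per-token ratio is a common rule of thumb for English text, not a property of MiMo-V2-Pro’s tokenizer, and the function names and the output reserve are hypothetical choices made for illustration.

```python
import math

# Context window size in tokens, as stated in the article.
CONTEXT_WINDOW = 1_000_000


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate a token count from character length.

    ASSUMPTION: ~4 characters per token is a heuristic for English text.
    The real ratio depends on the model's tokenizer and the language.
    """
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(documents, reserve_for_output=50_000, chars_per_token=4.0):
    """Return (fits, estimated_input_tokens) for a list of document strings.

    `reserve_for_output` is a hypothetical budget kept free for the
    model's reply; it is not a documented limit.
    """
    total = sum(estimate_tokens(doc, chars_per_token) for doc in documents)
    return total + reserve_for_output <= CONTEXT_WINDOW, total


if __name__ == "__main__":
    small_repo = ["x" * 400_000]    # about 100K estimated tokens
    huge_repo = ["x" * 4_000_000]   # about 1M estimated tokens
    print(fits_in_context(small_repo))
    print(fits_in_context(huge_repo))
```

For a real workload, the estimate should be replaced with counts from the provider’s own tokenizer, since code and Chinese text can tokenize very differently from English prose.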
In benchmark tests, the model performs competitively:
- ClawEval (Agent capability): 61.5, approaching Claude Opus 4.6
- PinchBench: Ranked top three globally
- Coding performance: Exceeds Claude Sonnet 4.6
On the Artificial Analysis leaderboard, which ranks models on comprehensive intelligence, MiMo-V2-Pro sits at eighth globally and second in China.
But benchmarks only tell part of the story. The model’s real-world demonstrations are where things get interesting.
In one test, a developer asked MiMo-V2-Pro to generate a complete 3D tower defense game: multiple turret types, varied enemy mechanics, level design, explosions, flame effects, and Three.js rendering with pause, restart, and scoring functions. The model delivered a fully structured code solution covering game logic and frontend implementation.
In another, more whimsical test, the model was asked to “simulate the visual style of 1990s print magazines, including irregular multi-column layouts, bleed titles, paper texture backgrounds, and interactive design with page-flip effects.” MiMo-V2-Pro understood the complex aesthetic description and generated a complete frontend implementation with font choices, layout structure, and dynamic effects.
This is the distinction Xiaomi wants you to understand: MiMo-V2-Pro doesn’t just generate content; it generates systems.
The Agent Ecosystem
Xiaomi didn’t stop with a single model.
The company simultaneously launched MiMo-V2-Omni and MiMo-V2-TTS, completing a trilogy designed to power agentic applications from end to end.
MiMo-V2-Omni is a full-modal foundation model that aligns audio, image, and video understanding. For agent applications, this means the system can ingest real-time surveillance footage, interpret audio commands, analyze visual interfaces, and convert environmental information into logical decisions, all within the same reasoning framework.
MiMo-V2-TTS tackles the output side with fine-grained emotional expression. The model supports multiple Chinese dialects (Northeastern, Sichuanese, Cantonese) and can even generate singing voices naturally. For developers building assistants that need to sound human, this matters.
The pricing strategy is aggressive:
- MiMo-V2-Pro: $1 input / $3 output per million tokens (256K context); $2 input / $6 output (1M context)
- MiMo-V2-Omni: $0.4 input / $2 output per million tokens
For comparison, Claude Opus-class models typically charge significantly more. Xiaomi is clearly playing the long game: attract developers, build an ecosystem, iterate fast.
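To make the quoted rates concrete, the sketch below turns them into a per-request cost estimate. The prices come straight from the list above; everything else is an assumption for illustration, including the `estimate_cost` helper, the tier labels, and the idea that the caller picks the tier (the announcement as reported here does not say how a request is assigned to the 256K or 1M tier).

```python
# USD per million tokens as (input, output), taken from the prices quoted above.
# The tier keys ("256K", "1M", None) are hypothetical labels for this sketch.
PRICES = {
    ("MiMo-V2-Pro", "256K"): (1.00, 3.00),
    ("MiMo-V2-Pro", "1M"): (2.00, 6.00),
    ("MiMo-V2-Omni", None): (0.40, 2.00),
}


def estimate_cost(model, input_tokens, output_tokens, tier=None):
    """Estimate the USD cost of one request.

    ASSUMPTION: billing is linear in tokens and the caller knows which
    context tier applies. Actual billing rules may differ.
    """
    input_price, output_price = PRICES[(model, tier)]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # A 200K-token prompt with a 10K-token reply on the 256K tier.
    print(f"${estimate_cost('MiMo-V2-Pro', 200_000, 10_000, tier='256K'):.2f}")
    # An 800K-token prompt with a 20K-token reply on the 1M tier.
    print(f"${estimate_cost('MiMo-V2-Pro', 800_000, 20_000, tier='1M'):.2f}")
```

Under these assumptions, filling most of the 1M window costs under two dollars per request, which is the kind of arithmetic that makes long-context agent loops practical to experiment with.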
Why the Confusion Was Inevitable
Looking back, it’s easy to understand why the AI community initially cried “DeepSeek V4.”
The specifications were identical to those rumored for DeepSeek’s upcoming release: 1 trillion parameters, 1 million token context, May 2025 knowledge cutoff. The reasoning patterns felt familiar. The timing, just weeks before DeepSeek V4’s expected April launch, was too perfect.
Add to that the Luo Fuli connection. A researcher who helped build DeepSeek R1 now leading Xiaomi’s AI efforts? The through-line was irresistible.
But the reality is more interesting. Xiaomi, best known for smartphones and electric vehicles, just announced itself as a serious contender in the foundational AI race. The company isn’t just building models; it’s building the infrastructure for agentic AI: systems that don’t just converse but act.
Luo Fuli made a promise: MiMo-V2 series models will be open-sourced when they are stable enough to be worth releasing. For a community still digesting the implications of DeepSeek’s open-weight strategy, that’s a significant commitment.
The Bigger Picture
Stealth model launches are not new. Pony Alpha appeared on OpenRouter in February 2026 and was revealed five days later as part of Zhipu AI’s GLM-5 system. The practice allows developers to gather unbiased feedback, test real-world performance, and build community interest before formal announcements.
What made Hunter Alpha different was the scale of speculation. DeepSeek V4 has been the most anticipated open-source AI release since its predecessor shocked the industry with training costs of just $5.6 million, a fraction of Western competitors’ spending. Any model that remotely resembled it was guaranteed to attract attention.
The Hunter Alpha episode also reveals something about the current state of AI development. The field has become so specialized, so dependent on a relatively small pool of talent, that “reasoning style” has become a kind of fingerprint. Developers can look at how a model thinks and guess who trained it.
Luo Fuli’s fingerprints were all over Hunter Alpha. The difference is that she’s now working for Xiaomi.
What Happens Next
MiMo-V2-Pro is available now via API at platform.xiaomimimo.com, with limited free trials integrated into popular agent frameworks like OpenClaw and Cline.
DeepSeek V4, meanwhile, remains forthcoming. When it arrives, possibly as early as April, it will enter a market that looks different from the one it faced just weeks ago. Xiaomi just served notice that the Chinese AI race is no longer a one-player game.
For developers, this competition is a gift. Two major models with similar architectures, different optimizations, and aggressive pricing will accelerate the entire ecosystem.
For the rest of us, it’s a reminder that in AI, a week is an eternity. A mysterious model appeared. The internet speculated. A company claimed its creation. And the technology keeps moving forward.
The next time a “stealth model” appears on OpenRouter, maybe we’ll wait before guessing its origins. Or maybe we’ll enjoy the mystery while it lasts.