Zhipu AI released a 744-billion-parameter model trained on 100,000 Huawei Ascend chips with zero NVIDIA dependency, then open-sourced it under an MIT license. It is the first frontier AI model to demonstrate that China can build competitive AI without American hardware.
By LDS Team
February 26, 2026
On February 11, 2026, a Beijing-based AI company most Americans have never heard of did something that the entire U.S. semiconductor sanctions regime was designed to prevent. Zhipu AI -- now rebranded as Z.ai -- released GLM-5, a 744-billion-parameter language model that performs within a few percentage points of GPT-5.2 and Claude Opus 4.5 on major benchmarks -- and it was trained entirely on Huawei's Ascend 910B processors. Not a single NVIDIA chip was involved.
Then they open-sourced it under the MIT license. Anyone on Earth can download it, modify it, and deploy it commercially with zero restrictions.
The release landed like a shockwave across the AI industry. Within 24 hours, Zhipu AI's stock on the Hong Kong Stock Exchange surged 28.7%. Within days, the model's weights were being downloaded across HuggingFace and ModelScope by developers worldwide. And in Washington, the announcement reignited a debate that has been simmering since DeepSeek's R1 rattled markets in January 2025: are U.S. chip export controls actually working?
The answer, as of February 2026, is more complicated than anyone in the Commerce Department would like to admit.
What GLM-5 actually is
GLM-5 is a Mixture-of-Experts (MoE) model. That means it contains 744 billion total parameters, but only activates 40 billion of them for any given input -- routing each token through 8 of its 256 specialized expert sub-networks. This architecture, pioneered at scale by models like Mixtral and DeepSeek-V3, allows GLM-5 to deliver frontier-class performance at a fraction of the computational cost of a dense model of equivalent capability.
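To make the routing concrete, here is a minimal sketch of top-k expert gating in the general style MoE models use -- illustrative only, with toy logits, and not a reconstruction of Zhipu's actual router:

```python
import math
import random

def route_token(router_logits, k=8):
    """Pick the top-k experts for one token and softmax their logits
    into mixing weights (illustrative: 256 experts, 8 active)."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over only the selected experts' logits.
    m = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

# Toy example: one token's router scores over 256 experts.
random.seed(0)
logits = [random.gauss(0, 1) for _ in range(256)]
selected = route_token(logits, k=8)
print(len(selected))                          # 8 experts active
print(round(sum(w for _, w in selected), 6))  # mixing weights sum to 1.0
```

Only the feed-forward blocks of the 8 selected experts execute for that token, which is why 744 billion stored parameters cost only 40 billion parameters' worth of compute per token.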
| Specification | GLM-5 |
|---|---|
| Total parameters | 744 billion |
| Active parameters | 40 billion (per token) |
| Architecture | Mixture-of-Experts (256 experts, 8 active) |
| Layers | 80 |
| Context window | 200,000 tokens |
| Max output length | 131,072 tokens |
| Training data | 28.5 trillion tokens |
| Training hardware | 100,000 Huawei Ascend 910B |
| Training framework | MindSpore (Huawei) |
| License | MIT (fully permissive) |
| API pricing (input) | $1.00 / 1M tokens |
| API pricing (output) | $3.20 / 1M tokens |
| Release date | February 11-12, 2026 |
The pricing alone is significant. At $1.00 per million input tokens and $3.20 per million output tokens, GLM-5 is significantly cheaper than GPT-5.2 or Claude Opus 4.5 for API access. For enterprises processing millions of documents daily, that is not a marginal saving -- it is a category shift.
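The arithmetic is simple enough to sketch. Using the list prices from the table above (the workload volumes below are made up for illustration):

```python
def api_cost(input_tokens_m, output_tokens_m,
             in_price=1.00, out_price=3.20):
    """Dollar cost of a workload; token volumes given in millions,
    prices per million tokens (GLM-5 list prices from the spec table)."""
    return input_tokens_m * in_price + output_tokens_m * out_price

# A hypothetical enterprise pushing 500M input / 100M output tokens a day:
daily = api_cost(500, 100)      # 500 * $1.00 + 100 * $3.20
print(round(daily, 2))          # 820.0 dollars per day
```

At those volumes, even a 2-3x price difference against a proprietary API compounds into six figures a year.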
The model supports text, image, video, and audio inputs. It can generate text and images as outputs. It handles documents up to 200,000 tokens -- roughly 500 pages -- in a single context window.
The architecture beneath the benchmarks
GLM-5's technical innovations go beyond scale. It adopts Multi-head Latent Attention (MLA) from DeepSeek-V2, which compresses key-value pairs into a shared latent space to slash memory use during inference. DeepSeek Sparse Attention (DSA) dynamically selects which tokens to attend to, enabling the 200K context window without prohibitive compute costs. And Multi-token Prediction (MTP), using three additional prediction layers, achieves an average acceptance length of 2.76 tokens per step -- nearly tripling the number of tokens produced per decoding step. Together, these techniques partially compensate for the raw performance gap between Huawei's Ascend chips and NVIDIA's latest hardware.
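The memory argument for MLA can be made with back-of-envelope numbers. The sketch below uses the 80-layer figure from the spec table, but the head count, head dimension, and latent dimension are assumed values for illustration, not GLM-5's published configuration:

```python
def kv_cache_gib(seq_len, n_layers, floats_per_token, bytes_per=2):
    """KV-cache size in GiB for one sequence at FP16 (2 bytes/float)."""
    return seq_len * n_layers * floats_per_token * bytes_per / 2**30

# 80 layers is from the spec table; the rest are HYPOTHETICAL dims.
LAYERS, HEADS, HEAD_DIM, LATENT = 80, 128, 128, 512

std = kv_cache_gib(200_000, LAYERS, 2 * HEADS * HEAD_DIM)  # full K and V
mla = kv_cache_gib(200_000, LAYERS, LATENT)                # one shared latent
print(round(std, 1), round(mla, 1), round(std / mla, 1))
# -> 976.6 GiB uncompressed vs 15.3 GiB latent: a 64x reduction
```

Under these assumed dimensions, a full 200K-token KV cache would not fit on any single accelerator, while the latent-compressed cache fits comfortably -- which is the point of MLA.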
How it performs
GLM-5's benchmark results tell a nuanced story. It is genuinely competitive at the frontier -- but it is not uniformly dominant.
| Benchmark | GLM-5 | GPT-5.2 | Claude Opus 4.5 | Gemini 3 Pro |
|---|---|---|---|---|
| SWE-bench Verified | 77.8% | 80.0% | 80.9% | 76.2% |
| AIME 2026 I | 92.7% | -- | 93.3% | 90.6% |
| AIME 2025 I | 88.7% | 100% | -- | -- |
| GPQA-Diamond | 86.0% | 92.4% | 87.0% | 91.9% |
| HLE (with tools) | 50.4% | -- | -- | -- |
| AA Intelligence Index | 50+ | -- | -- | -- |
The standout result is SWE-bench Verified, the industry's benchmark for real-world software engineering. GLM-5's 77.8% makes it the highest-scoring open-source model on this benchmark -- trailing GPT-5.2 (80.0%) and Claude Opus 4.5 (80.9%) by only 2-3 percentage points. For an open-weight model trained on domestic Chinese hardware, that gap is remarkably narrow.
On GPQA-Diamond, a graduate-level science reasoning benchmark, GLM-5's 86.0% is strong but trails GPT-5.2's 92.4% and Gemini 3 Pro's 91.9%. On mathematical reasoning, the picture is mixed. GLM-5 scores 92.7% on AIME 2026 I, a competitive result. But on AIME 2025 I, it scores 88.7% compared to GPT-5.2's perfect 100%. On Humanity's Last Exam (HLE), a benchmark designed to resist saturation by AI systems, GLM-5 with tools scored 50.4% -- a result that, if independently verified, would represent a significant advance.
GLM-5 is also the first open-source model to score above 50 on the Artificial Analysis Intelligence Index v4.0, a composite benchmark that aggregates performance across multiple evaluation suites. That milestone matters because it places an openly available model in territory previously reserved for proprietary systems behind API paywalls.
Worth noting: Independent benchmark verification remains an ongoing concern in the AI industry. Chinese labs have faced scrutiny over benchmark practices, and some researchers have noted that self-reported scores should be treated with caution until reproduced by third parties. As of late February 2026, several independent evaluations are underway but not yet published.
Trained on 100,000 Huawei chips
This is the detail that transforms GLM-5 from an impressive model release into a geopolitical event.
GLM-5 was trained on a cluster of 100,000 Huawei Ascend 910B processors -- chips designed by Huawei's HiSilicon subsidiary and manufactured by Semiconductor Manufacturing International Corporation (SMIC), China's largest chipmaker, using a 7-nanometer process. No NVIDIA GPUs were used at any stage of training. No AMD chips. No Intel accelerators. The entire training stack -- hardware, framework, and infrastructure -- is Chinese.
The Ascend 910B delivers approximately 320 TFLOPS of FP16 performance. For reference, that places it between NVIDIA's A100 (312 TFLOPS) and the H100 (989 TFLOPS FP16 dense). It is a capable chip, but it is not competitive with NVIDIA's current generation on raw throughput.
| Chip | Process | FP16 Performance | Memory | Status |
|---|---|---|---|---|
| Huawei Ascend 910B | SMIC 7nm (DUV) | ~320 TFLOPS | 64GB HBM2e | Used for GLM-5 training |
| NVIDIA A100 | TSMC 7nm | 312 TFLOPS | 80GB HBM2e | Export-banned to China (Oct 2022) |
| NVIDIA H100 | TSMC 4nm | 989 TFLOPS (FP16 dense) | 80GB HBM3 | Export-banned to China (Oct 2023) |
| NVIDIA H200 | TSMC 4nm | 989 TFLOPS (FP16 dense) | 141GB HBM3e | Allowed with conditions (Jan 2026) |
| Huawei Ascend 910C | SMIC 7nm (DUV) | ~800 TFLOPS | 128GB HBM3 | In production, ~30% yield |
What Zhipu AI accomplished was not just the engineering challenge of training a large model. It was the systems-integration challenge of making 100,000 chips -- each individually less powerful than its NVIDIA counterpart -- work together reliably enough to complete a 28.5-trillion-token training run without the kind of failures that would force a restart.
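A rough compute budget shows the cluster was adequate on paper. The sketch below uses the standard ~6ND FLOPs rule of thumb for transformer training (with the 40B active parameters, since MoE compute scales with active parameters); the utilization figure is an assumed value, and the result is a compute-only lower bound that ignores restarts, communication overhead, and the extra MTP layers:

```python
# Back-of-envelope: can 100,000 Ascend 910Bs cover a 28.5T-token run?
ACTIVE_PARAMS = 40e9     # active parameters per token (MoE)
TOKENS        = 28.5e12  # training tokens
CHIP_FLOPS    = 320e12   # Ascend 910B peak FP16, from the chip table
N_CHIPS       = 100_000
MFU           = 0.30     # ASSUMED model-FLOPs utilization

total_flops = 6 * ACTIVE_PARAMS * TOKENS      # ~6.84e24 FLOPs
sustained   = N_CHIPS * CHIP_FLOPS * MFU      # cluster FLOP/s
days        = total_flops / sustained / 86_400
print(round(total_flops / 1e24, 2), round(days, 1))  # ~6.84e24, ~8 days
```

The hard part, in other words, was never raw FLOPs -- it was keeping 100,000 devices synchronized and fault-free long enough to spend them.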
This is where the SMIC manufacturing constraints become relevant. SMIC produces the Ascend 910B using DUV (deep ultraviolet) lithography, not the EUV (extreme ultraviolet) lithography that TSMC and Samsung use for their most advanced nodes. DUV at 7nm requires multi-patterning -- exposing each layer of the chip multiple times -- which reduces yield rates. Industry estimates put SMIC's 7nm yield at 30-50%, compared to TSMC's 90%+ at the same node. Every chip in Zhipu AI's 100,000-unit cluster was produced under these constraints.
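The yield arithmetic is worth spelling out, because it drives the cost gap:

```python
import math

def dies_needed(good_chips, yield_rate):
    """Gross dies to fabricate in order to net a target count of good chips."""
    return math.ceil(good_chips / yield_rate)

# SMIC 7nm DUV yield estimates from industry reporting: 30-50%.
print(dies_needed(100_000, 0.50))  # 200000 dies at the optimistic end
print(dies_needed(100_000, 0.30))  # 333334 dies at the pessimistic end
```

At TSMC-class yields above 90%, the same 100,000 working chips would cost barely 110,000 dies -- roughly a third of the silicon SMIC has to burn.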
The training framework was Huawei's MindSpore, an open-source deep learning platform that serves as China's answer to PyTorch. MindSpore was purpose-built for the Ascend hardware ecosystem, providing the compiler optimizations and distributed training capabilities needed to coordinate 100,000 chips efficiently.
The reinforcement learning pipeline
GLM-5's post-training is built around "Slime," an asynchronous reinforcement learning framework named after the foraging behavior of slime molds. It runs over 1,000 concurrent rollouts in parallel, decoupling generation from training to increase RL throughput by an order of magnitude. The model then passes through three RL stages -- reasoning (math, coding), agentic (tool use, web browsing), and general alignment -- with each stage feeding into the next through on-policy cross-stage distillation.
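The decoupling idea can be illustrated with a toy producer-consumer loop: actors keep generating rollouts into a buffer while the learner trains on whatever has already arrived, so neither side waits for the other. This is a schematic sketch of asynchronous RL in general, not Zhipu's Slime code, and it uses 8 toy actors rather than 1,000 concurrent rollouts:

```python
import queue
import threading

rollouts = queue.Queue(maxsize=64)  # buffer between generation and training
trained = []

def actor(actor_id, n_rollouts):
    """Generation side: in a real system, a full model rollout + reward."""
    for step in range(n_rollouts):
        rollouts.put((actor_id, step, f"trajectory-{actor_id}-{step}"))

def learner(expected):
    """Training side: consumes rollouts as they arrive."""
    while len(trained) < expected:
        traj = rollouts.get()   # blocks until a rollout is ready
        trained.append(traj)    # stand-in for a gradient update

N_ACTORS, PER_ACTOR = 8, 16
threads = [threading.Thread(target=actor, args=(i, PER_ACTOR))
           for i in range(N_ACTORS)]
threads.append(threading.Thread(target=learner,
                                args=(N_ACTORS * PER_ACTOR,)))
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(trained))  # 128 trajectories consumed
```

The throughput win comes from the buffer: slow rollouts no longer stall gradient updates, which is how an asynchronous design can claim an order-of-magnitude gain over lockstep RL.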
The results show in hallucination reduction. GLM-5 scores -1 on the AA-Omniscience Index, where 0 represents perfect calibration between confidence and accuracy. The previous generation model GLM-4.7 scored -36. That cross-generation leap -- from significantly overconfident to nearly perfectly calibrated -- is one of the largest documented gains in hallucination reduction by any lab.
Where DeepSeek failed, Zhipu succeeded
The significance of GLM-5's Huawei-only training becomes clearer in context. DeepSeek, the lab behind R1 that rattled global markets in January 2025, reportedly attempted to train its successor R2 on Huawei Ascend hardware. The effort failed. DeepSeek encountered stability issues that made large-scale Ascend training runs unreliable, and ultimately reverted to NVIDIA GPUs for R2.
If China's most technically accomplished AI lab could not make Huawei hardware work for training at scale, the Ascend ecosystem appeared unready for frontier development. GLM-5's successful training on 100,000 Ascend chips directly contradicts that conclusion.
The contrast extends to geopolitics. The U.S. government has alleged that DeepSeek trained its models on NVIDIA chips obtained in violation of export controls -- specifically, Blackwell GPUs that should never have reached China. DeepSeek denies this. Zhipu AI faces no such questions. Their entire training stack is domestically sourced, and the MIT license means anyone can verify it.
From Tsinghua lab to $40 billion company
Zhipu AI was founded in 2019 as a spinout from Tsinghua University's Knowledge Engineering Group (KEG), one of China's most prestigious AI research labs, by professors Tang Jie and Li Juanzi. CEO Zhang Peng leads commercial operations. The deep Tsinghua ties gave the company access to talent, government relationships, and early funding that pure startups lacked -- the company raised approximately $1.5 billion pre-IPO from Alibaba, Tencent, Meituan, Xiaomi, and Saudi Aramco's venture arm Prosperity7.
On January 8, 2026, Zhipu AI went public on the Hong Kong Stock Exchange (2513.HK), raising $558 million at a $6.5 billion valuation. The offering was oversubscribed 1,159 times. By mid-February, the stock had surged over 500%, pushing market capitalization past $40 billion -- more than six times the IPO valuation. On the day GLM-5 was announced, shares surged 28.7%.
The sanctions question
GLM-5 arrives at a moment when U.S. chip export controls are under more scrutiny than at any point since they were enacted in October 2022. Through successive rounds of restrictions -- banning A100/H100 exports, closing the H800 loophole in October 2023, targeting HBM in December 2024 -- the U.S. sought to keep China a generation behind. Then, on January 14, 2026, the Trump administration partially reversed course, allowing NVIDIA H200 sales to China under strict conditions: case-by-case BIS review, volume caps, mandatory third-party testing, and a 25% surcharge.
GLM-5's release three weeks later provided ammunition for both sides. Hawks argue that sanctions forced China to build domestic alternatives -- and that GLM-5 proves those alternatives now work, accelerating the very outcome sanctions were meant to prevent. Doves argue that if China can train frontier models on domestic hardware regardless, the primary effect has been to shrink NVIDIA's China revenue -- which fell to roughly $17 billion in FY2025 -- while providing minimal strategic benefit.
The data tells its own story. Huawei captured 35-40% of China's AI chip market by late 2025, up from near zero in 2022. China mandated 50% domestic chips in public-sector data centers. Big Fund III is pouring $47.5 billion into the domestic supply chain. Alibaba's Qwen model family overtook Meta's Llama as the most-downloaded on HuggingFace, per Stanford's HAI AI Index. And the next-generation Ascend 910C -- delivering roughly 800 TFLOPS, about 80% of the H100 -- is already in production. China's domestic AI chip ecosystem has crossed a threshold of self-sufficiency it is unlikely to retreat from.
The weaknesses nobody is ignoring
GLM-5 is not an unqualified success. Its inference throughput -- a median of roughly 61 tokens per second, per Artificial Analysis -- is competitive but falls behind the fastest proprietary deployments, and the gap widens on Ascend hardware specifically. On mathematics, it scores 92.7% on AIME 2026 I but only 88.7% on AIME 2025 I, against GPT-5.2's perfect 100%. On Terminal-Bench, a benchmark for autonomous command-line task completion, GLM-5 reportedly underperforms Claude and GPT-5.2 -- a meaningful gap in an era where AI agents are the primary commercial focus.
Some of GLM-5's self-reported scores, particularly on HLE with tools, have not yet been independently verified. And even if Ascend chips can train frontier models, SMIC's 30-50% yield rate at 7nm DUV means Zhipu likely needed 200,000-330,000 total dies for 100,000 working chips -- an inefficiency that translates into higher costs and slower scaling.
What this means for the AI industry
Open-source approaches parity. GLM-5 under MIT license delivers performance within striking distance of GPT-5.2 and Claude Opus 4.5 at significantly lower API pricing. Combined with DeepSeek R2, Llama 4, and Qwen, the open-weight ecosystem is collectively approaching the frontier -- putting real pressure on proprietary providers to justify their premium.
NVIDIA is no longer the only path. GLM-5 proves that frontier models can be trained without American hardware. Any country or organization seeking AI capabilities now has a proof-of-concept using Huawei's Ascend ecosystem. The chips are imperfect, but they work for the most demanding workloads in AI.
China is playing the open-source long game. By releasing GLM-5 under MIT -- even more permissive than Meta's Llama license, which restricts companies above 700 million MAU -- Zhipu AI builds global developer dependence on Chinese-originated technology. The strategic logic is deliberate, and for Western enterprises navigating compliance around Huawei Entity List restrictions and Chinese data governance, it creates an uncomfortable calculation between capability and geopolitics.
The Bottom Line
Three years after the United States imposed semiconductor export controls designed to keep China at least a generation behind in AI capability, a Beijing company trained a frontier model on 100,000 domestically manufactured chips and gave it away for free.
GLM-5 is not the best model in the world on every benchmark. It is slower than its competitors, its math performance trails GPT-5.2, and some of its claimed scores await independent verification. But these caveats miss the point.
The point is that it exists at all.
A model that competes with GPT-5.2 on software engineering benchmarks. Trained entirely on Huawei chips manufactured by SMIC at a 7-nanometer process node that was supposed to be beyond China's reach. Running on a domestic software stack with zero dependency on any American technology. Released under the most permissive open-source license available, for anyone to use.
DeepSeek showed that Chinese labs could build frontier models efficiently. GLM-5 shows they can do it without NVIDIA. That is a fundamentally different statement about the state of global AI competition -- and about the limits of technology denial as a geopolitical strategy.
Whether that matters more to the engineers downloading the model weights tonight or to the policymakers who will have to reconsider their assumptions about semiconductor leverage, the answer is the same: the world changed a little on February 11, 2026, and it is not changing back.
Sources
- Zhipu AI Official Website (Company information, model details)
- GLM-5 Model Documentation -- Zhipu AI (Model specifications, benchmarks, architecture)
- Zhipu AI IPO Prospectus -- HKEX (2513.HK) (IPO details, financials, investor information)
- CNBC: Chinese AI startup Zhipu surges on Hong Kong debut (Jan 8, 2026)
- Bloomberg: US Warns That Using Huawei AI Chip Anywhere Breaks Its Rules (May 2025)
- SMIC Investor Relations (Manufacturing capabilities, financial data)
- Stanford HAI AI Index 2025 (Qwen/Llama download data, China AI ecosystem analysis)
- NVIDIA FY2025 10-K Filing (China revenue impact, export control disclosures)
- U.S. Bureau of Industry and Security -- Semiconductor Export Controls (Sanctions timeline, policy details)
- DeepSeek R1 Technical Report (DeepSeek architecture reference, comparison data)
- Artificial Analysis LLM Leaderboard (Composite benchmark rankings, inference speed data)
- GLM-5 on HuggingFace (zai-org) (Model card, official benchmark table)
- Digitimes: Huawei matches NVIDIA in China AI chip market (Jan 2026)