China's GLM-5: The 744B Open-Source Model Trained Entirely on Huawei Chips

Let's Data Science

Zhipu AI released a 744-billion-parameter model trained on 100,000 Huawei Ascend chips with zero NVIDIA dependency, then open-sourced it under an MIT license. It is the first frontier AI model to demonstrate that China can build competitive frontier AI without American hardware.

By LDS Team

February 26, 2026

On February 11, 2026, a Beijing-based AI company most Americans have never heard of did something that the entire U.S. semiconductor sanctions regime was designed to prevent. Zhipu AI -- now rebranded as Z.ai -- released GLM-5, a 744-billion-parameter language model that performs within single-digit percentage points of GPT-5.2 and Claude Opus 4.5 on major benchmarks -- and it was trained entirely on Huawei's Ascend 910B processors. Not a single NVIDIA chip was involved.

Then they open-sourced it under the MIT license. Anyone on Earth can download it, modify it, and deploy it commercially with zero restrictions.

The release landed like a shockwave across the AI industry. Within 24 hours, Zhipu AI's stock on the Hong Kong Stock Exchange surged 28.7%. Within days, the model's weights were being downloaded across HuggingFace and ModelScope by developers worldwide. And in Washington, the announcement reignited a debate that has been simmering since DeepSeek's R1 rattled markets in January 2025: are U.S. chip export controls actually working?

The answer, as of February 2026, is more complicated than anyone in the Commerce Department would like to admit.

What GLM-5 actually is

GLM-5 is a Mixture-of-Experts (MoE) model. That means it contains 744 billion total parameters, but only activates 40 billion of them for any given input -- routing each token through 8 of its 256 specialized expert sub-networks. This architecture, pioneered at scale by models like Mixtral and DeepSeek-V3, allows GLM-5 to deliver frontier-class performance at a fraction of the computational cost of a dense model of equivalent capability.
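The routing step described above can be sketched in a few lines. This is an illustrative top-k softmax gate in NumPy, using the article's 256-expert, 8-active configuration; the router design and all dimensions here are toy assumptions, not GLM-5's actual implementation.

```python
import numpy as np

def moe_route(token_embedding, gate_weights, top_k=8):
    """Pick the top-k experts for one token and renormalize their gate scores.

    Illustrative sketch of MoE routing (256 experts, 8 active per token);
    not GLM-5's real code.
    """
    logits = gate_weights @ token_embedding       # one score per expert
    top = np.argsort(logits)[-top_k:]             # indices of the k highest-scoring experts
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                          # softmax over the selected experts only
    return top, probs

rng = np.random.default_rng(0)
d_model, n_experts = 64, 256                      # toy sizes for illustration
experts, weights = moe_route(rng.normal(size=d_model),
                             rng.normal(size=(n_experts, d_model)))
print(len(experts))   # → 8  (8 of 256 experts are active for this token)
```

Only the selected experts run a forward pass, which is why a 744B-parameter model can compute like a 40B one.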

| Specification | GLM-5 |
| --- | --- |
| Total parameters | 744 billion |
| Active parameters | 40 billion (per token) |
| Architecture | Mixture-of-Experts (256 experts, 8 active) |
| Layers | 80 |
| Context window | 200,000 tokens |
| Max output length | 131,072 tokens |
| Training data | 28.5 trillion tokens |
| Training hardware | 100,000 Huawei Ascend 910B |
| Training framework | MindSpore (Huawei) |
| License | MIT (fully permissive) |
| API pricing (input) | $1.00 / 1M tokens |
| API pricing (output) | $3.20 / 1M tokens |
| Release date | February 11-12, 2026 |

The pricing alone is significant. At $1.00 per million input tokens and $3.20 per million output tokens, GLM-5 is significantly cheaper than GPT-5.2 or Claude Opus 4.5 for API access. For enterprises processing millions of documents daily, that is not a marginal savings -- it is a category shift.
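The scale of that savings is easy to quantify with the published rates. A back-of-envelope sketch, where the per-document token counts are hypothetical workload assumptions, not figures from Zhipu:

```python
def monthly_cost(docs_per_day, tokens_in_per_doc, tokens_out_per_doc,
                 price_in, price_out, days=30):
    """API spend in USD, with prices quoted per 1M tokens (the article's units)."""
    tokens_in = docs_per_day * tokens_in_per_doc * days
    tokens_out = docs_per_day * tokens_out_per_doc * days
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000

# Hypothetical workload: 1M documents/day, ~2,000 input and ~500 output tokens each,
# at GLM-5's published $1.00 / $3.20 per-million rates.
glm5 = monthly_cost(1_000_000, 2_000, 500, 1.00, 3.20)
print(f"GLM-5: ${glm5:,.0f}/month")   # → GLM-5: $108,000/month
```

At that volume, every extra dollar per million tokens on a competing API adds tens of thousands of dollars a month.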

The model supports text, image, video, and audio inputs. It can generate text and images as outputs. It handles documents up to 200,000 tokens -- roughly 500 pages -- in a single context window.
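The "roughly 500 pages" figure checks out under common rules of thumb. A minimal sanity check, where the words-per-token and words-per-page ratios are rough industry conventions, not numbers from Zhipu:

```python
# ~0.75 English words per token and ~300 words per printed page
# are rough conventions used for back-of-envelope estimates.
tokens = 200_000
words = tokens * 0.75
pages = words / 300
print(pages)   # → 500.0
```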

The architecture beneath the benchmarks

GLM-5's technical innovations go beyond simply being large. It adopts Multi-head Latent Attention (MLA) from DeepSeek-V2, which compresses key-value pairs into a latent space to slash memory during inference. DeepSeek Sparse Attention (DSA) dynamically selects which tokens to attend to, enabling the 200K context window without prohibitive compute costs. And Multi-token Prediction (MTP), using three additional prediction layers, achieves an average acceptance length of 2.76 tokens per step -- effectively tripling output generation speed per forward pass. Together, these techniques partially compensate for the raw performance gap between Huawei's Ascend chips and NVIDIA's latest hardware.
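The memory win from MLA comes from caching a small latent vector per token instead of full-width keys and values. A toy NumPy sketch of that compression idea, with illustrative projection sizes that are not GLM-5's real dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, d_latent = 1024, 4096, 512     # toy shapes for illustration

h = rng.normal(size=(seq_len, d_model)).astype(np.float32)       # hidden states
W_down = rng.normal(size=(d_model, d_latent)).astype(np.float32) # shared down-projection
W_up_k = rng.normal(size=(d_latent, d_model)).astype(np.float32) # key up-projection

latent_kv = h @ W_down       # this is what the KV cache stores: seq_len x 512
k = latent_kv @ W_up_k       # full-width keys are reconstructed at attention time

cache_full = h.nbytes        # what a vanilla cache would hold (per K or V tensor)
cache_mla = latent_kv.nbytes
print(f"cache reduced {cache_full / cache_mla:.0f}x")   # → cache reduced 8x
```

The real architecture adds details this sketch omits (separate value up-projections, rotary-embedding handling), but the cache-size arithmetic is the core of the trick.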

How it performs

GLM-5's benchmark results tell a nuanced story. It is genuinely competitive at the frontier -- but it is not uniformly dominant.

| Benchmark | GLM-5 | GPT-5.2 | Claude Opus 4.5 | Gemini 3 Pro |
| --- | --- | --- | --- | --- |
| SWE-bench Verified | 77.8% | 80.0% | 80.9% | 76.2% |
| AIME 2026 I | 92.7% | -- | 93.3% | 90.6% |
| AIME 2025 I | 88.7% | 100% | -- | -- |
| GPQA-Diamond | 86.0% | 92.4% | 87.0% | 91.9% |
| HLE (with tools) | 50.4% | -- | -- | -- |
| AA Intelligence Index | 50+ | -- | -- | -- |

The standout result is SWE-bench Verified, the industry's benchmark for real-world software engineering. GLM-5's 77.8% makes it the highest-scoring open-source model on this benchmark -- trailing GPT-5.2 (80.0%) and Claude Opus 4.5 (80.9%) by only 2-3 percentage points. For an open-weight model trained on domestic Chinese hardware, that gap is remarkably narrow.

On GPQA-Diamond, a graduate-level science reasoning benchmark, GLM-5's 86.0% is strong but trails GPT-5.2's 92.4% and Gemini 3 Pro's 91.9%. On mathematical reasoning, the picture is mixed. GLM-5 scores 92.7% on AIME 2026 I, a competitive result. But on AIME 2025 I, it scores 88.7% compared to GPT-5.2's perfect 100%. On Humanity's Last Exam (HLE), a benchmark specifically designed to resist AI performance, GLM-5 with tools scored 50.4% -- a result that, if independently verified, would represent a significant advance.

GLM-5 is also the first open-source model to score above 50 on the Artificial Analysis Intelligence Index v4.0, a composite benchmark that aggregates performance across multiple evaluation suites. That milestone matters because it places an openly available model in territory previously reserved for proprietary systems behind API paywalls.

Worth noting: Independent benchmark verification remains an ongoing concern in the AI industry. Chinese labs have faced scrutiny over benchmark practices, and some researchers have noted that self-reported scores should be treated with caution until reproduced by third parties. As of late February 2026, several independent evaluations are underway but not yet published.

Trained on 100,000 Huawei chips

This is the detail that transforms GLM-5 from an impressive model release into a geopolitical event.

GLM-5 was trained on a cluster of 100,000 Huawei Ascend 910B processors -- chips designed by Huawei's HiSilicon subsidiary and manufactured by Semiconductor Manufacturing International Corporation (SMIC), China's largest chipmaker, using a 7-nanometer process. No NVIDIA GPUs were used at any stage of training. No AMD chips. No Intel accelerators. The entire training stack -- hardware, framework, and infrastructure -- is Chinese.

The Ascend 910B delivers approximately 320 TFLOPS of FP16 performance. For reference, that places it between NVIDIA's A100 (312 TFLOPS) and the H100 (989 TFLOPS FP16 dense). It is a capable chip, but it is not competitive with NVIDIA's current generation on raw throughput.

| Chip | Process | FP16 Performance | Memory | Status |
| --- | --- | --- | --- | --- |
| Huawei Ascend 910B | SMIC 7nm (DUV) | ~320 TFLOPS | 64GB HBM2e | Used for GLM-5 training |
| NVIDIA A100 | TSMC 7nm | 312 TFLOPS | 80GB HBM2e | Export-banned to China (Oct 2022) |
| NVIDIA H100 | TSMC 4nm | 989 TFLOPS (dense) | 80GB HBM3 | Export-banned to China (Oct 2023) |
| NVIDIA H200 | TSMC 4nm | 989 TFLOPS (dense) | 141GB HBM3e | Allowed with conditions (Jan 2026) |
| Huawei Ascend 910C | SMIC 7nm (DUV) | ~800 TFLOPS | 128GB HBM3 | In production, ~30% yield |

What Zhipu AI accomplished was not just an engineering challenge of training a large model. It was a systems integration challenge of making 100,000 chips -- each individually less powerful than their NVIDIA counterparts -- work together reliably enough to complete a training run of 28.5 trillion tokens without the kind of failures that would force a restart.

This is where the SMIC manufacturing constraints become relevant. SMIC produces the Ascend 910B using DUV (deep ultraviolet) lithography, not the EUV (extreme ultraviolet) lithography that TSMC and Samsung use for their most advanced nodes. DUV at 7nm requires multi-patterning -- exposing each layer of the chip multiple times -- which reduces yield rates. Industry estimates put SMIC's 7nm yield at 30-50%, compared to TSMC's 90%+ at the same node. Every chip in Zhipu AI's 100,000-unit cluster was produced under these constraints.

The training framework was Huawei's MindSpore, an open-source deep learning platform that serves as China's answer to PyTorch. MindSpore was purpose-built for the Ascend hardware ecosystem, providing the compiler optimizations and distributed training capabilities needed to coordinate 100,000 chips efficiently.

The reinforcement learning pipeline

GLM-5's post-training is built around "Slime," an asynchronous reinforcement learning framework named after the foraging behavior of slime molds. It runs over 1,000 concurrent rollouts in parallel, decoupling generation from training to increase RL throughput by an order of magnitude. The model then passes through three RL stages -- reasoning (math, coding), agentic (tool use, web browsing), and general alignment -- with each stage feeding into the next through on-policy cross-stage distillation.
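The decoupling at the heart of that design can be sketched with a queue: rollout workers generate trajectories continuously while the trainer consumes them whenever data is ready, so neither side blocks on the other. This is a minimal illustration of the pattern only; Slime's actual architecture, scale, and names are far beyond this toy.

```python
import queue
import threading

rollouts = queue.Queue(maxsize=64)   # buffer decouples generation from training

def rollout_worker(worker_id, n=8):
    """Generate trajectories continuously; never waits for a training step."""
    for step in range(n):
        trajectory = {"worker": worker_id, "reward": step * 0.1}  # stand-in rollout
        rollouts.put(trajectory)

def trainer(n_expected, batch_size=4):
    """Consume trajectories in batches as they arrive."""
    seen = 0
    while seen < n_expected:
        batch = [rollouts.get() for _ in range(batch_size)]
        seen += len(batch)
        # a gradient update on `batch` would happen here
    return seen

workers = [threading.Thread(target=rollout_worker, args=(i,)) for i in range(4)]
for w in workers:
    w.start()
print(trainer(32))   # → 32  (trajectories trained on, generated concurrently)
for w in workers:
    w.join()
```

Because generation dominates wall-clock time in RL for language models, letting it run asynchronously is where the claimed order-of-magnitude throughput gain comes from.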

The results show in hallucination reduction. GLM-5 scores -1 on the AA-Omniscience Index, where 0 represents perfect calibration between confidence and accuracy. The previous generation model GLM-4.7 scored -36. That cross-generation leap -- from significantly overconfident to nearly perfectly calibrated -- is one of the largest documented gains in hallucination reduction by any lab.

Where DeepSeek failed, Zhipu succeeded

The significance of GLM-5's Huawei-only training becomes clearer in context. DeepSeek, the lab behind R1 that rattled global markets in January 2025, reportedly attempted to train its successor R2 on Huawei Ascend hardware. The effort failed. DeepSeek encountered stability issues that made large-scale Ascend training runs unreliable, and ultimately reverted to NVIDIA GPUs for R2.

If China's most technically accomplished AI lab could not make Huawei hardware work for training at scale, the Ascend ecosystem appeared unready for frontier development. GLM-5's successful training on 100,000 Ascend chips directly contradicts that conclusion.

The contrast extends to geopolitics. The U.S. government has alleged that DeepSeek trained its models on NVIDIA chips obtained in violation of export controls -- specifically, Blackwell GPUs that should never have reached China. DeepSeek denies this. Zhipu AI faces no such questions. Their entire training stack is domestically sourced, and the MIT license means anyone can verify it.

From Tsinghua lab to $40 billion company

Zhipu AI was founded in 2019 as a spinout from Tsinghua University's Knowledge Engineering Group (KEG), one of China's most prestigious AI research labs, by professors Tang Jie and Li Juanzi. CEO Zhang Peng leads commercial operations. The deep Tsinghua ties gave the company access to talent, government relationships, and early funding that pure startups lacked -- the company raised approximately $1.5 billion pre-IPO from Alibaba, Tencent, Meituan, Xiaomi, and Saudi Aramco's venture arm Prosperity7.

On January 8, 2026, Zhipu AI went public on the Hong Kong Stock Exchange (2513.HK), raising $558 million at a $6.5 billion valuation. The offering was oversubscribed 1,159 times. By mid-February, the stock had surged over 500%, pushing market capitalization past $40 billion -- more than six times the IPO valuation. On the day GLM-5 was announced, shares surged 28.7%.

The timeline

2019
Zhipu AI Founded
Spun out of Tsinghua University's Knowledge Engineering Group by professors Tang Jie and Li Juanzi. Zhang Peng becomes CEO. Initial focus on knowledge graph technology and NLP research.
October 2022
U.S. Chip Sanctions Begin
The Biden administration imposes sweeping export controls on advanced semiconductors to China. NVIDIA's A100 and H100 GPUs are banned. Chinese AI labs begin scrambling for alternatives.
October 2023
Sanctions Tighten
The U.S. closes the H800/A800 loophole -- NVIDIA's China-specific chips designed to comply with initial restrictions. Huawei's Ascend chips become the primary domestic alternative.
May 2024
China Launches Big Fund III
China establishes a $47.5 billion semiconductor investment fund -- the largest in the country's history -- to accelerate domestic chip production and reduce dependence on foreign technology.
January 27, 2025
The DeepSeek Shock
DeepSeek releases R1, a reasoning model competitive with OpenAI's o1. NVIDIA loses $589 billion in market capitalization in a single day. The event forces a global reassessment of China's AI capabilities.
January 8, 2026
Zhipu AI IPOs on HKEX
Zhipu AI lists on the Hong Kong Stock Exchange (2513.HK), raising $558 million at a $6.5 billion valuation. The offering is oversubscribed 1,159 times.
January 2026
Trump Reverses Course on Chip Exports
The Trump administration partially eases Biden-era chip export controls, allowing NVIDIA H200 GPU sales to China under strict conditions including case-by-case review and a 25% surcharge. The policy shift comes weeks before GLM-5 demonstrates that China may no longer need them.
February 11-12, 2026
GLM-5 Released
Zhipu AI releases GLM-5 -- a 744B-parameter model trained entirely on Huawei hardware -- under MIT license. The model performs within single-digit percentage points of GPT-5.2 and Claude Opus 4.5 on major benchmarks. Zhipu's stock surges 28.7%.

The sanctions question

GLM-5 arrives at a moment when U.S. chip export controls are under more scrutiny than at any point since they were enacted in October 2022. Through successive rounds of restrictions -- banning A100/H100 exports, closing the H800 loophole in October 2023, targeting HBM in December 2024 -- the U.S. sought to keep China a generation behind. Then, on January 14, 2026, the Trump administration partially reversed course, allowing NVIDIA H200 sales to China under strict conditions: case-by-case BIS review, volume caps, mandatory third-party testing, and a 25% surcharge.

GLM-5's release three weeks later provided ammunition for both sides. Hawks argue that sanctions forced China to build domestic alternatives -- and that GLM-5 proves those alternatives now work, accelerating the very outcome sanctions were meant to prevent. Doves argue that if China can train frontier models on domestic hardware regardless, the primary effect has been to shrink NVIDIA's China revenue -- which fell to roughly $17 billion in FY2025 -- while providing minimal strategic benefit.

The data tells its own story. Huawei captured 35-40% of China's AI chip market by late 2025, up from near zero in 2022. China mandated 50% domestic chips in public-sector data centers. Big Fund III is pouring $47.5 billion into the domestic supply chain. Alibaba's Qwen model family overtook Meta's Llama as the most-downloaded on HuggingFace, per Stanford's HAI AI Index. And the next-generation Ascend 910C -- delivering roughly 800 TFLOPS, about 80% of the H100 -- is already in production. China's domestic AI chip ecosystem has crossed a threshold of self-sufficiency it is unlikely to retreat from.

The weaknesses nobody is ignoring

GLM-5 is not an unqualified success. Its inference throughput -- a median of roughly 61 tokens per second, according to Artificial Analysis -- falls behind the fastest proprietary deployments, and the gap widens on Ascend hardware specifically. On mathematics, while scoring 92.7% on AIME 2026 I, it manages only 88.7% on AIME 2025 I compared to GPT-5.2's perfect 100%. On Terminal-Bench, a benchmark for autonomous command-line task completion, GLM-5 reportedly underperforms against Claude and GPT-5.2 -- a meaningful gap in an era where AI agents are the primary commercial focus.

Some of GLM-5's self-reported scores, particularly on HLE with tools, have not yet been independently verified. And even if Ascend chips can train frontier models, SMIC's 30-50% yield rate at 7nm DUV means Zhipu likely needed 200,000-330,000 total dies for 100,000 working chips -- an inefficiency that translates into higher costs and slower scaling.
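That die-count arithmetic is worth making explicit. A quick sketch using the yield figures cited above; actual wafer economics are more complicated, but the first-order math is simple:

```python
import math

def dies_needed(good_chips, yield_rate):
    """Dies that must be fabricated to net `good_chips` working units at a given yield."""
    return math.ceil(good_chips / yield_rate)

for y in (0.30, 0.50, 0.90):
    print(f"yield {y:.0%}: {dies_needed(100_000, y):,} dies fabricated")
# → yield 30%: 333,334 dies fabricated
# → yield 50%: 200,000 dies fabricated
# → yield 90%: 111,112 dies fabricated  (TSMC-class yield, for contrast)
```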

What this means for the AI industry

Open-source approaches parity. GLM-5 under MIT license delivers performance within striking distance of GPT-5.2 and Claude Opus 4.5 at significantly lower API pricing. Combined with DeepSeek R2, Llama 4, and Qwen, the open-weight ecosystem is collectively approaching the frontier -- putting real pressure on proprietary providers to justify their premium.

NVIDIA is no longer the only path. GLM-5 proves that frontier models can be trained without American hardware. Any country or organization seeking AI capabilities now has a proof-of-concept using Huawei's Ascend ecosystem. The chips are imperfect, but they work for the most demanding workloads in AI.

China is playing the open-source long game. By releasing GLM-5 under MIT -- even more permissive than Meta's Llama license, which restricts companies above 700 million MAU -- Zhipu AI builds global developer dependence on Chinese-originated technology. The strategic logic is deliberate, and for Western enterprises navigating compliance around Huawei Entity List restrictions and Chinese data governance, it creates an uncomfortable calculation between capability and geopolitics.

The Bottom Line

Three years after the United States imposed semiconductor export controls designed to keep China at least a generation behind in AI capability, a Beijing company trained a frontier model on 100,000 domestically manufactured chips and gave it away for free.

GLM-5 is not the best model in the world on every benchmark. It is slower than its competitors, its math performance trails GPT-5.2, and some of its claimed scores await independent verification. But these caveats miss the point.

The point is that it exists at all.

A model that competes with GPT-5.2 on software engineering benchmarks. Trained entirely on Huawei chips manufactured by SMIC at a 7-nanometer process node that was supposed to be beyond China's reach. Running on a domestic software stack with zero dependency on any American technology. Released under the most permissive open-source license available, for anyone to use.

DeepSeek showed that Chinese labs could build frontier models efficiently. GLM-5 shows they can do it without NVIDIA. That is a fundamentally different statement about the state of global AI competition -- and about the limits of technology denial as a geopolitical strategy.

Whether that matters more to the engineers downloading the model weights tonight or to the policymakers who will have to reconsider their assumptions about semiconductor leverage, the answer is the same: the world changed a little on February 11, 2026, and it is not changing back.
