NVIDIA Just Shipped the Most Powerful AI Chip Ever Made


The company reported a record $68.1 billion quarter, shipped its first Vera Rubin samples to customers, and gave guidance suggesting the AI infrastructure boom is still accelerating.

By LDS Team

February 25, 2026

On February 25, 2026, NVIDIA did two things that would have been unthinkable five years ago. First, it reported $68.1 billion in revenue for a single quarter -- the largest in the history of the semiconductor industry. Then, almost as a footnote on the earnings call, CFO Colette Kress mentioned something that mattered even more: "We shipped our first Vera Rubin samples to customers earlier this week."

Vera Rubin is NVIDIA's next-generation AI chip platform. It is the successor to Blackwell, which has been the engine behind the current AI infrastructure boom. If Blackwell was the chip that proved AI could be a hundred-billion-dollar business, Vera Rubin is the one NVIDIA is betting will make it a trillion-dollar one.

The timing is deliberate. NVIDIA's GTC 2026 conference kicks off March 16 in San Jose. CEO Jensen Huang has promised to unveil something that will "surprise the world." The samples shipping this week are a preview -- a signal to the market, to competitors, and to every company building AI infrastructure: the next generation is here.

What Vera Rubin Actually Is

Vera Rubin is not just a GPU. It is a six-chip platform -- the most complex NVIDIA has ever built.

The platform is named after Vera Florence Cooper Rubin (1928-2016), the American astronomer who provided the first strong observational evidence for dark matter by studying galaxy rotation curves in the 1970s. NVIDIA has a tradition of naming GPU architectures after scientists -- from Tesla and Fermi to Ada Lovelace and Grace Hopper. Rubin continues the recent trend of honoring women who transformed their fields.

The six components:

  • Rubin GPU -- The compute engine. Built on TSMC's 3nm process, with 336 billion transistors packed across two reticle-sized compute chiplets and two I/O dies.
  • Vera CPU -- An 88-core custom Arm processor designed specifically to pair with the Rubin GPU. It is the first CPU to natively support FP8 precision.
  • NVLink 6 -- The interconnect fabric, delivering 3.6 TB/s per GPU -- enough bandwidth to make dozens of GPUs behave like a single massive processor.
  • ConnectX-9 SuperNIC, BlueField-4 DPU, NVLink 6 Switch, and Spectrum-6 Ethernet Switch -- The networking components that tie the system together at rack scale.

When assembled into NVIDIA's NVL72 configuration -- 72 Rubin GPUs and 36 Vera CPUs in a single rack -- the system delivers 3.6 exaflops of FP4 compute and 260 TB/s of internal bandwidth. NVIDIA says that bandwidth figure exceeds the entire internet's current capacity.

CNBC, which received an exclusive first look at the physical hardware, reported that each Vera Rubin system contains 1.3 million components from more than 80 suppliers across 20 countries.

Worth noting: "Vera Rubin" refers to the combined CPU-GPU superchip. "Vera" is the CPU. "Rubin" is the GPU. They connect via NVLink-C2C at 1.8 TB/s -- double the bandwidth of the previous Grace Blackwell pairing.

The Specs

Here is what NVIDIA has revealed, compared to the current-generation Blackwell B200:

| Spec | Vera Rubin | Blackwell (B200) | Improvement |
|---|---|---|---|
| Inference (FP4) | 50 PFLOPS | ~10 PFLOPS | 5x |
| Training (FP4) | 35 PFLOPS | ~10 PFLOPS | 3.5x |
| Memory | 288 GB HBM4 | 192 GB HBM3e | 1.5x |
| Memory Bandwidth | 22 TB/s | 8 TB/s | 2.8x |
| Transistors | 336 billion | 208 billion | 1.6x |
| NVLink Bandwidth | 3.6 TB/s per GPU | 1.8 TB/s per GPU | 2x |
| Process Node | TSMC 3nm (N3P) | TSMC 4nm | -- |
| TDP (reported) | ~2,300W | 1,200W | -- |

The Vera CPU brings 88 custom "Olympus" Arm cores with 176 threads via Spatial Multithreading, up to 1.5 TB of LPDDR5X memory, and 1.2 TB/s of memory bandwidth. Its performance is roughly double the Grace CPU it replaces.

In the full NVL72 rack -- 72 GPUs, 36 CPUs, one enclosure -- the system delivers 3.6 exaflops of FP4 compute, 20.7 TB of total HBM4 memory, and 260 TB/s of scale-up bandwidth. NVIDIA claims up to 10x lower cost per token and 4x fewer GPUs needed for mixture-of-experts training compared to Blackwell.
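The rack-level figures follow directly from the per-GPU specs. A quick sanity check (per-GPU numbers taken from the spec table above, treated as exact):

```python
# Sanity-check the NVL72 rack figures against the per-GPU Vera Rubin specs.
GPUS_PER_RACK = 72

inference_pflops_per_gpu = 50   # FP4 inference, per GPU
hbm4_gb_per_gpu = 288           # HBM4 capacity, per GPU
nvlink_tbps_per_gpu = 3.6       # NVLink 6 bandwidth, per GPU

rack_exaflops = GPUS_PER_RACK * inference_pflops_per_gpu / 1000
rack_hbm_tb = GPUS_PER_RACK * hbm4_gb_per_gpu / 1000
rack_bandwidth_tbps = GPUS_PER_RACK * nvlink_tbps_per_gpu

print(f"FP4 compute:   {rack_exaflops:.1f} exaflops")    # 3.6
print(f"Total HBM4:    {rack_hbm_tb:.1f} TB")            # 20.7
print(f"Scale-up b/w:  {rack_bandwidth_tbps:.0f} TB/s")  # ~259
```

The compute and memory totals match NVIDIA's stated figures exactly; the bandwidth product comes to 259.2 TB/s, which NVIDIA rounds to 260 TB/s.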

Worth noting: The reported ~2,300-watt TDP per GPU (per analyst estimates; NVIDIA has not officially confirmed this figure) is nearly double Blackwell's. Data centers will need significant infrastructure upgrades to run Vera Rubin at scale. NVIDIA claims the system-level efficiency improvements offset the raw power increase, but the absolute power draw is a real constraint for deployment.

The Biggest Quarter in Semiconductor History

The Vera Rubin sample shipment came on the same day NVIDIA reported financial results that broke its own records.

| Metric | Q4 FY2026 | Year-Over-Year |
|---|---|---|
| Revenue | $68.1 billion | +73% |
| Data Center Revenue | $62.3 billion | +75% |
| Net Income | $43.0 billion | ~+94% |
| EPS (adjusted) | $1.62 | +82% |
| Q1 FY2027 Guidance | $78.0 billion | Beat estimates by $5.4B |

For the full fiscal year 2026 (ended January 2026), NVIDIA reported $215.9 billion in total revenue, up 65% year-over-year. Data center alone accounted for $193.7 billion -- roughly 90% of the total. Hyperscalers represent just over half of data center revenue.

The quarterly acceleration is striking. Q1: $44.1 billion. Q2: $46.7 billion. Q3: $57.0 billion. Q4: $68.1 billion. Each quarter larger than the last, with no sign of deceleration. The Q1 FY2027 guidance of $78 billion -- beating analyst estimates by $5.4 billion -- suggests the trend is continuing.
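The sequential growth behind those quarterly figures is easy to verify (revenue in billions, from the paragraph above, including the guided Q1 FY2027 number):

```python
# Quarter-over-quarter revenue growth across NVIDIA's FY2026,
# plus the guided first quarter of FY2027.
quarters = {
    "Q2 FY26": (44.1, 46.7),
    "Q3 FY26": (46.7, 57.0),
    "Q4 FY26": (57.0, 68.1),
    "Q1 FY27 (guided)": (68.1, 78.0),
}

for label, (prev, curr) in quarters.items():
    growth = (curr / prev - 1) * 100
    print(f"{label}: ${curr:.1f}B ({growth:+.1f}% QoQ)")
```

Every quarter shows double-digit sequential growth from Q3 onward, which is the pattern the article describes; the absolute dollar increase each quarter is also holding near $10 billion.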

NVIDIA's market capitalization sits at approximately $4.7 trillion, making it the most valuable company in the world. Its order backlog exceeds $500 billion and continues to grow as customers place full-year orders for Vera Rubin.

Huang framed the economics bluntly on the earnings call: "Compute is revenues. Without compute, there is no way to generate tokens. Without tokens, there's no way to grow revenues."

Kress added: "We expect every cloud model builder to deploy Vera Rubin."

The Road to Vera Rubin

March 18, 2025
GTC 2025: The Reveal
Jensen Huang shows the physical Vera Rubin Superchip for the first time. He announces the full roadmap -- Rubin in H2 2026, Rubin Ultra in H2 2027, Feynman in 2028 -- and says reasoning and agentic AI have created "easily 100 times more" compute demand than expected a year prior.
Late 2025
Tape-Out and Fabrication
Both the Rubin GPU and Vera CPU complete tape-out and enter TSMC's 3nm fabrication line. SK Hynix begins ramping HBM4 memory production, though NVIDIA's decision to raise per-pin speed requirements above 11 Gbps pushes the HBM4 capacity ramp from Q2 to Q3 2026.
January 5, 2026
CES 2026: "In Full Production"
Jensen Huang announces that Vera Rubin is "in full production" at TSMC. The NVL72 rack system is officially launched. NVIDIA also reveals Rubin Ultra details: NVL576 racks with 576 GPUs delivering 15 exaflops of FP4 compute, due H2 2027.
February 25, 2026
First Samples Ship
CFO Colette Kress confirms on the Q4 earnings call that NVIDIA shipped its first Vera Rubin samples to customers earlier that week. The company reports record quarterly revenue of $68.1 billion. Production shipments remain on track for H2 2026.
March 16, 2026
GTC 2026: "Surprise the World"
NVIDIA's annual GPU Technology Conference opens in San Jose. Jensen Huang delivers the keynote. He has promised to unveil a chip that will "surprise the world." Over 700 sessions are planned across four days.

Everyone Wants One

The list of confirmed Vera Rubin deployment partners reads like a directory of the world's most valuable technology companies.

Cloud providers (first wave, H2 2026): AWS, Google Cloud, Microsoft Azure, Oracle Cloud Infrastructure, CoreWeave, Lambda, Nebius, and Nscale.

AI labs: Meta has committed to deploying "millions of Blackwell and Rubin GPUs." Anthropic will train and run inference on Vera Rubin systems. OpenAI, xAI, Mistral AI, Cohere, and Perplexity are all expected adopters.

Infrastructure partners: Dell, HPE, Lenovo, Cisco, and Supermicro will build server systems around the platform.

Microsoft is planning deployments of "hundreds of thousands of Vera Rubin Superchips" across its Fairwater AI superfactory sites.

The spending commitments behind these deployments are staggering. The four largest hyperscalers have collectively guided for $635-665 billion in capital expenditure for 2026:

| Company | 2026 CapEx (Guided) |
|---|---|
| Amazon | ~$200 billion |
| Alphabet/Google | $175-185 billion |
| Microsoft | ~$145 billion (analyst estimate) |
| Meta | $115-135 billion |

That is a 67-74% increase over 2025 levels. Roughly three-quarters of it -- around $450 billion -- is directly tied to AI infrastructure: servers, GPUs, and data centers.
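Those aggregates can be reproduced from the per-company guidance in the table, taking midpoints where a range was given. The ~$450 billion AI-tied figure then works out to just under 70% of the midpoint total, consistent with the article's "roughly three-quarters":

```python
# Reconstruct the hyperscaler capex totals from per-company 2026 guidance.
# Midpoints are used for guided ranges; figures in billions of USD.
capex_2026 = {
    "Amazon": 200.0,
    "Alphabet/Google": (175 + 185) / 2,  # midpoint of $175-185B
    "Microsoft": 145.0,                  # analyst estimate
    "Meta": (115 + 135) / 2,             # midpoint of $115-135B
}

total = sum(capex_2026.values())
implied_ai_share = 450 / total  # the ~$450B AI-tied estimate cited above

print(f"Midpoint total 2026 capex: ~${total:.0f}B")
print(f"Implied AI-tied share:     ~{implied_ai_share:.0%}")
```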

Worth noting: Amazon's AI infrastructure spending is so aggressive that analysts project the company will run negative free cash flow of $17-28 billion in 2026. The hyperscalers are increasingly turning to debt markets to fund AI capex, transforming what were historically cash-rich businesses into leveraged ones. The question of whether this spending generates proportional returns is one nobody can answer yet.

The Competition Is Real

NVIDIA holds an estimated 86-95% of the AI training chip market depending on the measure. But for the first time, credible alternatives are emerging on multiple fronts.

AMD is closest. At CES 2026, AMD unveiled Helios -- its direct rack-scale competitor to the NVL72. The MI400 series chips inside it feature 432 GB of HBM4 memory (50% more than Rubin's 288 GB), 19.6 TB/s bandwidth, and 40 PFLOPS of FP4 compute. AMD is targeting Helios shipments for H2 2026 -- potentially before Vera Rubin reaches volume production. Oracle has committed to 50,000 MI450 series chips, and OpenAI has partnered with AMD on a 6-gigawatt data center and computing infrastructure deal valued at more than $90 billion.

Custom silicon is the bigger threat. Every major hyperscaler is now building its own AI chips:

| Company | Custom Chip | Status |
|---|---|---|
| Google | TPU Trillium (v6) | Generally available. Anthropic signed the largest TPU deal in Google's history. |
| Amazon | Trainium2 / Trainium3 | Trainium2 deployed (~500K chips). Trainium3 ramping early 2026. |
| Microsoft | Maia 200 | Announced January 2026 on TSMC 3nm. Claims 3x Trainium3 inference performance. |
| Meta | MTIA v2 / v3 | v3 due H2 2026. Targets 35%+ of Meta's inference fleet on custom silicon by year-end. |

Custom ASIC shipments are projected to grow 44.6% in 2026, versus 16.1% growth for GPUs. Analysts project custom AI server ASICs could surpass GPU shipments by 2028.
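The 2028 crossover projection follows from compounding those growth rates. A toy model (the starting ratios of ASIC-to-GPU unit volume are illustrative assumptions, and holding the 44.6% / 16.1% rates constant is a simplification; real growth will vary year to year):

```python
import math

# Projected 2026 shipment growth rates, held constant for illustration.
asic_growth, gpu_growth = 0.446, 0.161

# ASIC shipments gain on GPUs by this factor every year.
relative_growth = (1 + asic_growth) / (1 + gpu_growth)  # ~1.245x per year

# If ASICs start at some fraction of GPU unit volume, years until parity:
for start_ratio in (0.3, 0.4, 0.5, 0.6):
    years = math.log(1 / start_ratio) / math.log(relative_growth)
    print(f"ASICs at {start_ratio:.0%} of GPU volume -> parity in ~{years:.1f} years")
```

Under these assumptions, a crossover by 2028 (about three years out) would require custom ASICs to already ship at roughly half of GPU unit volumes, which is the scale the analyst projections imply.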

Intel has effectively exited. The company cancelled its Falcon Shores data center GPU in January 2025 after failing to gain meaningful traction with Gaudi 3. Its replacement, Jaguar Shores, is not expected until late 2026 at the earliest. Intel is not a factor in the AI accelerator race.

But one number explains why NVIDIA is not panicking: CUDA has over 4 million developers and thousands of optimized applications. The switching cost is enormous. And in February 2026, Meta -- despite years of investment in its own MTIA chips -- signed a deal to buy millions more GPUs from NVIDIA anyway.

Worth noting: The custom silicon trend cuts both ways. Google's TPUs power Gemini. Amazon's Trainium runs Anthropic's Claude. But both Google and Amazon remain massive NVIDIA customers. Custom chips are supplementing NVIDIA, not replacing it -- at least not yet. The real question is whether that changes when custom ASICs reach performance parity, which some analysts project could happen by 2028.

The Bigger Picture

The AI infrastructure buildout is now the largest technology investment in history.

The combined capital expenditure of the four largest hyperscalers in 2026 -- $635-665 billion -- exceeds the GDP of most countries. Jensen Huang has framed it as the beginning, not the peak. At CES, he described AI infrastructure as an $85 trillion opportunity over the next 15 years and denied that the current spending represents a bubble.

There is evidence on both sides.

The demand for AI compute is genuinely explosive. NVIDIA has a $500 billion-plus backlog that keeps growing. Inference costs are falling fast enough to unlock entirely new applications. Agentic AI -- systems that take autonomous actions, not just generate text -- is creating what Huang calls "easily 100 times more" compute demand than the industry expected a year ago. At the GTC 2025 keynote, he argued that the shift from single-shot answers to multi-step reasoning has fundamentally changed the math on how much compute the world needs.

But hyperscalers are taking on unprecedented debt to fund infrastructure that may not generate proportional revenue for years. Custom silicon threatens NVIDIA's pricing power. And the history of technology is littered with infrastructure booms that ended in correction -- from fiber optics in 2000 to crypto mining rigs in 2018.

NVIDIA's answer to this is speed. Its one-year cadence -- Blackwell (2024), Vera Rubin (2026), Rubin Ultra (2027), Feynman (2028) -- is designed to make the competition irrelevant before it arrives. Rubin Ultra will scale to NVL576 racks with 576 GPUs, delivering 15 exaflops of FP4 compute with up to 1 TB of HBM4e memory per GPU. By the time competitors match Vera Rubin, NVIDIA plans to be two generations ahead.

As one industry analyst put it: "If NVIDIA maintains this cadence, it will be even more difficult for competitors to catch up."

The Bottom Line

NVIDIA just reported the largest quarter in semiconductor history and shipped the first samples of the most powerful AI chip ever made. Its market cap stands at $4.7 trillion. Its backlog exceeds half a trillion dollars. And its guidance says growth is accelerating, not slowing.

Vera Rubin is a genuine generational leap: 5x the inference performance of Blackwell, 2.8x the memory bandwidth, the first GPU to use HBM4, and an entirely new 88-core Arm CPU designed from scratch to pair with it. Every major cloud provider, every major AI lab, and every major infrastructure partner has signed up to deploy it. The question is not whether Vera Rubin will sell. It is whether NVIDIA can make enough of them.

The competition is more credible than it has ever been. AMD's Helios could ship before Vera Rubin reaches volume. Custom ASICs from Google, Amazon, Microsoft, and Meta are growing nearly three times faster than GPUs. And the $635 billion in hyperscaler capex for 2026 suggests the market may be big enough for multiple winners.

But NVIDIA's annual release cadence, the CUDA ecosystem's 4 million developers, and a $500 billion backlog create a moat that nobody has come close to breaching. GTC 2026 is three weeks away. Jensen Huang has promised to "surprise the world."

Given what he just shipped, that is a remarkable thing to say.
