Roadmap to the Future

Jeff Brown | Mar 20, 2025 | The Bleeding Edge | 5 min read


The annual NVIDIA GPU Technology Conference, now known simply as GTC, has become the most important technology event of the year.

Yes, there is always a fair amount of marketing fluff, like what we examined in yesterday’s Bleeding Edge – The “Killer App” for the Automotive Industry – but there is always substance, too. The event serves as a foundational roadmap for the future.

What’s happening in the world of computing and artificial intelligence (AI) is a picture-perfect manifestation of exponential growth. It’s happening so fast that it’s difficult to grok the implications.

100X Compute Demand

NVIDIA CEO Jensen Huang made the point in his keynote at this week’s GTC conference. Data center capital expenditure (capex) forecasts have been revised upward. Even accounting for efficiency improvements, almost all previous forecasts have proven too conservative.

By 2029, total data center capital expenditures will exceed $1 trillion a year.

Source: NVIDIA

Spend was around $250 billion in 2023. That’s a 4X increase in spend in six years from already elevated levels of investment.
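Taking the article’s figures at face value – roughly $250 billion in 2023 growing to about $1 trillion by 2029 – a quick back-of-the-envelope sketch shows what annual growth rate that 4X multiple implies (the figures come from the article; the growth-rate math is just illustrative):

```python
# Back-of-the-envelope: what annual growth rate turns ~$250B (2023)
# into ~$1T (2029)? Figures are from the article; the math is a sketch.

start_spend = 250    # data center capex in 2023, $ billions (per the article)
end_spend = 1000     # projected capex in 2029, $ billions (per the article)
years = 2029 - 2023

total_multiple = end_spend / start_spend           # 4.0, the "4X" in the text
annual_growth = total_multiple ** (1 / years) - 1  # compound annual growth rate

print(f"Total multiple: {total_multiple:.1f}X")
print(f"Implied annual growth: {annual_growth:.0%}")  # ~26% per year
```

In other words, sustaining that forecast requires roughly 26% compound annual growth for six straight years, on top of an already massive base.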

But the far easier bullet point to understand is encapsulated in a great quote from Huang:

The amount of computation needed is easily 100 times more than we thought we needed at this time last year.

It’s a remarkable point. Not only because it’s true, but because an industry insider like Huang, with access to incredible resources, grossly underestimated how much demand there was going to be from the industry.

Please, don’t let anyone fool you. Don’t listen to the AI doomers, de-growthers, or decels. It doesn’t matter if there are tariffs, taxes, or fair trade agreements. War or peace. Massive fraud and corruption in the U.S. government. Government efforts to eradicate it.

None of it will slow down this trend.

It will just keep accelerating.

And it’s not just about NVIDIA (NVDA). The media may portray the company as the “face” of the AI boom, but there is a long list of companies – large and small, public and private – that are moving just as fast as NVIDIA.

We should look at NVIDIA’s product roadmap simply as a representation of what is going to happen across the entire industry devoted to manifesting AI.

That includes semiconductors, electronic components, cooling systems, software, fiber optics, energy production, electricity storage, and robotics.

That roadmap is what keeps the industry waiting with bated breath. It’s a map of the future and one that Huang always drops towards the end of his keynote speeches at GTC.

Rather than waiting until the end of The Bleeding Edge, I’ll drop it right now:

NVIDIA’s Product Roadmap Through 2028 | Source: NVIDIA

Of course, without context, it doesn’t mean much. So that’s what we’ll figure out together. Blackwell, on the far left, is NVIDIA’s current bleeding-edge GPU architecture, in mass production right now. NVIDIA can’t get these GPUs manufactured fast enough to come anywhere near meeting demand (Note: Blackwell is manufactured by Taiwan Semiconductor (TSM)). What a great problem to have as a business.

But as with all things exponential, it’s not about what’s happening today, it’s about what’s coming…

Introducing: The Rubin GPUs

NVIDIA Blackwell Ultra | Source: NVIDIA

In the second half of this year, NVIDIA is launching Blackwell Ultra NVL72, which is essentially the second generation of its bleeding-edge architecture that just went into production in the second half of last year.

Some might think, “Oh, not a big deal.” They’d be wrong.

Blackwell Ultra has 50% more memory and twice as much bandwidth as today’s Blackwell. And a single rack of Blackwell Ultra – shown above on the left – is capable of 1.1 exaflops of performance.

We should keep in mind that the third most powerful classical supercomputer in the world – Aurora at Argonne National Laboratory – runs at 1 exaflop. NVIDIA’s Blackwell Ultra is capable of 1.1 exaflops in a single rack.

To be fair, it’s not an apples-to-apples comparison as NVIDIA’s solutions are GPU-centric for AI applications, but it is indicative of the sheer scale of the performance.

And Blackwell Ultra is a big jump from what’s available today. But what about next year?

NVIDIA Vera Rubin Semiconductor | Source: NVIDIA

By the second half of 2026, NVIDIA will release its new GPU architecture, Vera Rubin, named after the astronomer whose observations of galaxy rotation provided the first compelling evidence for dark matter. Who knows what we’ll discover with this kind of horsepower?

A single rack of Vera Rubin GPUs will be capable of 3.6 exaflops of performance. That’s 3.3 times more powerful than a rack of Blackwell Ultra, all in just a year.

Has your jaw dropped yet? If not, how about by the second half of 2027?

NVIDIA Rubin Ultra Semiconductor | Source: NVIDIA

The Vera Rubin Ultra will be out, delivering 15 exaflops of performance. That’s 13.6 times more powerful than the Blackwell Ultra coming out in the second half of this year. Think about that.

A 13.6 times performance increase in two years.

And if that wasn’t all enough, by 2028, NVIDIA will launch a new architecture named after physicist Richard Feynman, which will almost certainly be at least 30 times more powerful than Blackwell Ultra.
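The generational multiples above follow directly from the per-rack exaflop figures cited earlier. A quick sanity check (the exaflop numbers are from the article; the arithmetic is just a verification):

```python
# Per-rack performance figures cited in the article, in exaflops.
blackwell_ultra = 1.1   # 2H 2025
vera_rubin = 3.6        # 2H 2026
rubin_ultra = 15.0      # 2H 2027

# Generational multiples relative to Blackwell Ultra.
print(f"Vera Rubin vs. Blackwell Ultra:  {vera_rubin / blackwell_ultra:.1f}x")   # ~3.3x
print(f"Rubin Ultra vs. Blackwell Ultra: {rubin_ultra / blackwell_ultra:.1f}x")  # ~13.6x
```

Both the 3.3x and 13.6x figures in the text check out against the raw exaflop numbers.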

Our Path to Artificial Superintelligence

None of this is fiction. Every part of this roadmap is in some stage of development…including the Feynman GPUs. It takes years of work to design bleeding-edge semiconductors like these and bring them to mass production.

Perhaps an even simpler visual reference to help us grok the significance of what’s happening is the difference in semiconductor design between the Blackwell and Rubin GPU architectures. We don’t even need to understand semiconductor technology – the pictures tell the story.

NVIDIA Blackwell Semiconductor | Source: NVIDIA

Blackwell has 130 trillion transistors, 576 memory chips, and 2,592 Grace CPU cores.

And yet the Rubin GPU has 1,300 trillion transistors, 2,304 memory chips, and 12,672 Vera CPU cores.

NVIDIA Rubin Semiconductor | Source: NVIDIA

It’s almost night and day. Rubin has 10 times the transistors and 7.5 times as much memory. Yes, it’s a larger semiconductor, but the engineering feat required to boost performance that much in a single generation is extraordinary.

Huang shared another graph that I quite liked as a way to visualize what’s happening. On the left, it compares performance against Hopper, the architecture prior to Blackwell. Blackwell provides a 68 times improvement over Hopper, and Rubin’s performance will be 900 times that of Hopper. Nine hundred!

And on the right side is a graph that shows the total cost of ownership (TCO) per unit of performance.

What it says is that the TCO per unit of compute for Rubin will be only 3% of Hopper’s. Put another way, delivering a given level of performance on a Rubin architecture will cost only 3% of what it would cost on a Hopper architecture.

Does this mean that we’ll need fewer GPUs given the performance leap? Absolutely not. It means that operating Rubin GPUs will be far more efficient per unit of compute than Hopper or Blackwell.
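To make that 3% TCO figure concrete, here is a small sketch (the 3% ratio is from Huang’s graph as reported above; the compute-per-dollar framing is mine):

```python
# TCO per unit of compute, with Hopper normalized to 1.0.
hopper_tco_per_unit = 1.0
rubin_tco_per_unit = 0.03   # 3% of Hopper, per Huang's graph

# Equivalently: for a fixed budget, Rubin delivers ~33x more compute.
compute_per_dollar_gain = hopper_tco_per_unit / rubin_tco_per_unit
print(f"Compute per dollar vs. Hopper: ~{compute_per_dollar_gain:.0f}x")  # ~33x
```

This is why cheaper compute per unit leads to more spending, not less: when a fixed budget buys roughly 33 times more compute, the economics favor buying more of it.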

All we have to remember is Huang’s quote I shared earlier. Compute demand is 100 times greater than what they thought was needed last year. The economics of running massive AI factories (data centers) dictates the necessity to continuously upgrade to the bleeding edge of semiconductor technology.

And it’s also why there will be gigawatt-scale AI factories. Which will lead to artificial superintelligence (ASI).

What a week…

Jeff

