Dwarkesh Podcast

Dylan Patel — Deep dive on the 3 big bottlenecks to scaling AI compute

Brief

AI compute growth is increasingly constrained not by headline hyperscaler spending, but by how far upstream the bottlenecks now sit in the semiconductor supply chain. Dylan Patel’s central claim is that the often-cited 2026 capex figures—roughly $600 billion across Amazon, Meta, Google, and Microsoft, and close to $1 trillion across the broader supply chain—should not be interpreted as compute coming online immediately. A large share is advance work: turbine deposits for 2028-29, data-center shells for 2027, land, power contracts, and other enabling infrastructure. On his numbers, the U.S. adds about 20 GW of incremental AI data-center capacity this year, while labs like Anthropic and OpenAI are already pushing against much larger future needs. Patel argues Anthropic in particular was too conservative about contracting for compute; because its revenue accelerated faster than expected, it now needs to scramble for capacity from lower-quality or more expensive providers, often through revenue-sharing arrangements with hyperscalers or through short-term spot-like deals.

That near-term scramble leads into Patel’s broader economic argument: in a supply-constrained world, compute committed early becomes a major competitive advantage. He cited H100 deals at up to $2.40 per GPU-hour for 2-3 years against a roughly $1.40/hour five-year deployed cost base, implying large supplier margins and sharp penalties for late buyers. This also changes the GPU depreciation debate. A bearish view says older GPUs should collapse in value as new chips deliver better performance per dollar. Patel’s rebuttal is that when new chips are themselves supply-constrained, the market prices older GPUs by their current utility, not by a neat benchmark-relative depreciation curve. If a newer model architecture can extract more useful output from the same H100 fleet than prior software generations could, the economic value of that installed base can rise rather than fall. In that framing, compute reservations function almost like long-dated options on future AI capability and revenue.
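
The supplier margin implied by those two figures can be sketched in a few lines. The prices are the ones quoted in the episode; the function name is hypothetical, and treating the five-year cost base as directly comparable to a 2-3 year contract price is a simplifying assumption.

```python
# Figures quoted by Patel; the margin formula is a deliberate simplification.
RENTAL_PRICE_PER_GPU_HOUR = 2.40   # high-end H100 contract price, USD
DEPLOYED_COST_PER_GPU_HOUR = 1.40  # approximate five-year all-in cost, USD


def gross_margin(price: float, cost: float) -> float:
    """Return gross margin as a fraction of the selling price."""
    return (price - cost) / price


margin = gross_margin(RENTAL_PRICE_PER_GPU_HOUR, DEPLOYED_COST_PER_GPU_HOUR)
print(f"Implied supplier gross margin: {margin:.0%}")
```

On these inputs the supplier keeps roughly two fifths of each rental dollar as gross margin, which is the gap that early five-year contracts avoid paying.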

From there the conversation shifts to the “three big bottlenecks”: logic, memory, and the tools that make both. Patel’s most concrete claim is that by 2028-30 the ultimate limiter is ASML’s EUV lithography output. He walks through the arithmetic for 1 GW of Nvidia Rubin-class compute: about 55,000 3nm wafers, 6,000 5nm wafers, and 170,000 DRAM wafers, together requiring roughly 2 million EUV wafer passes, or about 3.5 EUV tools’ worth of capacity. Since ASML may only reach a bit above 100 EUV tools shipped per year by 2030, even aggressive AI buildouts eventually run into a hard manufacturing ceiling. The deeper point is that the key supply chains are extremely slow to expand. Fabs take 2-3 years to build; EUV tools are assembled from hyper-specialized subcomponents such as Carl Zeiss optics, wafer stages, and laser-produced plasma sources; and each expansion step requires scarce talent and precise industrial coordination. Patel repeatedly contrasts this with power and data centers, which, though difficult, are far simpler and faster to scale.
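
As a rough check on that arithmetic, the sketch below converts EUV wafer passes into tool-years. The throughput of about 75 wafers/hour and roughly 90% uptime are the figures Patel cites in the episode; treating the 2 million passes as one year of work, and the function name, are assumptions made for illustration.

```python
HOURS_PER_YEAR = 24 * 365


def euv_tools_needed(wafer_passes: float, wafers_per_hour: float, uptime: float) -> float:
    """Tool-years of EUV capacity needed to run the given number of wafer passes."""
    passes_per_tool_year = wafers_per_hour * uptime * HOURS_PER_YEAR
    return wafer_passes / passes_per_tool_year


# Roughly 2 million EUV passes per GW of Rubin-class compute (Patel's estimate).
tools_per_gw = euv_tools_needed(2_000_000, wafers_per_hour=75, uptime=0.90)
print(f"EUV tools per GW: {tools_per_gw:.2f}")
```

The result lands slightly under Patel's rounded figure of about 3.5 tools per gigawatt, which is consistent given that his inputs are themselves approximate.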

Memory is the second major pressure point, and Patel portrays it as both an AI and consumer-electronics story. He argues that HBM’s economics are brutal because it consumes 3-4x more DRAM wafer area per bit than commodity memory, yet AI accelerators need HBM because bandwidth, not just capacity, is often the real constraint. He quantifies HBM4 at about 2.5 TB/s per stack, versus only 64-128 GB/s for DDR5 over similar edge area, making “just use commodity DRAM” a poor substitute. The result is that roughly a third of 2026 big-tech AI capex may effectively be going to memory, while DRAM and NAND price increases ripple into phones and PCs. Patel predicts higher iPhone BOMs, falling low-end smartphone production, and broader consumer frustration as AI absorbs supply that once fed mass-market electronics. Here the speakers largely agree, though Dwarkesh pushes on whether slower, lower-bandwidth models could open a different hardware design point; Patel’s answer is that the market will still allocate scarce compute toward the highest-value, least price-sensitive uses.
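
To make the bandwidth gap concrete, here is a minimal sketch of the ratio between the HBM4 and DDR5 figures quoted above. The variable names are hypothetical, and the comparison assumes, as Patel does, that both are measured over similar edge area.

```python
HBM4_GB_PER_S = 2_500               # about 2.5 TB/s per HBM4 stack
DDR5_GB_PER_S_RANGE = (64, 128)     # DDR5 over similar edge area

# Ratio of HBM4 bandwidth to the low and high ends of the DDR5 range.
ratio_vs_slow_ddr5 = HBM4_GB_PER_S / DDR5_GB_PER_S_RANGE[0]
ratio_vs_fast_ddr5 = HBM4_GB_PER_S / DDR5_GB_PER_S_RANGE[1]

print(f"HBM4 advantage: {ratio_vs_fast_ddr5:.1f}x to {ratio_vs_slow_ddr5:.1f}x")
```

A gap of roughly 20x to 40x is why paying 3-4x more wafer area per bit can still be the better trade for bandwidth-bound accelerators.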

On infrastructure beyond chips, Patel is strikingly bullish that power can be solved and skeptical that space data centers are relevant this decade. He thinks there are many more viable generation pathways than investors appreciate—combined-cycle turbines, aeroderivatives, reciprocating engines, ship engines, Bloom fuel cells, batteries, and renewables paired with storage—and that permitting and labor are serious but tractable relative to semiconductor constraints. Space, by contrast, still uses the same scarce chips while worsening deployment time, networking topology, cooling, and maintenance. The final geopolitical layer is that Taiwan remains an irreplaceable concentration of know-how and capacity. Patel thinks China will likely indigenize DUV and perhaps demonstrate EUV by 2030, but not produce it at ASML scale by then. His bottom-line strategic view is nuanced: if AI progress and monetization continue compounding quickly, the U.S. benefits from already having the leading labs, capital base, and compute buildout; if timelines stretch toward 2035, China’s efforts to vertically integrate semiconductor production become more threatening. In other words, the shape of AI progress determines not just which bottleneck matters, but which country benefits from surviving it.

Why it matters

Dylan Patel said the U.S. will add roughly 20 gigawatts of incremental AI data-center capacity in 2026, but hyperscaler capex is front-loaded for future years: Google’s reported roughly $180 billion includes turbine deposits for 2028-29, construction for 2027 sites, and power-purchase and land commitments made well ahead of chip installation.

Key details

  • Patel estimated Anthropic is around 2-2.5 GW today and needs to exceed 5 GW by the end of 2026 just to support revenue growth plus a flat R&D fleet; he said Anthropic could reach 5-6 GW via its own capacity and models served through Amazon Bedrock, Google Vertex, or Microsoft Foundry, while OpenAI will likely end the year slightly higher because it signed aggressive long-term compute deals earlier.
  • Patel argued that late compute buyers are paying sharply higher prices: he cited H100 deals as high as $2.40 per GPU-hour for 2-3 years, versus roughly $1.40/hour total cost to deploy Hopper across five years, implying much higher gross margins for suppliers and a real advantage for labs that locked in 5-year contracts early.
  • Patel’s core bottleneck thesis is that AI scaling shifts back from power and data centers to semiconductors: by 2028-30 the limiting factor becomes ASML EUV tools, which he said cost about $300-400 million each, ship at roughly 70 units this year, 80 next year, and only a bit above 100 annually by 2030 even under aggressive expansion.
  • Using Nvidia Rubin as an example, Patel said 1 GW of new AI compute requires about 55,000 3nm wafers, 6,000 5nm wafers, and 170,000 DRAM wafers, totaling roughly 2 million EUV wafer passes; at about 75 wafers/hour and ~90% uptime, that works out to around 3.5 EUV tools per gigawatt of Rubin-class compute.
  • Patel estimated the global installed EUV base could reach about 700 tools by 2030, enough in theory for roughly 200 GW of AI chips if fully allocated to AI, so Sam Altman’s stated goal of 1 GW/week by 2030—about 52 GW/year—would imply about 25% share of total advanced-chip fab output rather than an obviously impossible target.
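
The last two bullets combine into a simple capacity check, sketched below. All three inputs are Patel's estimates from the episode; the variable names are hypothetical, and the calculation assumes every installed tool could in principle be allocated to AI chips.

```python
INSTALLED_EUV_TOOLS_2030 = 700   # Patel's estimate of the global installed base
EUV_TOOLS_PER_GW = 3.5           # tools needed per GW of Rubin-class compute
TARGET_GW_PER_YEAR = 52          # 1 GW per week

# Upper bound on annual AI compute if the whole installed base served AI.
max_gw_per_year = INSTALLED_EUV_TOOLS_2030 / EUV_TOOLS_PER_GW

# Fraction of that ceiling consumed by the 1 GW/week target.
share_of_output = TARGET_GW_PER_YEAR / max_gw_per_year

print(f"Ceiling: {max_gw_per_year:.0f} GW/year, target share: {share_of_output:.0%}")
```

The target comes out at about a quarter of theoretical EUV-limited output, matching the roughly 25% share cited above.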

Cleaned source text

title: Dylan Patel — Deep dive on the 3 big bottlenecks to scaling AI compute

author: Dwarkesh Podcast

content_type: podcast

publication: Dwarkesh Podcast

published: 2026-03-13T16:00:42+00:00

source_url: https://api.substack.com/feed/podcast/190839917/e3c75a06762af10f374661e8af1d1af6.mp3

word_count: 30265

All right, this is the episode of my roommate teaches me semiconductors.

It's also the sendoff for this current set.

Yeah, you know, after you use it, I'm like, I can't use this again.

I got to get out here.

No sloppy seconds for Dwarkesh.

Okay, Dylan is the CEO of SemiAnalysis.

Dylan, the burning question I have for you, if you add up the big four, Amazon,

Meta, Google, Microsoft, their combined forecasted CAPEX that you published recently,

this year is $600 billion,

and given, you know, yearly prices of renting that compute, that would be like close to 50 gigawatts.

Now, obviously, we're not putting on 50 gigawatts this year.

So presumably that's paying for compute that is going to be coming online over the coming years.

So I have a question about how to think about the timeline around when that CAPEX comes online.

Similar question for the labs where, you know, OpenAI just announced that they raised $110 billion.

Anthropic just announced they raised $30 billion.

and if you look at the compute that they have coming online this year,

you should tell me how much it is,

but is it not,

is it not another four gigawatts total that they'll have this year?

It feels like the cost to rent the compute that OpenAI and Anthropic will have this year

to like sustain their compute spend at, you know, $10 to $13 billion a gigawatt.

Those individual raises alone are like enough to cover their compute spend for the year.

And then this is not even including the revenue that they're going to earn this year.

So help me understand first.

First, when is the time scale at which the big tech CAPEX is actually coming online?

And two, what are the labs raising all this money for if like the yearly price of a one gigawatt data center is like $13 billion?

So when you talk about the CAPEX of these hyperscalers, right, on the order of $600 billion, and you look across the rest of the supply chain, it gets you to on the order of a trillion dollars.

A portion of this is, you know, immediately for compute going online this year, right?

The chips and the other parts of CAPEX that do get paid this year.

But there's a lot of setup CAPEX as well, right?

So when we have, when we're talking about 20 gigawatts this year in America, roughly,

incremental.

Incremental added capacity.

A portion of this is not spent this year.

A portion of that CAPEX was actually spent the prior year.

And so when you look at, hey, Google's got $180 billion.

Actually, a big chunk of that is spent on turbine deposits for 28 and 29.

A chunk of that is spent on data center construction for 27.

A chunk of that is spent on, you know, power purchasing agreements and down payments and all these other things that they're doing for further out into the future so that they can set up this super fast scaling, right?

And this applies to all the hyperscalers and other people in the supply chain.

And so, you know, 20 gigawatts roughly deployed this year, a big chunk of that being hyperscalers, a chunk of it not being.

And all of these companies, their biggest customers are Anthropic and OpenAI.

Anthropic and OpenAI are in the, you know, two gigawatt and, you know, two and a half gigawatt

and one and a half gigawatts roughly right now.

They're trying to scale to much larger, right?

If you look at what Anthropic has done over the last few months, you know, $4 billion,

$6 billion of revenue added, and if we just draw a straight line, hey, yeah, they'll add another

$6 billion of revenue a month.

People would argue that's bearish and that they should go faster.

What that implies is that they're going to add $60 billion of revenue across the next

10 months, right? And $60 billion of revenue at the current gross margins that Anthropic had,

at least last reported by media, would imply that they have, you know, roughly $40 billion

of compute spend for that inference for that $60 billion of revenue. That $40 billion of compute at roughly

$10 billion a gigawatt rental cost means that they need to add four gigawatts of inference capacity

just to grow revenue. And that's saying that their research and development training fleet

stays flat, right?

So, you know, in a sense,

Anthropic needs to get to well above

5 gigawatts by the end of this year,

and it's going to be really tough for them to get there,

but it's possible.
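
The straight-line estimate in this answer can be written out as a short calculation. The inputs are the figures Patel states ($6 billion of revenue added per month for 10 months, roughly $40 billion of compute per $60 billion of revenue, and about $10 billion per gigawatt of rental cost); the function and parameter names are hypothetical, and the compute share of revenue is inferred from his two dollar figures rather than stated directly.

```python
def added_inference_gw(
    revenue_added_per_month_bn: float,
    months: int,
    compute_share_of_revenue: float,
    rental_cost_bn_per_gw: float,
) -> float:
    """Gigawatts of inference capacity needed to serve the added revenue."""
    added_revenue_bn = revenue_added_per_month_bn * months
    compute_spend_bn = added_revenue_bn * compute_share_of_revenue
    return compute_spend_bn / rental_cost_bn_per_gw


# Compute share inferred from "$40 billion of compute for $60 billion of revenue".
added_gw = added_inference_gw(
    revenue_added_per_month_bn=6,
    months=10,
    compute_share_of_revenue=40 / 60,
    rental_cost_bn_per_gw=10,
)
print(f"Added inference capacity needed: {added_gw:.1f} GW")
```

This covers inference growth only; it holds the research and training fleet flat, as Patel notes.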

Can I ask a question about that?

So if Anthropic was not on track

to have 5 gigawatts by the end of this year,

but it needs that to serve both

the revenue that's gone crazier than expected,

and maybe it's going to be even more than that,

plus the research and training to make sure its models

are good enough for next year,

how, where is that going to come from?

You know, Dario, when he was on your podcast,

was very, very, like, conservative.

He's like, you know, I'm not going to go crazy on compute because if my revenue inflex