40% of AI Datacenters Are Behind Schedule. The Bottleneck Isn't the GPUs.
AI demand exploded (320x in one year). But energy, memory, and costs are creating three walls at once. Those who could read the numbers moved already. For everyone else, the bill is coming.
TL;DR
AI demand exploded (reasoning token consumption +320x in one year, Anthropic ARR +1,400% annually), but 40% of US datacenters planned for 2026 are behind schedule.
The problem isn’t GPUs: it’s electrical transformers (lead times of up to 5 years), memory (producers can cover only 60% of demand through 2027), and the bill (GitHub Copilot closed new signups, Anthropic quietly raised costs).
Those who could read the numbers moved already. For everyone else, the bill is coming.
Act 1. The Demand Never Stops Exploding
On April 20, 2026, GitHub Copilot did something rare: it closed the gates. New Pro plan signups blocked. Opus access removed. VP Martin Woodward explained: “cost sustainability for millions of users.”
It wasn’t a routine decision.
The same day, in San Francisco, Anthropic celebrated a new milestone: $30 billion in annualized revenue. Three months prior it was $14 billion. Nine months before that, $4 billion. Eighteen months ago, $1 billion.
This isn’t growth. This is an explosion.
OpenAI, with 1 million paying companies, registered a 320x increase in reasoning token consumption per enterprise customer in twelve months. Not 20x. Not 50x. Three hundred and twenty times. Business messages increased 8x.
Yet 40% of AI datacenters planned in the US for 2026 are behind schedule.
This makes no sense.
Or maybe it does.
Act 2. The Three Walls
What’s happening isn’t a story about capacity. It’s a story about time.
WALL 1. The Physical: Electrical Transformers and Years of Delay
Start where you don’t expect.
In 2020, an electrical transformer (the component that regulates grid voltage, not the AI model) had a lead time of 24-30 months from order to delivery. These were standard parts, bus-sized, manufactured in series.
Today that lead time is 3-5 years.
This isn’t an exception. It’s the norm. And the grid can’t absorb it: roughly 2,300 gigawatts of generation and storage capacity is already queued, waiting for grid interconnection in the US.
The Big Four (Alphabet, Amazon, Meta, Microsoft) have committed over $650 billion to infrastructure by 2026. This isn’t hype. This is capital burned in concrete, steel, and circuits.
Amazon alone will invest $200 billion in infrastructure in 2026, more than the annual budget of entire nations. It also signed an agreement with Anthropic: $33 billion total in equity, $100 billion in AWS services over 10 years with 5 gigawatts guaranteed. In other words: Amazon isn’t betting on Anthropic, it’s building it.
Yet even Amazon couldn’t save Stargate.
Stargate was the plan: a $500 billion joint OpenAI-Oracle venture to build the largest AI datacenter in history. 600 megawatts initially. Planned expansion to 2 gigawatts. In March 2026, 600 MW of planned capacity was canceled at Abilene, Texas. The site will remain at 1.2 gigawatts. Meta stepped in to claim the freed capacity.
This isn’t failure. It’s defeat against time and permitting.
Deloitte estimates that US AI datacenter power demand will grow from 4 gigawatts in 2024 to 123 gigawatts by 2035. Thirty times in eleven years. BloombergNEF, for its part, revised its own forecast upward to 106 gigawatts in the US by 2035, a 36% increase from its prior estimate.
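The growth figures above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers quoted in the paragraph; the function names are illustrative, and the "prior estimate" for BloombergNEF is back-computed from the 36% figure rather than taken from any report.

```python
def growth_multiple(start_gw, end_gw):
    """How many times larger the end value is than the start value."""
    return end_gw / start_gw


def implied_annual_growth(start_gw, end_gw, years):
    """Constant yearly growth rate that turns start_gw into end_gw."""
    return (end_gw / start_gw) ** (1 / years) - 1


# Deloitte figures quoted in the article: 4 GW (2024) to 123 GW (2035).
deloitte_multiple = growth_multiple(4, 123)
deloitte_rate = implied_annual_growth(4, 123, 2035 - 2024)

# BloombergNEF: 106 GW is described as 36% above the prior estimate,
# so the prior estimate is inferred here, not sourced.
bnef_prior_gw = 106 / 1.36

print(f"Deloitte multiple: {deloitte_multiple:.1f}x")
print(f"Implied yearly growth: {deloitte_rate:.1%}")
print(f"Inferred prior BNEF estimate: {bnef_prior_gw:.0f} GW")
```

Read this way, "thirty times in eleven years" corresponds to sustained growth of a bit over a third per year, every year, for more than a decade.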
But the power ceiling isn’t controlled by Nvidia. It’s controlled by permitting authorities, power generators, transformer suppliers.
Microsoft solved part of this problem radically: a 20-year agreement with Constellation Energy for 835 megawatts from a restarted Three Mile Island nuclear reactor, operational by 2028. Google locked in 3 long-term power purchase agreements for 1.2 gigawatts of carbon-free generation over 20 years. Oracle is contracting 1 gigawatt from small modular reactors (SMRs). Big Tech has already committed to over 10 gigawatts of nuclear power, with 25% of its incremental 2030 demand to be met by behind-the-meter generation, not the public grid.
They’re bypassing the system because the system can’t handle it.
WALL 2. The Silicon: There Isn’t Enough Memory
While servers wait for transformers, chips wait for memory.
HBM (High Bandwidth Memory, the ultra-fast memory that AI GPUs need) is the Bordeaux of chips. Margins are 3-5x above commodity RAM. But it’s also the bottleneck: 1 gigabyte of HBM requires 4x the manufacturing capacity of standard RAM. And Nvidia has purchasing power equivalent to a large smartphone maker.
Result: memory producers cover only 60% of demand through 2027, per TrendForce. Counterpoint Research is more brutal: “meaningful supply expansion will be difficult until the second half of 2027,” analyst Min-sung Hwang wrote.
Two and a half years of structural scarcity.
Meanwhile, prices explode. DDR5 retail went from $133 in September 2025 to $513 in January 2026 for a 2x16GB DDR5-6000 CL30 kit. A 286% increase in four months. In Europe the same kit rose from 150 euros to 569 euros, 279% up.
This isn’t inflation. It’s a supply crisis.
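A percentage increase and a price multiple are easy to confuse, so here is a small check of the kit prices quoted above. It is a sketch using only the article’s figures; the helper name is illustrative.

```python
def percent_increase(old_price, new_price):
    """Percentage change from old_price to new_price."""
    return (new_price - old_price) / old_price * 100


# Prices quoted in the article for a 2x16GB DDR5-6000 CL30 kit.
us_increase = percent_increase(133, 513)   # USD, Sep 2025 -> Jan 2026
eu_increase = percent_increase(150, 569)   # EUR, same period

print(f"US kit: +{us_increase:.0f}% ({513 / 133:.2f}x the old price)")
print(f"EU kit: +{eu_increase:.0f}% ({569 / 150:.2f}x the old price)")
```

A rise of 286% means the kit now costs almost four times what it did, not almost three times.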
Server-grade DRAM wholesale climbed 60% in Q1 2026. Year-over-year, conventional DRAM rose 171%; DDR5 spot market quadrupled since September 2025.
In concrete numbers: a server with 64GB DDR5 RDIMM (Registered Dual Inline Memory Module) costing X in January costs 2.5X in April. Even 8GB DDR4 notebook memory saw prices jump 180% quarter-over-quarter. LPDDR5X, the power-efficient memory for mobile devices, rose 130% quarter-over-quarter.
Yet the real problem isn’t price. It’s delivery.
An enterprise customer starting an AI project today knows one thing: they won’t have all the memory they need in 2026. They’ll have it in 2028, maybe.
And by then, operating costs will have transformed the economics.
WALL 3. The Economic: The Model Breaks
This is where the story becomes concrete.
Anthropic priced Claude Opus 4.6 at $5 per million input tokens and $25 per million output tokens.
GitHub Copilot’s Pro plan cost $10 per month, with up to 100 Claude Opus requests per month included.
The math is clear: with a user running 10 intensive sessions monthly, the vendor already loses money. With 100 sessions, unremarkable for a developer using a coding agent all day, the economics collapse.
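The unit economics can be sketched as follows. The prices come from the article; the token volume per session is a hypothetical assumption (the article gives none), and list API prices are used as a stand-in for the vendor’s real cost, which may be lower under wholesale terms or caching.

```python
INPUT_PRICE_PER_MTOK = 5.00     # USD per million input tokens (article)
OUTPUT_PRICE_PER_MTOK = 25.00   # USD per million output tokens (article)
PLAN_PRICE = 10.00              # USD per month for the Pro plan (article)

# ASSUMPTION: hypothetical token volume of one "intensive" agent session.
SESSION_INPUT_TOKENS = 300_000
SESSION_OUTPUT_TOKENS = 30_000


def session_cost(input_tokens=SESSION_INPUT_TOKENS,
                 output_tokens=SESSION_OUTPUT_TOKENS):
    """Model cost of one session at list API prices, in USD."""
    return (input_tokens / 1e6 * INPUT_PRICE_PER_MTOK
            + output_tokens / 1e6 * OUTPUT_PRICE_PER_MTOK)


def monthly_margin(sessions):
    """Plan revenue minus model cost for a user with `sessions` sessions."""
    return PLAN_PRICE - sessions * session_cost()


break_even_sessions = PLAN_PRICE / session_cost()

for n in (1, 10, 100):
    print(f"{n:>3} sessions/month -> margin ${monthly_margin(n):,.2f}")
print(f"Break-even at about {break_even_sessions:.1f} sessions")
```

Under these assumed volumes a session costs a little over two dollars, so the flat fee is exhausted after four or five sessions. Different token counts move the break-even point, but not the shape of the problem.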
That’s why on April 20, 2026 GitHub Copilot closed the gates.
But there’s a detail even more revealing than this.
In early March 2026, Anthropic silently reduced the cache time-to-live (TTL) for Claude Code from 1 hour to 5 minutes. Cache is the mechanism that reduces costs when reusing the same information within an hour: ask about your repository, then ask again in 30 minutes, the second query costs a fraction of the first. With TTL cut to 5 minutes, that advantage vanishes.
Result: user costs rose 20-32%.
How do we know? A developer analyzed the JSONL logs of his API calls and posted the results in GitHub issue #46829. The community checked. The numbers hold. Anthropic didn’t deny it.
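Why a shorter TTL raises the bill can be illustrated with a toy model. This is a sketch under stated assumptions, not Anthropic’s billing logic: the write and read multipliers, the rule that each request refreshes the cache entry, the context size, and the request timestamps are all illustrative values chosen for the example.

```python
BASE_INPUT_PRICE_PER_MTOK = 5.00  # USD, Opus input price quoted in the article


def context_cost(timestamps_min, context_tokens, ttl_min,
                 write_mult=1.25, read_mult=0.10):
    """Cost of re-sending the same context at each timestamp (in minutes).

    ASSUMPTIONS: a request is a cache hit if it arrives within ttl_min of
    the previous request (every request refreshes the entry); a miss pays
    write_mult times the base price, a hit pays read_mult times it.
    """
    base = context_tokens / 1e6 * BASE_INPUT_PRICE_PER_MTOK
    total = 0.0
    previous = None
    for t in timestamps_min:
        hit = previous is not None and (t - previous) <= ttl_min
        total += base * (read_mult if hit else write_mult)
        previous = t
    return total


# Hypothetical working pattern: one query every 20-30 minutes.
session = [0, 25, 50, 70, 100, 125]
cost_1h = context_cost(session, context_tokens=200_000, ttl_min=60)
cost_5m = context_cost(session, context_tokens=200_000, ttl_min=5)

print(f"TTL 60 min: ${cost_1h:.2f}")
print(f"TTL  5 min: ${cost_5m:.2f}")
```

With queries spaced more than five minutes apart, every request becomes a miss and the cached portion of the bill multiplies. This toy pattern isolates the cached context only, so it overstates the effect compared with the 20-32% measured on real, mixed workloads.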
Three weeks of silence.
Then Anthropic tried something even more aggressive: in a test on roughly 2% of new users, it moved Claude Code out of the Pro plan ($20 monthly) and into the Max plans ($100-200 monthly). A move that would cut off the 90% of users who don’t pay for Max.
The public reaction was harsh. The test was withdrawn after days.
But the message was unmistakable: the “AI included in plan” model doesn’t work anymore.
For enterprise users, the shift is even more radical. Anthropic rolled out enterprise plans with a simple structure: a seat fee for platform access, zero tokens included. All consumption is billed at standard API rates. Legacy plans migrate to this model at renewal. Per The Information (April 2026), costs are rising for high-intensity users.
Reality Check
So far we’ve talked numbers. But the numbers tell a simple story: the three walls, physical (transformers, 5 years), silicon (memory through 2027), economic (the model breaks when intensive users pay real prices), are closing at the same time.
These aren’t separate problems. They’re symptoms of the same phenomenon: the AI demand curve grew 320x in one year, while supply grows in a straight line.
When an exponential curve meets a straight line, the result is called a “cliff”: the point where scarcity shifts from theoretical to operational.
We’re months away from that cliff.
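To see why the timescale is months rather than years, here is a minimal sketch of an exponential curve overtaking a straight line. Only the 320x annual multiple comes from the article (where it refers to reasoning tokens per enterprise customer); the supply headroom and the linear supply growth rate are hypothetical parameters.

```python
ANNUAL_DEMAND_MULTIPLE = 320  # from the article, treated as smooth growth


def months_until_cliff(headroom=10.0, supply_growth_per_month=0.05,
                       max_months=120):
    """First month in which exponential demand exceeds linear supply.

    ASSUMPTIONS: demand starts at 1.0 and compounds monthly; supply starts
    at `headroom` times today's demand and adds a fixed fraction of its
    starting value every month.
    """
    monthly_factor = ANNUAL_DEMAND_MULTIPLE ** (1 / 12)
    for month in range(max_months + 1):
        demand = monthly_factor ** month
        supply = headroom * (1 + supply_growth_per_month * month)
        if demand > supply:
            return month
    return None


for headroom in (10.0, 100.0):
    print(f"{headroom:>5.0f}x headroom -> cliff in month "
          f"{months_until_cliff(headroom)}")
```

In this toy model, ten times more spare capacity delays the crossover by only about five months, which is the intuition behind the cliff: headroom buys very little time against a curve this steep.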
Act 3. Cold Pragmatism
The question everyone asks: “What do I do?”
First: understand this isn’t a crisis. It’s recalibration.
Those who designed operations for scarcity (Amazon with nuclear, Microsoft with Three Mile Island, Google with multi-decade agreements) already won. Not because they’re smarter. Because they read the data when the market was convinced infinite GPUs would solve everything.
Those who built their economics on “AI included in price” (GitHub, some SaaS players) are hastily recalibrating toward usage-based billing. Uncomfortable, but it’s the only model that survives when the marginal cost of delivery exceeds the price.
Those who hired with a “one model solves everything” mentality will discover that the same competency (knowing how to use Claude Opus) that seemed to have unlimited value a year ago holds its value only while the model stays cheap. When costs rise 10x, that value halves, because half of the operations are no longer economically worth automating.
The final point most companies miss: AI prices won’t fall as supply and demand find equilibrium. They will rise, because operating costs (power, memory, compute) are climbing. And when operating costs rise, you don’t get to keep prices low. You either stop offering the service or you raise the price.
OpenAI, Anthropic, and Mistral aren’t bad actors. They’re just passing on the costs that stayed hidden during the hyperscale phase.
The bill was coming. Now it’s here.
Context and Deep Dives
If you want to understand how these numbers connect to strategies the big players are running, I’ve written several pieces addressing specific parts of this map:
OpenAI and Real Costs: I analyzed how OpenAI is running $14 billion in losses in 2026 while Anthropic is building the alternative profitability model.
AI Stack and Cost Audit: If you’re a company using multiple AI models, I created an AI stack audit to measure security and costs in 30 minutes.
AI Agents and Business Implications: I documented how McKinsey forecasts a trillion in value from AI agents, but only players controlling the stack have defensible positions.
Defensible Positions in the AI Landscape: And finally, I mapped the hyperscaler compression strategies and who actually has defensible positions in this scenario.



Two things are true at once.
The article is right: transformers take 5 years, HBM covers 60% of demand through 2027, and the "AI included in your plan" model is dead. Real constraints. Real money.
But there's a second layer. The frontier is capacity-constrained. The long tail isn't.
A fine-tuned 7B running on a single GPU for ~$30 doesn't need Stargate. It doesn't need Three Mile Island. It runs today on hardware that already exists. Most enterprise problems - classification, extraction, structured RAG, voice matching - never needed Opus in the first place.
So both are true: frontier AI is hitting a physical ceiling, and most teams relying on it never needed to be there.