
A four-year-old still has more training data than GPT-5. We are spending civilization-scale sums to fix that.
In four years on this planet, a typical child has absorbed roughly 50 times more data than the largest language model ever trained. Meta’s chief AI scientist Yann LeCun, often called one of the godfathers of AI, ran the math in a widely circulated 2024 LinkedIn post: 16,000 waking hours of vision through two optic nerves works out to about 10¹⁴ bytes. The biggest LLMs are trained on roughly 10¹³. A toddler in a sandbox is, by any honest measure, a more efficient learning system than a $500 million GPU cluster.
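LeCun’s estimate is back-of-envelope arithmetic, and it reproduces in a few lines. One caveat: his post gives the endpoints (16,000 hours, ~10¹⁴ bytes), so the per-nerve bandwidth below is an assumed figure chosen to match them, not a number from the post.

```python
# A minimal sketch of LeCun's calculation. The ~1 MB/s per optic nerve
# is an assumption for illustration; the post states only the endpoints.
waking_hours = 16_000                # roughly the first four years of life
bytes_per_second = 2 * 1e6           # two optic nerves at ~1 MB/s each (assumed)

visual_bytes = waking_hours * 3600 * bytes_per_second
print(f"{visual_bytes:.2e} bytes")   # ~1.15e14, i.e. on the order of 10^14
```

At that rate the child crosses the 10¹³-byte mark — the scale of the biggest LLM training sets — before her first birthday.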
That single statistic is the most uncomfortable fact in modern AI. It tells you the entire trillion-dollar buildout is a brute-force workaround for a problem nobody has actually solved: how to make machines that learn the way living things do. And it explains why the industry is now hurtling toward a financial reckoning that rhymes uncomfortably with WorldCom, Nortel, and the dark-fiber graveyard of 2002.
1. The data wall, and what it costs to climb
LeCun’s point isn’t that LLMs are useless; it’s that text alone is a starvation diet. By his estimate, all the quality text publicly available on the internet would take a human, reading eight hours a day at 250 words a minute, 170,000 years to finish. The models have eaten it — YouTube transcripts, Reddit, Library Genesis, the comment sections of newspapers that haven’t existed since 2009. They are running out.
To keep the scaling laws alive, the answer is more: more parameters, more video, more synthetic data, more chips, more buildings to put the chips in. The bill is staggering.
Bain & Company’s 2025 Global Technology Report estimates that by 2030, global AI compute demand will hit 200 gigawatts, with the U.S. alone needing 100 GW of new capacity. A CNBC analysis using U.S. Department of Energy and Census Bureau data shows a single one-gigawatt data-center campus consumes more electricity than the city of San Francisco. America is being asked to build 100 San Franciscos of power, just for servers, in five years, on a grid whose total demand has been flat for two decades.
The International Energy Agency’s 2025 Energy and AI report projects global data-center electricity use will more than double to 945 terawatt-hours by 2030 — slightly more than Japan’s entire annual consumption. A January 2026 Bloom Energy report says U.S. data-center demand will jump from 80 GW in 2025 to 150 GW by 2028: the energy needs of Spain, added in three years. Goldman Sachs Research puts the global grid bill at $720 billion through 2030 just to keep the lights on inside the buildings.
The buildings themselves are mutating. According to Boston Consulting Group data cited by CNBC, the average hyperscaler data center used to be 40 MW. The new generation tops 500 MW; some pipelines now contemplate 2 GW campuses big enough to span a meaningful chunk of Manhattan. A Bloomberg News investigation found that two-thirds of the data centers built since 2022 sit in regions already classified as water-stressed.
2. The capex tsunami
The capital math is even more vertiginous than the kilowatt-hours. Goldman Sachs counted $368 billion in cumulative AI capex through August 2025. According to Motley Fool research compiled from company filings, the four U.S. hyperscalers — Microsoft, Amazon, Google, Meta — collectively spent $413 billion on AI infrastructure in 2025 alone, an 84 percent jump from 2024, and have guided to between $600 billion and $700 billion in 2026.
Zoom out and the numbers turn surreal. McKinsey & Company, in its April 2025 report “The Cost of Compute,” forecasts $5.2 trillion in global data-center investment by 2030. Brookfield Asset Management, which is underwriting much of it, projects $7 trillion across chips, power and connectivity over the next decade. Gartner expects worldwide AI spending to top $2 trillion in 2026.
In a September 2025 client note, Deutsche Bank’s George Saravelos put it bluntly: “In the absence of tech-related spending, the U.S. would be close to, or in, recession this year.” Roughly half the S&P 500’s gains in 2025 came from AI-linked stocks. Nvidia is, in the bank’s words, “carrying the weight of U.S. economic growth” — and to keep doing so, capital investment must remain “parabolic,” which Deutsche calls “highly unlikely.”
3. The $800 billion hole
Here is the part nobody on an earnings call wants to dwell on. The same Bain Global Technology Report asked a simple question: at $500 billion of annual data-center capex by 2030, what revenue does the AI industry need to actually pay for the buildout?
The answer is $2 trillion a year. Bain then ran the most generous monetization scenario it could construct — every enterprise migrates fully to the cloud, every productivity gain from AI in sales, marketing, support and R&D (about 20 percent of those budgets) is harvested and reinvested. Even in that fantasy, AI revenues fall $800 billion short.
That is not a rounding error. It is roughly the entire annual GDP of Switzerland — missing, every year, in perpetuity, on the most optimistic numbers anyone serious has been willing to put on paper.
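Bain’s arithmetic is simple enough to lay out explicitly. The report gives the $2 trillion revenue requirement and the $800 billion gap; the best-case revenue figure below is derived from those two numbers rather than quoted directly:

```python
# The Bain revenue-gap arithmetic from the paragraphs above.
# best_case_revenue is implied by the report's two stated figures.
annual_capex    = 500e9     # projected annual data-center capex by 2030
revenue_needed  = 2_000e9   # Bain: AI revenue required to fund the buildout
shortfall       = 800e9     # Bain: the gap under its most generous scenario

best_case_revenue = revenue_needed - shortfall   # $1.2 trillion
coverage = best_case_revenue / revenue_needed

print(f"best case covers {coverage:.0%} of the required revenue")  # 60%
```

Even the fantasy scenario, in other words, pays for only 60 cents of every dollar the buildout demands.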
And the demand side looks shakier by the month. MIT’s NANDA initiative, in its 2025 “State of AI in Business” report, studied 300 enterprise AI deployments and found that 95 percent of corporate generative-AI pilots produced no measurable financial return. Only 5 percent reached production. Companies have spent $30–40 billion on enterprise AI tools and most of it has vanished into what the report calls “pilot purgatory.” The technology works in demos. It struggles to survive contact with the average accounts-payable workflow.
So we are spending $500 billion a year to build infrastructure for a service whose customers cannot, in 95 cases out of 100, justify the subscription.
4. The dark-fiber rhyme
The historical analogue is not flattering. Between 1996 and 2001, telecom companies — driven by WorldCom’s now-infamous and entirely fictitious claim that internet traffic was doubling every 100 days — laid more than 80 million miles of fiber-optic cable across the United States, according to a Fortune retrospective marking the dot-com crash’s 25th anniversary. Vendor financing from Lucent and Nortel sweetened deals that customers couldn’t otherwise afford. As Janus Henderson Investors noted in an October 2025 analysis, Lucent at its peak carried over $15 billion in vendor financing against just $300 million of operating cash flow.
When the math finally caught up:
- Global telecom stocks lost over $2 trillion in market value between 2000 and 2002 (TheBubbleBubble historical analysis).
- WorldCom collapsed in what was then the largest bankruptcy in U.S. history, dragging $11 billion of accounting fraud into daylight.
- Global Crossing went under with $12.4 billion in debt; Nortel, once a third of the Toronto Stock Exchange by market cap, never recovered.
- Corning, the world’s largest fiber maker, fell from nearly $100 to about $1 a share and laid off 16,000 workers (Fortune, September 2025).
- By 2004, an estimated 85–95 percent of the fiber laid in the 1990s remained “dark” — unused.
The eerie part is the language. Read any AI infrastructure pitch deck and replace “gigawatts” with “bandwidth,” “GPU” with “router,” “hyperscaler” with “long-haul carrier,” and the prose becomes indistinguishable from a 1999 Lucent prospectus. The financial plumbing rhymes too: Nvidia’s $100 billion strategic investment in OpenAI — its largest customer — to fund 10 GW of data centers using Nvidia chips (announced September 2025) is functionally identical to Lucent’s vendor-financing-of-WorldCom playbook. Suppliers funding customers to buy supplies. We have seen this movie.
5. What survives?
None of this is to say AI is a fraud. It isn’t. The dot-com bust didn’t kill the internet; it killed the people who paid for the cable. The fiber stayed buried, and a decade later it carried Netflix, YouTube, and the cloud businesses that now, ironically, are funding the AI buildout.
Something similar will probably happen here. The data centers won’t vanish. “Dark compute” — capacity bought at peak hype — will eventually find a use, and somebody patient enough to buy it for ten cents on the dollar in 2029 will look like a genius in 2034.
But the people writing the cheques today are not buying that future at ten cents. They are buying it at full retail, financed by parabolic capex, justified by scaling laws LeCun himself thinks are a dead end, and underwritten by enterprise customers who in 95 cases out of 100 cannot make the pilot work.
Overcapacity, as one analyst put it after the fiber crash, always builds the next era — but it wipes out the investors who paid for it. The four-year-old in the sandbox has still seen more data than GPT-5. The question is who pays the $800 billion a year to pretend otherwise, and for how long.
History doesn’t repeat. But it rhymes — and right now it’s humming an awfully familiar tune.

