
"Agent Development Hasn't Accelerated The Way We Expected": What Zuckerberg's AI Agent Confession Means For Token Demand

For roughly a month, the most important chart in the market had been quietly diverging while the AI complex partied on.

The Silicon Data LLM Token Expenditure Index - a rough proxy for what the world is actually paying to run AI - peaked on May 31 and rolled over hard. The GS US Broad AI basket - Goldman's index of AI-linked stocks - kept levitating for another four weeks, because narratives don't read charts. Then came Wednesday's Bloomberg report that Meta is standing up a cloud business to sell its ~~overbuild~~ excess compute - which, as we detailed yesterday, monkeyhammered the neoclouds (the pure-play GPU landlords like CoreWeave and Nebius), hit the chip and memory names, and sent META up 9% - and suddenly the red line remembered the blue one exists.

Today, courtesy of Reuters, we glean some insight from the trenches.

According to the report, Meta CEO Mark Zuckerberg told an internal town hall that AI agent development over the last four months "hasn't accelerated in the way we expected," that the 2026 reorganization was, by his own telling, not the cleanest, and that the bets made around it have yet to come to fruition. Long-term trends, we are assured, remain aligned with the basic shape of the company.

To see why an internal product update and a market index are telling the same story, start with the arithmetic.

Total token spend equals price times volume, and price has been in freefall. The cost of getting the same quality of AI output has been falling on the order of 10x per year (a16z calls it "LLMflation"; Epoch AI's count is steeper still, with prices halving roughly every two months). Meanwhile, every CFO in America spent this spring doing to AI bills what they did to cloud bills a decade ago: routing easy queries to cheaper models, caching answers instead of paying for the same one twice, pushing non-urgent work into half-price batch jobs, and setting hard token budgets - the great institutional crackdown on tokenmaxxing. Against that avalanche of falling prices, total spend only grows if usage grows even faster. And the usage story - the thing justifying the $700-billion-plus Big Tech is shoveling into data centers this year - was agents: AI that doesn't just answer a question and stop, but fires off ten or twenty model calls per task and runs around the clock.
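For readers who want the arithmetic spelled out, here is a minimal sketch of the price-times-volume relationship. The function names are hypothetical and the volume-growth figures are illustrative assumptions, not data from any index; only the 10x-per-year and halving-every-two-months price declines come from the estimates cited above.

```python
def annual_decline_from_halving(months_to_halve: float) -> float:
    """Convert a price half-life (in months) into an annual decline multiple.

    A price that halves every `months_to_halve` months halves
    12 / months_to_halve times per year.
    """
    return 2 ** (12 / months_to_halve)


def spend_multiple(price_decline: float, volume_growth: float) -> float:
    """Year-over-year multiple on total spend (spend = price * volume).

    `price_decline` is how many times cheaper a token gets in a year;
    `volume_growth` is how many times more tokens are consumed.
    """
    return volume_growth / price_decline


# Halving every two months compounds to a 64x annual price decline.
print(annual_decline_from_halving(2))

# Illustrative assumption: prices fall 10x while usage grows 5x.
# Spend halves even though usage quintupled.
print(spend_multiple(price_decline=10, volume_growth=5))

# Illustrative assumption: usage grows 20x against the same 10x decline.
# Only then does spend double.
print(spend_multiple(price_decline=10, volume_growth=20))
```

The break-even is simply volume growth equal to the price decline: at 10x cheaper tokens, usage has to grow more than 10x in a year for the spend line to rise at all.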

The agents were supposed to eat the tokens - according to a Goldman May 8 deep dive, agentic AI was supposed to lead to "120 Quadrillion Tokens Monthly By 2030" - but they haven't shown up to the party yet, and Meta's AI adventures have decidedly gone off the rails:

In May, Zuckerberg told employees the layoffs were a straight trade against infrastructure - "we basically have two major cost centers," compute and people - while the company guided 2026 capex to $125-145 billion, roughly double last year's.
In June, an internal memo conceded the company had made mistakes in its AI transformation, and would almost certainly make more.
In July: the agents haven't accelerated, the reorg wasn't clean, the bets haven't paid off.

That is three consecutive months of expectations management, all landing just ahead of the Q2 earnings print.

Which brings us back to Goldman's 1-Delta desk-head Rich Privorotsky - whose rubber band, regular readers will recall, has been stretching for a month - and whose note this morning cuts to the part everyone is missing. The debate over whether Meta can actually execute a cloud business (first-party vs. third-party infrastructure, scale, capacity) is, in his telling, beside the point. The point is reflexivity - the observation that market prices don't just reflect the fundamentals, they feed back into them.

META had underperformed the Nasdaq by roughly 30% YTD before the cloud narrative surfaced - almost exactly the underperformance that preceded the metaverse capitulation in late 2022, a reversal the market went on to reward by roughly tripling the stock inside a year. The headline over META selling excess compute, Privo argues, functioned as a trial balloon at the exact moment markets began asking whether AI capex can run indefinitely without a visible return. Whether Meta ever rents out a single GPU is secondary: if management appears willing to monetize idle infrastructure, the market reads capital discipline and pays for it; if they double down without a monetization path, the market punishes them. And to the argument that this is all a scheme to buy room to spend even more, his answer is that it isn't about what management wants - it's about what the market will allow. If reflexivity holds, the course is now set toward some form of capex discipline.

The tape agrees so far: META rose nearly 9% on the report, while CoreWeave and Nebius fell double digits, with semis and memory bleeding out the other side of the ledger.

But - and here is where the 2022 analogy needs a footnote - when Meta capitulated on the metaverse, nobody upstream got hurt. Reality Labs' losses were Meta's alone. AI capex is different: Meta's "discipline" is Nvidia's order book, Micron's memory demand, CoreWeave's and Nebius's rental contracts, and the power-purchase agreements of a queue of energy developers. In 2022 the capitulation was contained; in 2026 it is contagious - it transmits straight down the supply chain. Which is why this week's move was a rotation within the AI trade (platforms over picks-and-shovels), not an exit from it. However, at some point the seemingly endless credit firehose that is funding the trillions in projected capex will end. That's when the bottom will fall out of the AI bubble.

There is also an inconvenient wrinkle for the discipline bulls: Zuckerberg has repeatedly framed the option to sell excess capacity as the thing that gives him confidence to keep overbuilding. "It's definitely on the table," he told shareholders in May - on the table as an option, a floor under the capex, not a ceiling on it. Bulls will further note that Meta was, at the very same time, reportedly signing for another ~1.6 gigawatts of data-center capacity from Crusoe and getting throttled by Google Cloud for consuming more than contracted - the "it's generational churn, not a glut" defense, in which Meta quietly sells its stale silicon while hoarding the frontier stuff. (The frontier stuff would presumably host Muse Spark, the flagship model from the most expensive research team ever assembled - which, per the WSJ, still has no developer launch date. The pitch, apparently: rent access to a model that hasn't shipped.)

Both readings can be true, and for the clearing price it hardly matters. Bernstein's Madison Rezaei estimates Meta is sitting on roughly 20 gigawatts of accumulated capacity with another 14 gigawatts on the way - a footprint that rivals the actual hyperscalers. When the marginal buyer of compute becomes a marginal seller, rental prices do what rental prices do - which is presumably why the ORNN H100 index (a tracker of the going rental rate on Nvidia's workhorse AI chip) was rolling over well before any of this hit the tape.

For the neoclouds, this is where reflexivity meets solvency math. Their anchor tenant just announced he might become their competitor - one whose ad machine can subsidize a price war indefinitely, and whose CEO intends, per his own framing, to sell capacity at a premium to cost. That works exactly as long as scarcity persists. Selling capacity is, of course, how scarcity stops persisting. The announcement is worth more than the business will ever be - which, per Privorotsky, is precisely the point of announcing it.

Now for the wrinkle that won't get quote-tweeted: the blue line (on the chart up top) stopped falling three weeks ago. The token expenditure index bottomed in mid-June and has been grinding higher since - meaning equities have spent the past few sessions catching down to a signal that had already stabilized. The next print on that index matters far more than the last one. A lower high - a bounce that stalls out below the May peak - confirms the demand story is genuinely cracking. A V-recovery says this was a digestion phase: either a rerun of the 2023 "cloud optimization" scare, when enterprises trimmed their cloud bills, everyone called the top of the buildout, and then generative AI detonated every capex model on the Street, or simply the lull between model generations that always looks like a demand cliff from the outside.

What to watch from here: Meta's capex language at the late-July earnings print - where, per UBS, a merely reiterated guide now reads hawkish (spend without ROI gets punished), while a trim gets META rewarded and the supply chain shot. Heads, semis lose; tails, semis lose - unless the token data reaccelerates first. Then Nvidia's August guide, the ORNN rental index, memory-chip spot prices, and - above all - whether a second hyperscaler starts using the word "monetize" about its own capacity.
