Twitter/X

Inference will be 10–50× the value of training; two inference types exist today…

Brief

The post by @cryptopunk7213 summarizes Ben Thompson’s Stratechery article “The Inference Shift”. Its claims: inference will be worth 10–50× the value of training, and it currently splits into answer inference (~90% of today’s inference) and nascent agentic inference (~10%). Roughly 99% of today’s agents still behave like chatbots; once genuinely agentic workloads take over, the post expects them to need higher-bandwidth memory, abundant compute, and low latency, which favors memory suppliers. Cerebras is described as a fit for answer inference only, while NVIDIA is expected to stay dominant for the foreseeable future.

Why it matters

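If inference ends up worth 10–50× the value of training, and most of that value moves from answer inference (~90% of today’s inference) to agentic inference (~10%), then the hardware that matters changes with it. The post argues that agentic workloads will need a different kind of chip and that memory producers stand to benefit most.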

Key details

  • Agentic inference is nascent but is expected to account for most future value. For now, ~99% of agents behave like chatbots. Once that flips, the post expects agents to need a chip that does not look like NVIDIA’s GPU, with higher-bandwidth memory, abundant compute, and low latency. Memory is named as the most important factor, making memory vendors the biggest beneficiaries.
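  • Cerebras is described as suited only to answer inference, not agentic workloads. Answer inference is considered less valuable: it keeps its own niche, but with a much smaller TAM than the inference market as a whole. NVIDIA is expected to retain its throne for the foreseeable future.

Source evidence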

phenomenal breakdown of what’s going on with inference right now. ben nailed the shift we’re undergoing and what that means for NVIDIA vs. cerebras (hint it’s fucking amazing for memory producers):

  • inference is going to be HUGE. 10-50X the value of training but…

  • they’re 2 types of inference: answer inference (90% of today’s) and AGENTIC inference (10%)

  • agentic inference is where most of the value will come from in the future but right now it’s in its infancy… 99% of agents act like chatbots for now

  • but when that flips, agents will require a different type of AI chip, one that doesn’t look like nvidia’s GPU.

  • it’ll need higher bandwidth memory, an abundance of compute and low latency. agents WON’T be constrained by humans

  • memory is the most important. very bullish memory providers.

  • Cerebras is only good for answer inference, it’s not good for agentic.

  • answer inference is not as valuable. it’ll have its own niche but much smaller TAM vs. the entire inference market.

  • NVIDIA will maintain the throne for the foreseeable future

Stratechery (@stratechery)

The Inference Shift

Agentic inference is going to be different than the inference we use today, and it will change compute infrastructure because speed won't matter when humans aren't involved.

stratechery.com/2026/the-inf…

— https://nitter.net/stratechery/status/2053777140114444603#m