substack.com

AI Outlook – Nvidia vs Broadcom vs AMD

2026-03-29 · 10:15 UTC · Tech Investments · 12 min read

Brief

Token demand and inference latency are reshaping the AI infrastructure market. The newsletter highlights explosive token growth — OpenAI shows a 320x YoY increase in reasoning tokens and Goldman reports China's enterprise token use rose 162% from June to December — while OpenRouter data indicates an acceleration since January 2026. UBS cites a 99.7% drop in token cost over three years but warns demand growth still makes compute capacity the limiting factor; inference latency has emerged as the next key bottleneck for agentic and coding workflows.
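To put the UBS cost figure on an annual footing, the short sketch below converts the 99.7% three-year decline into an equivalent yearly rate. It assumes the decline compounded at a constant rate each year, which the source does not state; the variable names are illustrative only.

```python
# Annualized equivalent of the cited 99.7% three-year decline in token cost.
# Assumption (not from the source): the decline compounds at a constant yearly rate.
total_decline = 0.997   # cumulative decline cited by UBS
years = 3               # period over which the decline occurred

remaining_fraction = 1 - total_decline             # cost left after three years
annual_factor = remaining_fraction ** (1 / years)  # cost multiplier per year
annual_decline = 1 - annual_factor                 # equivalent yearly decline

print(f"Equivalent annual decline: {annual_decline:.1%}")  # prints 85.6%
```

Under that assumption, token cost falls to roughly one seventh of its prior level each year, which is the pace demand growth would have to outrun for capacity to stay the binding constraint.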

Nvidia's response is a disaggregated inference stack: Vera Rubin GPUs handle memory‑heavy prefill/KV cache work while Groq LPX (SRAM‑centric, low‑latency decode) accelerates token generation. Jensen Huang claims a ~35x improvement in token‑generation performance when the two are fused via Dynamo. Sampling is underway, Samsung will fabricate Groq LPX, and shipments are targeted for around Q3 2026. UBS estimates LPX could add ~$50B to C2027 data‑center revenue, supporting upside scenarios in which Nvidia data‑center revenue approaches ~$600B in C2027.

Countervailing forces include hyperscaler ASIC programs, notably Meta’s MTIA roadmap, with Broadcom as a key partner. JP Morgan projects Broadcom AI revenues of $65B+ in FY26 and $120B+ in FY27 with ~10 GW deployed, implying significant hyperscaler share gains. The authors model a 10x expansion in AI demand by 2031 as plausible and note humanoid robotics components as an additional long‑run growth vector, while warning that in‑house ASICs pose a medium‑term risk to Nvidia's share despite overall market expansion.

Why it matters

OpenAI enterprise data shows API token consumption for reasoning rose 320x year‑over‑year; Goldman reports China enterprise token usage grew 162% from June to December (reported Mar 29, 2026).

Key details

UBS cited a 99.7% decline in cost-per-token over three years and warned that compute capacity — not just gigawatts — is the binding constraint as workloads diversify and latency becomes the next bottleneck.
Nvidia is integrating the Groq team/LPX (SRAM‑heavy) with its Vera Rubin GPUs and Dynamo software to disaggregate prefill vs. decode: management claims a ~35x increase in token‑generation performance and expects Groq LPX in production and shipping around Q3 2026 (Samsung manufacturing).
UBS models LPX could add ≈$50B to data‑center revenue on top of a ~$460–470B baseline for C2027 and suggests total Nvidia data‑center revenue could push toward ~$600B in C2027 (UBS baseline ≈$520B).
Broadcom is scaling customer ASICs (Meta MTIA roadmap: MTIA300 in production; MTIA400 in 2026; MTIA450/500 in 2027) and JP Morgan forecasts Broadcom AI (ASIC + networking) >$65B in FY26 and ~$120B+ in FY27, with ~10 GW of XPU deployments and an implied ~27% hyperscaler share next year.
Authors’ scenario: current installed Nvidia AI capacity ≈$407B; if that equals 10% of 2031 demand, a ~$4T installed base is needed — with 6‑year equipment life that implies ≈$678B of Nvidia equipment sales per year from 2026–2031.
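
The authors' scenario arithmetic can be reproduced in a few lines. This is one reading of the calculation, assuming the full 2031 installed base must be sold evenly across the six years 2026 to 2031 because equipment lasts six years; the variable names are illustrative and all figures are in billions of dollars.

```python
# Back-of-envelope check of the authors' scenario (figures in $B, from the summary).
installed_base_now = 407.0     # current installed Nvidia AI capacity
share_of_2031_demand = 0.10    # scenario: today's base equals 10% of 2031 demand
equipment_life_years = 6       # scenario: six-year equipment life (2026 to 2031)

# Installed base needed by 2031 if today's base is only 10% of it.
required_base_2031 = installed_base_now / share_of_2031_demand

# Spread evenly over the equipment life, since all of it must be sold in that window.
annual_sales = required_base_2031 / equipment_life_years

print(f"Required installed base: ${required_base_2031 / 1000:.2f}T")  # $4.07T
print(f"Implied annual sales:    ${annual_sales:.0f}B")               # $678B
```

The result is sensitive to both inputs: a higher assumed share of 2031 demand or a longer equipment life lowers the implied annual sales proportionally.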
