Open the original to read the full piece.
AI Value Capture - The Shift To Model Labs examines how agentic AI, hardware advances, and software optimizations have reshaped profit pools so that frontier model labs (e.g., Anthropic) now capture the lion’s share of value. The brief documents a rapid structural change beginning in December 2025: agentic workflows (code agents, multi-turn assistants) make each token far more useful and push demand higher, increasing model monetization even as the cost of producing tokens has plunged. SemiAnalysis quantifies this with internal examples: annualized Anthropic token run rates of up to $10.95M, per-employee consumption near 5B tokens/month, Claude Code input:output ratios of ≈300:1, and cache hit rates above 90%. It links those usage patterns to Anthropic’s ARR growth (reported from $9B to >$44B) and gross margin expansion (<40% → >70%).
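To make the usage figures above concrete, here is a minimal Python sketch of the arithmetic they imply. It rests on assumptions the summary does not confirm: that the 5B tokens/month figure counts input and output tokens together, that the ≈300:1 ratio applies to that total, and that the cache hit rate is measured as a share of input tokens (the source says ">90%", so 0.90 is used as a lower bound). The function and constant names are illustrative only.

```python
# Figures quoted in the summary above.
ANNUAL_TOKEN_SPEND_USD = 10_950_000      # "up to $10.95M" annualized
TOKENS_PER_EMPLOYEE_PER_MONTH = 5e9      # "near 5B tokens/month"
INPUT_TO_OUTPUT_RATIO = 300              # "≈300:1"
CACHE_HIT_RATE = 0.90                    # ">90%", treated here as a lower bound


def daily_spend(annual_spend_usd):
    """Convert an annualized spend into an average daily spend (365-day year)."""
    return annual_spend_usd / 365


def split_tokens(total_tokens, input_to_output, cache_hit_rate):
    """Split a token total into output, cached-input, and uncached-input tokens.

    Assumes `total_tokens` = input + output and that the cache hit rate
    is a fraction of input tokens (both are assumptions, not source facts).
    """
    output = total_tokens / (input_to_output + 1)
    input_tokens = total_tokens - output
    cached_input = input_tokens * cache_hit_rate
    uncached_input = input_tokens - cached_input
    return {
        "output": output,
        "cached_input": cached_input,
        "uncached_input": uncached_input,
    }


if __name__ == "__main__":
    print(f"Daily spend: ${daily_spend(ANNUAL_TOKEN_SPEND_USD):,.0f}")
    parts = split_tokens(
        TOKENS_PER_EMPLOYEE_PER_MONTH, INPUT_TO_OUTPUT_RATIO, CACHE_HIT_RATE
    )
    for name, tokens in parts.items():
        print(f"{name}: {tokens / 1e6:,.1f}M tokens/month")
```

Under these assumptions, only a small fraction of the monthly total is freshly generated output, and most input is served from cache. This is consistent with the summary's point that agentic workloads consume very large token volumes at a falling cost per token.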
The piece documents the technical levers behind the shift. New accelerators (Blackwell/GB300/VR NVL72) and ASICs (TPUv7, Trainium3), together with middleware improvements such as wide expert parallelism (wideEP), prefill/decode disaggregation, and multi-token prediction (MTP), multiply tokens/sec/GPU: software alone can produce ~14x gains on B300, and combined with hardware, SemiAnalysis reports GB300 NVL72 configurations delivering ~17x higher throughput than H100 in FP8 and up to ~32x in FP4.
Meanwhile, memory scarcity and pricing have surged (DRAM up ~6x year-over-year; one-year H100 rental prices up 40% since October 2025), making Nvidia’s socketed SOCAMM a crucial pricing lever. SemiAnalysis models SOCAMM at ~$8/GB in 1Q26, with the potential to exceed $13/GB by end-2026, and assumes ~$10/GB as a working figure. System-level economics are surprising: capex per watt barely changes from GB300 ($37.4/W) to Rubin (VR NVL72, $38.1/W) despite large TDP and FLOP increases, leaving substantial room between cost-based rental floors (~$4.92/hr/GPU for a 15.6% IRR) and value-based ceilings (~$12.25/hr/GPU at parity, or a more conservative ~$9.63/hr).
The authors present a ‘One Chart To Rule Them All’ that combines floor and ceiling constraints with Neocloud IRR curves to illustrate how incremental pricing by Nvidia or TSMC could shift captured value. Finally, SemiAnalysis argues that although TSMC and Nvidia could extract more given N3 and memory tightness (TSMC N3 utilization expected to exceed 100% in H2 2026; DRAM fabs above 90%), both firms have so far held pricing, in part for strategic and regulatory reasons. In the short term, value therefore accrues disproportionately to model labs, inference providers, neoclouds, and memory vendors unless system suppliers move to value-based pricing.
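The gap between the cost-based floor and the value-based ceiling can be sized directly from the figures quoted above. The Python sketch below does that and also shows one simple way a rental floor could be derived from a target IRR. The floor calculation is an illustration, not SemiAnalysis's model: it assumes level annual cash flows, no residual value, and caller-supplied capex, opex, lifetime, and utilization. All of those inputs are hypothetical, and none come from the source.

```python
HOURS_PER_YEAR = 8760

# Figures quoted in the summary (USD per GPU-hour, USD per watt, USD per GB).
RENTAL_FLOOR = 4.92
CEILING_PARITY = 12.25
CEILING_CONSERVATIVE = 9.63
CAPEX_PER_WATT = {"GB300": 37.4, "VR NVL72": 38.1}
SOCAMM_USD_PER_GB = {"1Q26": 8.0, "end-2026": 13.0}


def headroom(ceiling, floor):
    """Ratio of a value-based ceiling to a cost-based floor."""
    return ceiling / floor


def pct_change(new, old):
    """Percentage change from `old` to `new`."""
    return (new - old) / old * 100.0


def annuity_factor(rate, years):
    """Present value of 1 USD received at the end of each year for `years` years."""
    return (1 - (1 + rate) ** -years) / rate


def floor_price_per_hour(capex, annual_opex, target_irr, years, utilization):
    """Hourly price at which level annual cash flows return `target_irr`.

    Simplified model (an assumption): capex paid up front, constant revenue
    and opex each year, no residual value.
    """
    required_revenue = capex / annuity_factor(target_irr, years) + annual_opex
    return required_revenue / (HOURS_PER_YEAR * utilization)


def npv(rate, capex, annual_net_cash, years):
    """Net present value of an up-front capex followed by level annual cash flows."""
    return -capex + sum(annual_net_cash / (1 + rate) ** t for t in range(1, years + 1))


if __name__ == "__main__":
    print(f"Parity ceiling / floor: {headroom(CEILING_PARITY, RENTAL_FLOOR):.2f}x")
    print(f"Conservative ceiling / floor: {headroom(CEILING_CONSERVATIVE, RENTAL_FLOOR):.2f}x")
    change = pct_change(CAPEX_PER_WATT["VR NVL72"], CAPEX_PER_WATT["GB300"])
    print(f"Capex per watt, GB300 -> VR NVL72: {change:+.2f}%")
    rise = pct_change(SOCAMM_USD_PER_GB["end-2026"], SOCAMM_USD_PER_GB["1Q26"])
    print(f"SOCAMM $/GB, 1Q26 -> end-2026: {rise:+.1f}%")
```

On the quoted figures, the ceiling sits at roughly 2x to 2.5x the floor while capex per watt moves by less than 2%. That gap is the headroom the summary says Nvidia or TSMC could capture through value-based pricing.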
Agentic AI crossed a practical inflection point in December 2025, driving token value up and consumption higher: SemiAnalysis reports Anthropic ARR rising from $9B to over $44B year-to-date and inference gross margins expanding from <40% to >70%.