Tuhin Srivastava, CEO of Baseten, framed the conversation around the current “inference crunch”: rapid customer adoption of custom models, strained global GPU capacity, and the operational work required to run inference at scale. He began by noting Baseten’s recent hypergrowth (~30x year‑over‑year, with a revenue forecast above $1B) and argued that most real‑world production workloads today run on custom or post‑trained models; he put their share at more than 95% of tokens. Srivastava pointed to companies such as Abridge and OpenEvidence to illustrate why an independent application layer remains viable: firms with unique user signals and workflow integrations will keep fine‑tuning models and creating differentiated value that the labs alone cannot replicate.
The middle of the interview focused on infrastructure realities. Baseten runs 90 clusters across 18 clouds and operates at mid‑90s percent utilization, which Srivastava says exposes a hard supply shortage and a thin set of reliable suppliers. He described current commercial terms for large purchases (3–5 year contracts with ~20–30% of total contract value, or TCV, prepaid) and said this dynamic changes financing strategy, for example through earlier IPOs, debt or other structures.
Srivastava also explained why Baseten bought a post‑training research team: post‑training, quantization and inference are tightly coupled, and customers demand both performance and continual retraining loops. On hardware, he argued Nvidia (H100) still dominates because of supply chain and CUDA ecosystem advantages, but he expects specialized inference and decoder chips to emerge over time.
Operational scale issues, including kernel panics, logging overloads and immature LLM runtimes, underscore his view that winning requires excellent software, an ops culture of 24/7 readiness, and partnerships around evals, sandboxes and training APIs. He ended on an optimistic note: lower inference costs will drive more agentic, long‑horizon workflows and a proliferation of personalized “concierge” agents, increasing overall demand rather than saturating it.
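To make the contract terms Srivastava described more concrete, here is a minimal arithmetic sketch of what a 20–30% prepayment on total contract value could look like. The function name `prepayment_range` and the $10M annual spend are hypothetical illustrations, not figures from the interview; only the 3–5 year term and the 20–30% range come from the summary above, and the sketch assumes a flat annual price.

```python
def prepayment_range(annual_spend, years, low=0.20, high=0.30):
    """Return (tcv, low_prepay, high_prepay) for a flat-priced contract.

    annual_spend and years are hypothetical inputs; low/high default to
    the ~20-30% prepaid share of TCV mentioned in the interview summary.
    """
    tcv = annual_spend * years          # total contract value
    return tcv, tcv * low, tcv * high   # cash due up front at each bound


# Hypothetical example: $10M per year of GPU capacity on a 3-year term.
tcv, low_prepay, high_prepay = prepayment_range(10_000_000, 3)
print(f"TCV: ${tcv:,.0f}")
print(f"Prepayment range: ${low_prepay:,.0f} to ${high_prepay:,.0f}")
```

Under these assumed numbers, a meaningful fraction of a full year's spend is due before any capacity is used, which is one way to read the claim that such terms push companies toward debt or other financing structures.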
Baseten grew ~30x over the prior year, and Tuhin Srivastava said the company expects to exceed $1 billion in revenue in 2026.