Epoch AI 2025 impact report

title: Epoch AI 2025 impact report

author: The Epoch AI Team

content_type: article

publication: Epoch AI

published: 2025-04-01T00:00:00

source_url: https://epoch.ai/blog/epoch-impact-report-2025

word_count: 3315

In 2025, we saw AI continue to increase in scale and importance. AI companies reached annual revenues totalling tens of billions of dollars, and are building data centers that individually cost comparable amounts. Leading benchmarks show capabilities accelerating, driven by the establishment of reasoning models such as OpenAI’s oN model series. And we have seen an incredible diffusion of capabilities, with Chinese open-weight models such as DeepSeek R1 closing the gap with US frontier models released only months before.

Epoch AI has responded with new and expanded initiatives to advance its mission of sharing up-to-date information about – and making sense of – the trajectory of AI. We are excited to share a recap of our work in 2025, and our plans for 2026.

We are raising $3 million to execute a more ambitious version of our plans. Donations can be made directly through our website. For those considering a substantial contribution, or commissioning a project, please contact us at donate@epoch.ai.

Highlights from 2025

AI data centers & compute clusters

AI infrastructure became a major focus of investment and public attention in 2025. We pursued two related initiatives, starting with the creation of the GPU Clusters Data Explorer (originally called AI Supercomputers), followed by the ongoing build-out of the Frontier Data Centers Data Explorer, using satellite and permit data to track compute, power use, and construction timelines.

The Benchmarking Hub & the Epoch Capabilities Index (ECI)

Early in 2025, we launched a revamped version of our Benchmarking Hub. Its landing page was our most visited page in 2025. Focused on top AI models, the hub gathers evaluations reported by developers and third parties, as well as those run by Epoch.

As individual benchmarks saturate, it has become harder to compare frontier models using any single score. To address this, we introduced the Epoch Capabilities Index (ECI) in October, a composite metric that aggregates performance across multiple benchmarks to provide a more stable measure of model capability. The ECI combines at least four benchmark scores per model, drawing from over three dozen benchmarks in total, and performs well as a predictor of benchmark performance. This approach was developed as part of the “Rosetta Stone” collaboration with researchers from Google DeepMind.
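As a rough illustration of what a composite capability metric involves, the sketch below averages per-benchmark z-scores and skips models with fewer than four scores. This is a deliberately simplified stand-in, not the ECI methodology, which is not specified here. The function name, the z-score approach, and all model and benchmark names and scores are assumptions made up for the example. Only the four-score minimum comes from the text above.

```python
from statistics import mean, pstdev

MIN_BENCHMARKS = 4  # the text states at least four benchmark scores per model


def composite_index(scores, min_benchmarks=MIN_BENCHMARKS):
    """Average per-benchmark z-scores for each model with enough coverage.

    scores: {model_name: {benchmark_name: score}}
    Returns {model_name: index} for models with >= min_benchmarks scores.
    This is a simple stand-in, not the actual ECI calculation.
    """
    # Collect every score reported for each benchmark.
    by_benchmark = {}
    for model_scores in scores.values():
        for bench, value in model_scores.items():
            by_benchmark.setdefault(bench, []).append(value)

    # Per-benchmark mean and spread, so different scales become comparable.
    stats = {}
    for bench, values in by_benchmark.items():
        spread = pstdev(values)
        stats[bench] = (mean(values), spread if spread > 0 else 1.0)

    index = {}
    for model, model_scores in scores.items():
        if len(model_scores) < min_benchmarks:
            continue  # too few scores to place this model
        z = [(v - stats[b][0]) / stats[b][1] for b, v in model_scores.items()]
        index[model] = mean(z)
    return index


# Hypothetical models and scores, invented purely for illustration.
example = {
    "model_a": {"b1": 0.2, "b2": 0.3, "b3": 0.1, "b4": 0.4},
    "model_b": {"b1": 0.6, "b2": 0.7, "b3": 0.5, "b4": 0.8},
    "model_c": {"b1": 0.9, "b2": 0.9},  # fewer than four scores: skipped
}
ranking = composite_index(example)
```

A plain average like this is sensitive to which benchmarks each model happens to have been run on, which is one reason a more careful aggregation method is needed in practice.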

Why this matters: Benchmark evaluations are one of the most straightforward – yet also ephemeral – ways to measure improvements in AI capabilities. Our Capabilities Index highlights the broader trend of improvements. For example, the ECI helped us identify a potential acceleration in AI capabilities near April 2024.

FrontierMath Tier 4

We completed and delivered FrontierMath Tier 4, a new tier of difficulty for our math benchmark, commissioned by OpenAI. Tier 4 consists of a collection of 50 research-level problems, including 2 public problems and a 20-question private holdout set. The problems were crafted by a team including world-leading university mathematics professors and postgraduate researchers, and were designed to test deep mathematical reasoning.

Most Tier 4 problems were designed or improved in a symposium attended by leading mathematicians, where problems were tested and approved by a panel of experts. Compared to FrontierMath Tiers 1-3, this has resulted in problems that are more difficult and harder to game with shortcuts, improving our ability to recognize genuinely strong mathematical reasoning that would impress a professional mathematician.

Why this matters: FrontierMath Tier 4’s focus on research-level problems allows us to track the capacity of new AI models to contribute to mathematical research. It is a benchmark that has remained largely unsaturated, with only 17 out of the 48 private questions solved across all models as of January 2026.

Growth and AI Transition Endogenous (GATE) model

We released GATE, a framework for exploring how AI automation can affect the entire economy. GATE is a macroeconomic model that describes how investment in AI hardware and R&D could lead to increased automation and productivity, which in turn enables further investment in automation. Our model illustrates how we could see explosive growth from AI, with over a fifth of the economy’s yearly output reinvested into AI, even under conditions of uncertainty about the total degree of automatability.
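The feedback loop described above (investment raises automation, automation raises output, and higher output funds more investment) can be sketched as a toy simulation. This is not the GATE model and does not use its equations. The function, its update rules, and every parameter value are hypothetical placeholders chosen only to show the shape of the loop.

```python
def simulate_feedback(years=10, output=1.0, reinvest_share=0.2,
                      max_automation=0.9, automation_per_unit=0.05,
                      productivity_gain=0.5):
    """Toy loop: reinvested output raises automation, which raises output.

    Every parameter is a hypothetical placeholder, not a GATE value.
    Returns a list of (year, output, automated_fraction) tuples.
    """
    automated = 0.0
    path = []
    for year in range(years):
        path.append((year, output, automated))
        investment = reinvest_share * output
        # Investment automates more tasks, saturating at max_automation.
        automated = min(max_automation,
                        automated + automation_per_unit * investment)
        # A larger automated fraction raises the next year's growth rate.
        output *= 1.0 + productivity_gain * automated
    return path


path = simulate_feedback()
```

In this toy version the growth rate rises each year until the automation cap binds. That shows the compounding dynamic in miniature, but the numbers carry no predictive meaning.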

Why this matters: GATE is the most complete macroeconomic model we are aware of for the effects of AI automation. Further work would involve calibrating the model with real-world data, and fine-tuning the equations and parameters to produce realistic predictions of the AI trajectory. As a framework, it is already enabling economists and researchers to understand key dynamics of AI development.

Data Insights & Gradient Updates

During 2025, we responded to the increased tempo of AI developments by publishing shorter material on a weekly cadence through two distinct formats.

Our Data Insights are short, authoritative data investigations centered around a key graph and takeaway, meant to be accessible and citable sources for important AI trends. Popular Data Insights in 2025 include our analyses of inference price efficiency, AI accessible on consumer hardware, and OpenAI’s allocation of compute between inference and development.

Our Gradient Updates newsletter offers leading-edge commentary and (when appropriate) speculative forays into important AI topics by individual authors, including some guest posts. Topics that we covered include a breakdown of the innovations introduced by DeepSeek v3, the energy costs of ChatGPT, and an analysis of how far reasoning models could scale.

Why this matters: Our audience is busy, and the AI industry moves fast. To help readers interpret our data efficiently, we publish concise analyses that highlight key insights and context, with timely commentary. These shorter formats complement our databases and longer reports.

AI in 2030

In a report commissioned by Google DeepMind, we extrapolated existing trends in scaling compute, power, and data for training to understand the required inputs to maintain the current trend of progress to 2030. We also examined potential bottlenecks to scaling, and in each case found that they are likely to be surmountable.
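The basic arithmetic behind extrapolating an exponential trend is compound growth. The sketch below projects a quantity forward under a constant yearly growth multiplier. The function name and the input values are hypothetical and are not figures from the report.

```python
def extrapolate(value, annual_growth, start_year, end_year):
    """Project a quantity forward assuming a constant yearly growth multiplier."""
    if end_year < start_year:
        raise ValueError("end_year must not precede start_year")
    return value * annual_growth ** (end_year - start_year)


# Hypothetical input: 1.0 (arbitrary units) in 2025, doubling every year.
projected_2030 = extrapolate(1.0, annual_growth=2.0,
                             start_year=2025, end_year=2030)
# Five doublings: 2 ** 5 == 32 times the starting value.
```

A real extrapolation would fit the growth rate to historical data and check whether bottlenecks such as power or data could bend the trend, which is the question the report examines.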

We then extrapolated how this would affect performance in four domains: software engineering, mathematics, molecular biology, and weather prediction.

Why this matters: This report presents a core feature of how we currently understand the trajectory of AI: exponentially larger investments and inputs to development can lead to large advances in performance. This analysis then supports concrete extrapolations and insight into how such improved AI capabilities can affect science.

Epoch AI by the numbers

Outputs

New Data Explorers

New Data Insights

Gradient Updates (newsletter) issues

Reports & Papers

Podcast episodes released
