Epoch AI

Epoch AI 2025 impact report

Brief

Epoch AI’s 2025 impact report positions the organization as a data-and-analysis layer for understanding frontier AI scaling, especially where model capability intersects with compute, infrastructure, and economic implications. Its most concrete contributions were new datasets on GPU clusters and frontier data centers, where it uses satellite and permitting data to track construction timelines, power requirements, and likely compute build-out. That focus is especially notable given the report’s framing that AI companies are already generating annual revenues in the tens of billions of dollars while building individual data centers with similarly large price tags. On the model-evaluation side, Epoch argues that single benchmarks are increasingly saturated, so it introduced the Epoch Capabilities Index, aggregating results from dozens of benchmarks to create a more stable cross-model capability measure.

The report also highlights benchmark creation and macro modeling. FrontierMath Tier 4, commissioned by OpenAI, is a research-level benchmark designed with mathematicians to resist shortcut exploitation; only 17 of 48 private questions had been solved across all models by January 2026. GATE extends Epoch’s work from capability tracking into economic forecasting, modeling feedback loops between AI investment, automation, and productivity. Institutionally, Epoch has become more visible and financially substantial: it spun out as an independent 501(c)(3), spent $5 million in 2025, employed 21 full-time staff, and undertook commissioned work for OpenAI, Google DeepMind, xAI, EPRI, ARIA, and policy bodies such as the UK AI Security Institute and EU AI Office. Its 2026 roadmap leans further into AI infrastructure, supply chains, energy demand, and benchmarking coverage.

Why it matters

Epoch AI’s 2025 work centered on tracking AI infrastructure and capability growth, including a GPU Clusters Data Explorer and a Frontier Data Centers Data Explorer that use satellite imagery and permit data to estimate compute capacity, power use, and construction timelines for frontier AI facilities.

Key details

  • In October 2025, Epoch launched the Epoch Capabilities Index (ECI), a composite frontier-model metric built from at least four benchmark scores per model and drawing on more than three dozen benchmarks; the method emerged from its “Rosetta Stone” collaboration with Google DeepMind and was used to identify a possible acceleration in AI capabilities around April 2024.
  • Epoch completed FrontierMath Tier 4 for OpenAI: 50 research-level math problems, including 2 public questions and a 20-question private holdout, produced with university mathematicians; as of January 2026, models had solved only 17 of the 48 private questions, indicating the benchmark remains far from saturated.
  • The organization released GATE, a macroeconomic model of AI-driven automation in which investment in AI hardware and R&D feeds back into productivity and further automation; in some scenarios, the model shows more than 20% of annual economic output being reinvested into AI.
  • Epoch reported 2025 output of 4 new Data Explorers, 38 Data Insights, 40 Gradient Updates, and 14 reports/papers, alongside 987,000 active website users, 10,300 newsletter subscribers, 31,000 Twitter followers, $5 million spent, 21 full-time staff, and 13 commissioned research or consulting engagements.
  • For 2026, Epoch plans to expand into upstream AI supply-chain tracking such as fabs and semiconductor tools, build new benchmarks for automatically verifiable open math problems and long-horizon software engineering tasks with METR, and investigate topics including compute efficiency, AI enterprise adoption, training-data constraints, robotics, and AI power demand.

Cleaned source text

title: Epoch AI 2025 impact report

author: The Epoch AI Team

content_type: article

publication: Epoch AI

published: 2025-04-01T00:00:00

source_url: https://epoch.ai/blog/epoch-impact-report-2025

word_count: 3315

In 2025, we saw AI continue to increase in scale and importance. AI companies reached annual revenues totalling tens of billions of dollars, and are building data centers that individually cost comparable amounts. Leading benchmarks show capabilities accelerating, propped up by the establishment of reasoning models such as OpenAI’s oN model series. And we have seen an incredible diffusion of capabilities, with Chinese open-weight models such as DeepSeek R1 closing the gap with US frontier models released only months before.

Epoch AI has responded with new and expanded initiatives to advance its mission of sharing up-to-date information about – and making sense of – the trajectory of AI. We are excited to share a recap of our work in 2025, and our plans for 2026.

We are raising $3 million to execute a more ambitious version of our plans. Donations can be made directly through our website. For those considering a substantial contribution, or commissioning a project, please contact us at donate@epoch.ai.

Highlights from 2025

AI data centers & compute clusters

AI infrastructure became a major focus of investment and public attention in 2025. We pursued two related initiatives, starting with the creation of the GPU Clusters Data Explorer (originally called AI Supercomputers), followed by the ongoing build-out of the Frontier Data Centers Data Explorer, using satellite and permit data to track compute, power use, and construction timelines.

The Benchmarking Hub & the Epoch Capabilities Index (ECI)

Early in 2025, we launched a revamped version of our Benchmarking Hub. Its landing page was our most visited page in 2025. Focused on top AI models, the hub gathers evaluations reported by developers and third parties, as well as those run by Epoch.

As individual benchmarks saturate, it has become harder to compare frontier models using any single score. To address this, we introduced the Epoch Capabilities Index (ECI) in October, a composite metric that aggregates performance across multiple benchmarks to provide a more stable measure of model capability. The ECI combines at least four benchmark scores per model, drawing from over three dozen benchmarks in total, and performs well as a predictor of benchmark performance. This approach was developed as part of the “Rosetta Stone” collaboration with researchers from Google DeepMind.

Why this matters: Benchmark evaluations are one of the most straightforward – yet also ephemeral – ways to measure improvements in AI capabilities. Our Capabilities Index highlights the broader trend of improvements. For example, the ECI helped us identify a potential acceleration in AI capabilities near April 2024.
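To make the idea of a composite index concrete, here is a minimal sketch of one simple way to aggregate scores when models are evaluated on different subsets of benchmarks. It is not Epoch's ECI methodology: it standardises each benchmark and averages the resulting z-scores, an approach assumed here only for brevity. The one detail taken from the report is the requirement of at least four benchmark scores per model; the function name, model names, benchmark names, and scores are all hypothetical.

```python
from statistics import mean, pstdev

MIN_SCORES = 4  # from the report: at least four benchmark scores per model


def composite_scores(scores, min_scores=MIN_SCORES):
    """Map {model: {benchmark: score}} to {model: mean z-score}.

    Assumption: a plain z-score average, NOT the actual ECI method.
    """
    # Group scores by benchmark so each benchmark is standardised separately.
    by_benchmark = {}
    for results in scores.values():
        for bench, value in results.items():
            by_benchmark.setdefault(bench, []).append(value)
    stats = {b: (mean(v), pstdev(v)) for b, v in by_benchmark.items()}

    composite = {}
    for model, results in scores.items():
        if len(results) < min_scores:
            continue  # too few benchmarks to produce a stable composite
        z_scores = [
            (value - stats[bench][0]) / stats[bench][1]
            for bench, value in results.items()
            if stats[bench][1] > 0  # skip benchmarks with no spread
        ]
        if z_scores:
            composite[model] = mean(z_scores)
    return composite


# Hypothetical toy data: model_c has only three scores and is excluded.
toy_scores = {
    "model_a": {"b1": 0.9, "b2": 0.8, "b3": 0.7, "b4": 0.6},
    "model_b": {"b1": 0.5, "b2": 0.4, "b3": 0.3, "b4": 0.2},
    "model_c": {"b1": 0.7, "b2": 0.6, "b3": 0.5},
}
print(composite_scores(toy_scores))
```

Averaging standardised scores means no single saturated benchmark dominates the composite, which is the motivation the section gives for moving beyond any one score.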

FrontierMath Tier 4

We completed and delivered FrontierMath Tier 4, a new tier of difficulty for our math benchmark, commissioned by OpenAI. Tier 4 consists of a collection of 50 research-level problems, including 2 public problems and a 20-question private holdout set. The problems, designed to test deep mathematical reasoning, were crafted by a team that included world-leading university mathematics professors and postgraduate researchers.

Most Tier 4 problems were designed or improved at a symposium attended by leading mathematicians, where problems were tested and approved by a panel of experts. Compared to FrontierMath Tiers 1-3, this process produced problems that are more difficult and harder to game with shortcuts, improving our ability to recognize genuinely strong mathematical reasoning that would impress a professional mathematician.

Why this matters: FrontierMath Tier 4’s focus on research-level problems allows us to track the capacity of new AI models to contribute to mathematical research. It is a benchmark that has remained largely unsaturated, with only 17 out of the 48 private questions solved across all models as of January 2026.

Growth and AI Transition Endogenous (GATE) model

We released GATE, a framework for exploring how AI automation can affect the entire economy. GATE is a macroeconomic model that describes how investment in AI hardware and R&D could lead to increased automation and productivity, which in turn enables further investment in automation. Our model illustrates how AI could produce explosive growth, with over a fifth of the economy’s yearly output reinvested into AI, even under uncertainty about the total degree of automatability.

Why this matters: GATE is the most complete macroeconomic model we are aware of for the effects of AI automation. Further work would involve calibrating the model with real-world data, and fine-tuning the equations and parameters to produce realistic predictions of the AI trajectory. As a framework, it is already enabling economists and researchers to understand key dynamics of AI development.
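The feedback loop described above (investment raises automation, automation raises productivity, and higher output funds more investment) can be illustrated with a toy simulation. This is a sketch under strong assumptions and does not reproduce GATE's equations: the functional forms, the function name, and every parameter value below are hypothetical placeholders chosen only to show the loop's qualitative shape.

```python
def simulate_feedback(years=10, output=100.0, automation=0.05,
                      reinvest_rate=0.2, automation_gain=0.0005,
                      base_growth=0.02, automation_boost=0.1):
    """Toy investment -> automation -> growth loop (NOT the GATE model).

    Returns a list of (year, output, automation_share) tuples.
    All parameters are illustrative assumptions, not calibrated values.
    """
    path = []
    for year in range(1, years + 1):
        # A fixed share of this year's output is reinvested into AI.
        investment = reinvest_rate * output
        # Investment raises the automated share of tasks, capped at 100%.
        automation = min(1.0, automation + automation_gain * investment)
        # Growth is a baseline rate plus a term that rises with automation.
        growth = base_growth + automation_boost * automation
        output *= 1.0 + growth
        path.append((year, output, automation))
    return path


for year, output, automation in simulate_feedback(years=5):
    print(f"year {year}: output={output:.1f}, automation={automation:.3f}")
```

Because each year's growth rate depends on automation, and automation depends on prior output, growth accelerates over time in this toy setup. A calibrated model would replace these linear placeholders with estimated relationships.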

Data Insights & Gradient Updates

During 2025, we responded to the increased tempo of AI developments by publishing shorter material on a weekly cadence through two distinct formats.

Our Data Insights are short, authoritative data investigations centered around a key graph and takeaway, meant to be accessible and citable sources for important AI trends. Popular Data Insights in 2025 include our analyses of inference price efficiency, of AI accessible on consumer hardware, and of OpenAI’s allocation of compute between inference and development.

Our Gradient Updates newsletter offers leading-edge commentary and (when appropriate) speculative forays into important AI topics by individual authors, including some guest posts. Topics that we covered include a breakdown of the innovations introduced by DeepSeek v3, the energy costs of ChatGPT, and an analysis of how far reasoning models could scale.

Why this matters: Our audience is busy, and the AI industry moves fast. To help readers interpret our data efficiently, we publish concise analyses that highlight key insights and context, with timely commentary. These shorter formats complement our databases and longer reports.

AI in 2030

In a report commissioned by Google DeepMind, we extrapolated existing trends in scaling compute, power, and data for training to understand the inputs required to maintain the current trend of progress to 2030. We also examined potential bottlenecks to scaling, and in each case found that they are likely to be surmountable.

We then extrapolated how this would affect performance in four domains: software engineering, mathematics, molecular biology, and weather prediction.

Why this matters: This report presents a core feature of how we currently understand the trajectory of AI: exponentially larger investments and inputs to development can lead to large advances in performance. This analysis then supports concrete extrapolations and insight into how such improved AI capabilities can affect science.
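As an illustration of what extrapolating an exponential trend involves, the sketch below fits a straight line to the logarithm of a quantity and projects it forward to 2030. It is a generic log-linear fit, not the method used in the report, and the data points are made-up placeholders (a quantity that quadruples each year), not Epoch estimates. The function names are hypothetical.

```python
import math


def fit_exponential(years, values):
    """Least-squares fit of log(value) = a + b * year; returns (a, b)."""
    logs = [math.log(v) for v in values]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    b = sxy / sxx            # slope: log growth per year
    a = mean_y - b * mean_x  # intercept
    return a, b


def extrapolate(a, b, year):
    """Project the fitted exponential trend to a given year."""
    return math.exp(a + b * year)


# Placeholder data only: a quantity growing 4x per year from a base of 1.
toy_years = [2021, 2022, 2023, 2024]
toy_values = [1.0, 4.0, 16.0, 64.0]

a, b = fit_exponential(toy_years, toy_values)
print(f"fitted growth factor per year: {math.exp(b):.2f}")
print(f"projected value in 2030: {extrapolate(a, b, 2030):.0f}")
```

A projection like this only holds if the trend continues, which is why the report pairs its extrapolations with an examination of potential bottlenecks.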

Epoch AI by the numbers

Outputs

New Data Explorers

New Data Insights

Gradient Updates (newsletter) issues

Reports & Papers

Podcast episodes released

Other interviews and explainer videos

Reach

Notable mentions in media & reports

987,000 active website users

6,700 unique domains linked to the site

10,300 current newsletter subscribers

7,500 LinkedIn subscribers

31,000 Twitter followers

Finances and organization

$5 million spent

Full-time staff

Commissioned research projects and consultations

Press and citations

Our work has been extensively referenced by decision-makers and covered by the media. Below is a short selection of notable mentions, illustrating our research’s influence in the AI discourse.

All chips in! Would a fall in AI-related asset valuations have financial stability consequences?

Regulating Artificial Intelligence: U.S. and International Approaches and Considerations for Congress

Paid engagements

We entered several paid engagements with organizations from government, the AI industry, and investors to help them understand the future of AI and to complement our funding from donations.

On the benchmarking front, we released the aforementioned FrontierMath Tier 4, commissioned by OpenAI. We also started a collaboration funded by METR to develop a long-horizon software engineering benchmark. On model evaluations, xAI and Google DeepMind commissioned in-depth evaluations of the math capabilities of Grok 4 and Gemini Deep Think.

Google DeepMind also commissioned the AI in 2030 research report, as well as our co-authored “Rosetta Stone” paper. In the latter, we developed a method to aggregate and “translate” across benchmark scores for models evaluated on different subsets of benchmarks over multiple years. This collaboration helped us independently launch our Epoch Capabilities Index (ECI).

Other paid engagements include a report on the power demand for AI training, commissioned by the energy nonprofit EPRI; data insights commissioned by the AI Index and the Advanced Research and Invention Agency (ARIA); and consultations with the UK AI Security Institute, the EU AI Office, Sequoia Capital, and Bridgewater Associates. You can learn more about our consultancy services here.

Events and other engagements

Throughout the year, we participated in a number of events and unpaid engagements to disseminate our research and inform the public about AI.

We gave briefings to U.S. congresspeople at the Aspen Institute Congressional AI Conference, to the UK AI directorate, and to Capitol Hill staffers, and we delivered talks for institutions such as Schmidt Sciences and the Institute for Progress.

We also participated in events including the EPRI annual conference on energy utilities, the Robotics: Science and Systems (RSS) conference, UK DSIT’s first AI Energy Council, the Global Hive Datacenter Summit, The Curve, and OpenAI’s Economic Research Conference.

Additionally, our staff were guests on prominent podcasts, including a16z and Dwarkesh, and we regularly engage with and are interviewed by journalists. Reach out to media@epoch.ai to arrange an interview, briefing, or talk.

Testimonials from our audience

We’re in regular dialogue with key stakeholders who benefit from our work. In our 2025 impact survey, we asked them to share how our research has supported their work in the past year. Here are some of their responses.

A key crux for CG’s AI grantmaking strategy is AI timelines. As we’ve worked on an updated timelines analysis to inform that crux, Epoch reports have been a frequent source of parameter value estimates.

Managing Director, AI Governance & Policy, Coefficient Giving

Epoch is highly trusted by all camps. At least in DC federal policy circles, I’ve heard people say ‘I like what Epoch wrote’… You’re not seen as having ‘motivated reasoning’ in the same way that the heavily-silo-ed AI discourse is.

Research Director, Golden Gate Institute for AI

We used the Epoch Capabilities Index to determine what the recent rate of software progress has been, which is input as a parameter into our upcoming timelines+takeoff model.

Researcher, AI Futures Project