Open the original to read the full piece.
Epoch AI's analysis finds that AI capabilities have recently accelerated on three of the four metrics it measured: the Epoch Capabilities Index (ECI), the log of METR's 50% time horizon, and a Combined Math Index. The acceleration coincides with the rollout of "reasoning" models in 2024. The study fit eight parametric curves by unweighted least squares and evaluated them with expanding-window cross-validation (primary horizon of 6 months), plus leave-one-out and contiguous-block perturbation checks to assess stability. For the three accelerating metrics, the best single explanatory fit was a reasoning/non-reasoning split: reasoning models show a discrete level jump plus roughly 2–3× steeper linear progress than non-reasoning models, although several other superlinear fits also perform well. The Combined Math Index was produced by a two-parameter logistic (2PL) item response theory (IRT) fit across multiple math benchmarks (ridge penalties λa = 0.378, λb = 0.0001) and rescaled so that Sonnet 3.5 maps to 130 and GPT-5 maps to 150. The fourth metric, WeirdML V2, showed no acceleration, possibly because of task constraints and limited pre-2024 data. The authors caution that the acceleration may be concentrated in programming and math, domains where correctness is easy to verify automatically and RL has been heavily applied.
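To make the evaluation procedure concrete, here is a minimal Python sketch of expanding-window cross-validation for a trend fit by unweighted least squares, together with the two-point rescaling used for the Math Index. It is an illustration under stated assumptions, not Epoch AI's code: the function names, the `min_train` threshold, the use of a polynomial as the candidate curve, and the choice of RMSE as the error measure are all hypothetical, and only the 6-month horizon and the 130/150 anchor values come from the summary above.

```python
import numpy as np


def rescale_two_point(raw, raw_lo, raw_hi, target_lo=130.0, target_hi=150.0):
    """Affine map sending raw_lo -> target_lo and raw_hi -> target_hi.

    The 130/150 defaults mirror the anchors quoted in the summary; which raw
    scores correspond to them is an input here, not something this sketch knows.
    """
    slope = (target_hi - target_lo) / (raw_hi - raw_lo)
    return target_lo + slope * (np.asarray(raw, dtype=float) - raw_lo)


def expanding_window_cv(t, y, horizon=0.5, min_train=4, degree=1):
    """Root-mean-square forecast error under expanding-window CV.

    t: release dates in years; y: metric scores.
    For every cutoff date, fit a polynomial of the given degree (unweighted
    least squares) to all points at or before the cutoff, then score the
    points released within `horizon` years after it. `horizon=0.5` is the
    6-month primary horizon; `min_train` is an assumed minimum sample size.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    squared_errors = []
    for cutoff in np.unique(t):
        train = t <= cutoff
        test = (t > cutoff) & (t <= cutoff + horizon)
        if train.sum() < min_train or not test.any():
            continue  # too little history, or nothing to forecast
        coeffs = np.polyfit(t[train], y[train], degree)
        predictions = np.polyval(coeffs, t[test])
        squared_errors.extend((predictions - y[test]) ** 2)
    if not squared_errors:
        raise ValueError("no cutoff had both enough training data and test points")
    return float(np.sqrt(np.mean(squared_errors)))
```

Comparing `expanding_window_cv(t, y, degree=1)` against `expanding_window_cv(t, y, degree=2)` on the same series gives a rough out-of-sample check of whether a superlinear curve forecasts better than a straight line, which is the kind of comparison the study runs across its eight candidate curves.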
Three of the four metrics show clear acceleration coincident with the emergence of "reasoning" models in 2024: the Epoch Capabilities Index (ECI), the log of METR's 50% time horizon, and a Combined Math Index. Reasoning models exhibit a one-off performance jump and an approximately 2–3× faster linear trend than non-reasoning models.
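One way to picture a "level jump plus steeper slope" model is a dummy-variable regression in which reasoning models get their own intercept and slope. The sketch below assumes that parameterization, which the summary does not spell out; the function names and returned keys are hypothetical, and the study's actual specification may differ.

```python
import numpy as np


def fit_reasoning_split(t, y, is_reasoning):
    """Fit y = a + b*t + r*(c + d*t) by unweighted least squares.

    t: release dates in years; y: metric scores;
    is_reasoning: 1 for reasoning models, 0 otherwise (the indicator r).
    Assumed parameterization: separate straight lines for the two groups.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.asarray(is_reasoning, dtype=float)
    design = np.column_stack([np.ones_like(t), t, r, r * t])
    (a, b, c, d), *_ = np.linalg.lstsq(design, y, rcond=None)
    return {
        "base_intercept": a,
        "base_slope": b,
        "reasoning_intercept": a + c,
        "reasoning_slope": b + d,
        "slope_ratio": (b + d) / b,  # how much steeper the reasoning trend is
    }


def level_jump(fit, t0):
    """Gap between the reasoning and non-reasoning trend lines at date t0."""
    reasoning = fit["reasoning_intercept"] + fit["reasoning_slope"] * t0
    baseline = fit["base_intercept"] + fit["base_slope"] * t0
    return reasoning - baseline
```

Under this parameterization, a `slope_ratio` between 2 and 3 would correspond to the "2–3× faster linear trend" described above, and `level_jump` evaluated at the date the first reasoning model appeared would correspond to the one-off performance jump.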