Epoch AI

Announcing our expanded biology AI coverage

Brief

Epoch AI's expanded Biology AI Dataset now catalogs over 360 biological ML models (announced 2025-01-29), with curated metadata on developers, intended tasks and training datasets, plus newly reported estimates of training compute. The team prioritized recent frontier models and produced a visualization tracking how training compute and dataset size have evolved, revealing substantial scaling from 2017 to 2021 and a relative slowdown thereafter. The update also inventories developer-adopted safeguards (data filtering, risk evaluations, inference-time refusals and access controls) but finds that fewer than 3% of models list any safeguards, with larger foundation models (e.g., EvolutionaryScale’s ESM 3 and AlphaFold 3) more likely to have protections. Sentinel Bio funded the effort; Epoch AI owns the dataset and distributes it under CC-BY. Detailed safeguards records are available to vetted researchers on request via safeguards@epoch.ai. Authors: Pablo Villalobos and David Atanasov.

Why it matters

Epoch AI expanded its Biology AI Dataset to cover 360+ models (announcement published 2025-01-29), adding metadata on developers, intended tasks and training datasets, along with new estimates of training compute.

Key details

  • Epoch AI’s analysis finds a rapid increase in training compute and dataset sizes from 2017 to 2021, followed by a relative slowdown in biological model development after 2021, as shown in its compute/dataset-size visualization.
  • Fewer than 3% of models in the dataset have documented safeguards (examples: data filtering, risk evaluations, inference-time refusals, access controls); the most capable models, such as EvolutionaryScale’s ESM 3 and AlphaFold 3, tend to have more safeguards.
  • Sentinel Bio funded the data-collection project; Epoch AI owns and publishes the dataset under a Creative Commons Attribution license, and detailed safeguards data is available on request (email safeguards@epoch.ai).