Epoch AI's expanded Biology AI Dataset, announced on 2025-01-29, now catalogs over 360 biological ML models, with curated metadata on developers, intended tasks, and training datasets, plus newly reported estimates of training compute. The team prioritized recent frontier models and produced a visualization of how training compute and dataset size have evolved, which shows substantial scaling from 2017 to 2021 and a relative slowdown thereafter. The update also inventories developer-adopted safeguards (data filtering, risk evaluations, inference-time refusal, and access controls) but finds that fewer than 3% of models list any, with larger foundation models (e.g., EvolutionaryScale’s ESM 3 and AlphaFold 3) more likely to have protections. Sentinel Bio funded the effort; Epoch AI owns the dataset and distributes it under CC-BY. Detailed safeguards records are available to vetted researchers on request via safeguards@epoch.ai. Authors: Pablo Villalobos and David Atanasov.
Epoch AI expanded its Biology AI Dataset to cover 360+ models (announcement published 2025-01-29), adding metadata on developers, intended tasks, and training datasets, along with new estimates of training compute for each model.