title: @randallbalestr: We start having provable measures of alignment between pretraining setups and eval perfs:
- CV arxiv...
author: @randallbalestr
contenttype: tweet
publication: Twitter/X
published: 2026-01-01T15:53:42+00:00
sourceurl: https://x.com/randall_balestr/status/2006755721862590623
word_count: 82
We are starting to have provable measures of alignment between pretraining setups and eval performance:
- CV: arxiv.org/abs/2402.11337
- CV + noise: arxiv.org/abs/2505.12477
- MAE: arxiv.org/abs/2508.15404
- JEPA: arxiv.org/abs/2205.11508
- LLM/NLP: arxiv.org/abs/2505.17169
Very early but promising!
Haider. (@slow_developer)
Mathematician Terence Tao:
Training and running LLMs isn't mathematically difficult; any math undergrad could understand the basics.
The mystery is that we have no theory to predict why models excel at certain tasks and fail at others.
"we can only make empirical experiments"
— https://nitter.net/slow_developer/status/2006364731037139092#m