🚨 BREAKING: AISI tested a newer Mythos Preview checkpoint.
These numbers are insane:
> solved "The Last Ones", a 20hr task, in 6/10 attempts
> solved the previously unsolved "Cooling Tower" task in 3/10 attempts
> cyber task time horizons went from doubling every 8 months to doubling every ~4.7 months
> Mythos and GPT-5.5 both exceeded that trend
If the doubling time keeps shrinking like this, the growth is super-exponential, not just exponential.
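For a rough sense of what those doubling times imply, here is a minimal back-of-the-envelope sketch. It only uses the 8-month and ~4.7-month figures quoted above. The function names and the `shrink` factor of 0.8 are hypothetical, picked for illustration, and are not anything AISI reported.

```python
def annual_multiplier(doubling_months: float) -> float:
    """Growth factor over 12 months for a fixed doubling time (plain exponential)."""
    return 2.0 ** (12.0 / doubling_months)


def months_for_doublings(n: int, first_doubling: float, shrink: float) -> float:
    """Total months needed for n doublings when each doubling takes
    `shrink` times as long as the one before it.

    shrink == 1.0 is ordinary exponential growth;
    shrink < 1.0 is one simple (assumed) model of super-exponential growth.
    """
    return sum(first_doubling * shrink ** k for k in range(n))


# Yearly growth in task time horizon under each quoted doubling time.
old_rate = annual_multiplier(8.0)   # 2 ** 1.5, roughly 2.8x per year
new_rate = annual_multiplier(4.7)   # roughly 5.9x per year

# Time for 10 doublings: constant doubling time vs. a hypothetical 20% shrink per doubling.
steady = months_for_doublings(10, 8.0, 1.0)
shrinking = months_for_doublings(10, 8.0, 0.8)

print(f"8-month doubling:   {old_rate:.2f}x per year")
print(f"4.7-month doubling: {new_rate:.2f}x per year")
print(f"10 doublings, constant doubling time:  {steady:.1f} months")
print(f"10 doublings, shrinking doubling time: {shrinking:.1f} months")
```

With a constant doubling time, the total time grows linearly with the number of doublings. With a shrinking one, the total is a geometric series that stays bounded, which is what separates "super-exponential" from an exponential that is merely fast. Two data points are not enough to say which regime this is.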
Ignore the people who say: "It's just better at hacking."
It looks like long-horizon autonomy crossing a threshold.
Cyber just happens to be the domain where Anthropic allowed organizations to test it first.
There is no obvious reason to assume this stays confined to cyber.
We are in the takeoff. You can feel it.
AI Security Institute (@AISecurityInst)
Mythos Preview also solved "Cooling Tower", our industrial control system range, in 3 of 10 attempts.
— https://nitter.net/AISecurityInst/status/2054589766490825081#m