Twitter/X

@antirez:

Needless to say, this is even truer of standardized benchmarks, where benchmaxing is a real temptation and is actually part of the game. Yet benchmarks tell us that GPT 5.5 is generally better than Opus, which is IMHO true. So they have value.