a language model pointing out serious flaws in a math bench is a little jarring.
the models are about to become much better.
we’re in for quite the ride in the coming months.