It is unethical for a doctor not to consult an AI!
A new study published in Science shows that OpenAI's o1 model (not 5.5, but o1, which is over a year old!) outperformed ER physicians at diagnosing patients. It identified the correct or near-correct diagnosis 67% of the time versus 50–55% for doctors, especially in early triage, when information is limited.
The model also scored near-perfect on clinical reasoning in structured cases, far ahead of attending physicians.
Again: a model more than a year old, which is ages in AI time.
This is one of the first studies to test an LLM against real, messy ER data rather than curated textbook cases. The performance gap was widest exactly where mistakes are most dangerous: early in the ER process, when doctors have incomplete information and are under time pressure.
And the model tested (o1) is already outdated by AI standards, meaning current models are likely even better.
The study covered only short ER encounters, not longer hospitalizations with days of accumulating data. It also didn't test the model on imaging (scans, X-rays), which is central to many real diagnoses. The next step is proving that these systems actually improve patient outcomes in practice, not just in controlled comparisons. But I bet the models will outperform human doctors on those cases too.