The Deep Research problem

title: The Deep Research problem

author: Benedict Evans

content_type: article

publication: Essays - Benedict Evans

published: 2025-02-18T14:51:22

source_url: https://www.ben-evans.com/benedictevans/2025/2/17/the-deep-research-problem

word_count: 1666

Most of what I do for a living is research and analysis. I think of data I’d like to see and go looking for it; I compile and collate it, make charts, decide they’re boring and try again, find new ways and new data to understand and explain the issue, and produce text and charts that try to express what I’m thinking. Then I go and talk to people about it. This often involves a huge amount of manual labour - there’s an iceberg beneath each chart - and OpenAI’s Deep Research looks like it should be tailor-made for me. So, does it fit? I could test it myself with a new problem, but before I burn time and credits, as luck would have it OpenAI’s own product page has a sample report on something I know quite a lot about - smartphones. Let’s have a look.

This table looks great - hours of work compiling this data all done for me by a machine. Before we give it to a client, though, let’s just check a few things. First, what’s the source? Ah. We have two sources: Statista and Statcounter. Statcounter is a problematic measure of ‘adoption’ - it’s a measure of traffic, and as we all know, different devices are used differently, higher-end devices are used more, and the iPhone skews to the high-end and also skews to more use. You can’t really use that for this, as I’d explain to an intern (I often compare AI to interns). Statista, meanwhile, aggregates other people’s data, makes sure it ranks highly in SEO, and then tries to get you to register or pay to see the result. I think Google should ban this company from the index, but even if you disagree, saying this is the source is like saying the source is ‘a Google search result’. Again, this is an intern-level issue. Setting that aside, though, let’s dig some more, and look at one number - Japan. Deep Research says that the Japanese smartphone market is split 69% iOS and 31% Android. That prompts two questions: is that what those sources say, and are they right? These are very different kinds of question. First, Statcounter, despite over-weighting iPhones as noted above, doesn’t actually say 69%, or at any rate hasn’t in over a year. Hmm.

If we check Statista, we have to jump through a bunch of hoops, but eventually find that the actual source is the research firm Kantar Worldpanel, and the numbers it gives are pretty much the exact opposite of what Deep Research claims - 63% Android and 36% iOS. Oh.
