ArXiv

When Can Digital Personas Reliably Approximate Human Survey Findings?

2026-05-11 · 14:41 UTC

Authors: Mumin Jia, Yilin Chen, Divya Sharma...
Categories: cs.CL, cs.AI, cs.SI, stat.ML
arXiv: https://arxiv.org/abs/2605.10659v1
PDF: https://arxiv.org/pdf/2605.10659v1

Brief

This study evaluates LLM-based digital personas as substitutes for human survey respondents using the LISS panel. Personas were built from respondents' background variables and pre-2023 survey histories, then tested on the same respondents' held-out post-cutoff answers. Across four persona architectures, three LLMs, and two prediction tasks, personas improved distributional alignment (notably in domains tied to stable attributes and values) but remained limited for individual prediction and failed to recover multivariate respondent structure; retrieval-augmented architectures gave the clearest gains. Summary is based on the abstract (full paper not reviewed).

Why it matters

Digital personas are increasingly proposed as substitutes for human survey respondents, yet it has been unclear when they can reliably approximate human survey findings. By testing personas against the same LISS respondents' held-out post-cutoff answers, the authors offer practical guidance on when digital personas could be appropriate for survey research and when human validation remains necessary.

Key details

Personas improved alignment with human response distributions, especially in domains tied to stable attributes and values, and retrieval-augmented architectures produced the clearest gains. However, personas remained limited for individual-level prediction and failed to recover multivariate respondent structure. Performance depended more on human response structure than on model choice: personas did best on low-variability questions and common respondent patterns, and worst on subjective, heterogeneous, or rare responses.
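The gap between distributional alignment and individual-level prediction can be illustrated with a minimal Python sketch. The toy answers, the function names, and the choice of metrics (total variation distance for distributions, exact-match accuracy for individuals) are assumptions made here for illustration; the abstract does not state which metrics the paper uses.

```python
from collections import Counter


def response_distribution(answers, options):
    """Share of respondents choosing each option."""
    counts = Counter(answers)
    n = len(answers)
    return {opt: counts.get(opt, 0) / n for opt in options}


def total_variation_distance(p, q):
    """Half the L1 distance between two distributions over the same options."""
    return 0.5 * sum(abs(p[k] - q[k]) for k in p)


def individual_accuracy(human, persona):
    """Fraction of respondents whose persona gives the same answer they did."""
    matches = sum(h == s for h, s in zip(human, persona))
    return matches / len(human)


# Hypothetical toy data: six respondents answering one 3-option question.
# Position i in both lists refers to the same respondent.
options = ["agree", "neutral", "disagree"]
human = ["agree", "agree", "neutral", "disagree", "agree", "neutral"]
persona = ["neutral", "agree", "agree", "agree", "disagree", "neutral"]

tvd = total_variation_distance(
    response_distribution(human, options),
    response_distribution(persona, options),
)
acc = individual_accuracy(human, persona)

print(f"total variation distance: {tvd:.2f}")  # identical marginals
print(f"individual accuracy: {acc:.2f}")       # only 2 of 6 respondents matched
```

In this toy case the persona answers reproduce the human response distribution exactly while matching only two of six respondents, which is the kind of pattern the summary describes: aggregate alignment does not imply accurate prediction for any given individual.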

Source evidence

Abstract

Digital personas powered by Large Language Models (LLMs) are increasingly proposed as substitutes for human survey respondents, yet it remains unclear when they can reliably approximate human survey findings. We answer this question using the LISS panel, constructing personas from respondents' background variables and pre-2023 survey histories, then testing them against the same respondents' held-out post-cutoff answers. Across four persona architectures, three LLMs, and two prediction tasks, we assess performance at the question, respondent, distributional, equity, and clustering levels. Digital personas improve alignment with human response distributions, especially in domains tied to stable attributes and values, but remain limited for individual prediction and fail to recover multivariate respondent structure. Retrieval-augmented architectures provide the clearest gains, but performance depends more on human response structure than on model choice: personas perform best for low-variability questions and common respondent patterns, and worst for subjective, heterogeneous, or rare responses. Our results provide practical guidance on when digital personas could be appropriate for survey research and when human validation remains necessary.