Evaluating Large Language Models in Generating Synthetic HCI Research Data: a Case Study

Hämäläinen, Perttu; Tavast, Mikke; Kunnari, Anton

doi:10.1145/3544548.3580688

Article · Apr 19, 2023 · Gold OA

Perttu Hämäläinen, Mikke Tavast, Anton Kunnari

Aalto University · University of Helsinki

Indexed in Crossref

Abstract

Collecting data is one of the bottlenecks of Human-Computer Interaction (HCI) research. Motivated by this, we explore the potential of large language models (LLMs) in generating synthetic user research data. We use OpenAI’s GPT-3 model to generate open-ended questionnaire responses about experiencing video games as art, a topic not tractable with traditional computational user models. We test whether synthetic responses can be distinguished from real responses, analyze errors of synthetic data, and investigate content similarities between synthetic and real data. We conclude that GPT-3 can, in this context, yield believable accounts of HCI experiences. Given the low cost and high speed of LLM data generation,…

Citation impact

Total citations: 232

FWCI: 127.68
Percentile: 100%
References: 50

Authors: 3

Topics & keywords

Keywords

Crowdsourcing
Computer science
Synthetic data
Context (archaeology)
Data science
Data modeling
Open data
Human–computer interaction
