Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models

Vaithilingam, Priyan; Zhang, Tianyi; Glassman, Elena L.

doi:10.1145/3491101.3519665

Article · CHI Conference on Human Factors in Computing Systems Extended Abstracts · Apr 27, 2022 · Closed access

Harvard University · Purdue University West Lafayette · +1 more institution

Indexed in Crossref

Abstract

Recent advances in Large Language Models (LLMs) have made automatic code generation possible for real-world programming tasks in general-purpose programming languages such as Python. However, there are few human studies on the usability of these tools and how they fit the programming workflow. In this work, we conducted a within-subjects user study with 24 participants to understand how programmers use and perceive Copilot, an LLM-based code generation tool. We found that, while Copilot did not necessarily improve the task completion time or success rate, most participants preferred to use Copilot in daily programming tasks, since Copilot often provided a useful starting point and saved the effort of searching…

Citation impact

Total citations: 537

FWCI: 73.28
Percentile: 100%
References: 30

Authors: 3

Topics & keywords

Keywords

Computer science
Usability
Debugging
Workflow
Task (project management)
Python (programming language)
Software engineering
Human–computer interaction

Funding

National Science Foundation
Award: 2107391