Article · Jan 1, 2023 · Gold OA

Is ChatGPT a Good NLG Evaluator? A Preliminary Study

Soochow University · Beijing Jiaotong University · +3 more institutions

Indexed in Crossref

Abstract

Recently, the emergence of ChatGPT has attracted wide attention from the computational linguistics community. Many prior studies have shown that ChatGPT achieves remarkable performance on various NLP tasks in terms of automatic evaluation metrics. However, the ability of ChatGPT to serve as an evaluation metric itself is still underexplored. Because assessing the quality of natural language generation (NLG) models is an arduous task and NLG metrics notoriously correlate poorly with human judgments, we ask whether ChatGPT is a good NLG evaluation metric. In this report, we provide a preliminary meta-evaluation of ChatGPT to show its reliability as an NLG metric. In detail, we regard ChatGPT as a…
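
The meta-evaluation idea mentioned in the abstract, checking how closely a metric's scores track human judgments, is commonly expressed as a rank correlation. The following is a minimal sketch and not the paper's own code: the use of Spearman correlation, the function names, and the toy scores are all assumptions made purely for illustration.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(rank(x), rank(y))


# Hypothetical toy data: human ratings and metric scores for five outputs.
human_scores = [1, 2, 3, 4, 5]
metric_scores = [10, 20, 30, 50, 40]

print(round(spearman(human_scores, metric_scores), 3))
```

A value near 1 would indicate that the metric orders outputs much as human annotators do, while a value near 0 would indicate little agreement.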

Citation impact

  • Total citations: 217
  • FWCI: 35.87
  • Percentile: 100%
  • References: 42

Authors

9

Topics & keywords

Keywords
  • Automatic summarization
  • Natural language generation
  • Metric (unit)
  • Computer science
  • Relevance (law)
  • Task (project management)
  • Artificial intelligence
  • Natural language processing
UN Sustainable Development Goals
  • Quality Education