Is ChatGPT a Good NLG Evaluator? A Preliminary Study
Soochow University · Beijing Jiaotong University · +3 more institutions
Abstract
The recent emergence of ChatGPT has attracted wide attention from the computational linguistics community. Many prior studies have shown that ChatGPT achieves remarkable performance on various NLP tasks as measured by automatic evaluation metrics. However, the ability of ChatGPT to serve as an evaluation metric itself remains underexplored. Because assessing the quality of natural language generation (NLG) models is an arduous task, and NLG metrics notoriously correlate poorly with human judgments, we ask whether ChatGPT is a good NLG evaluation metric. In this report, we provide a preliminary meta-evaluation of ChatGPT to show its reliability as an NLG metric. In detail, we regard ChatGPT as a…
Citation impact
- FWCI: 35.87
- Percentile: 100%
- References: 42
Authors
- 9
Topics & keywords
- Automatic summarization
- Natural language generation
- Metric (unit)
- Computer science
- Relevance (law)
- Task (project management)
- Artificial intelligence
- Natural language processing
- Quality Education