Large Language Models Can Self-Improve

Huang, Jiaxin; Gu, Shixiang; Hou, Le; Wu, Yuexin; Wang, Xuezhi; Yu, Hongkun; Han, Jiawei

doi:10.18653/v1/2023.emnlp-main.67

Article · Jan 1, 2023 · Gold OA

University of Illinois Urbana-Champaign · Google (United States)

Indexed in Crossref

Abstract

Large Language Models (LLMs) have achieved excellent performance on various tasks. However, fine-tuning an LLM requires extensive supervision. Humans, on the other hand, may improve their reasoning abilities by self-thinking without external inputs. In this work, we demonstrate that an LLM is also capable of self-improving with only unlabeled datasets. We use a pre-trained LLM to generate “high-confidence” rationale-augmented answers for unlabeled questions using Chain-of-Thought (CoT) prompting and self-consistency, and fine-tune the LLM using those self-generated solutions as target outputs. We show that without any ground truth label, our approach improves the general reasoning ability of a 540B-parameter…
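
The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes several (rationale, answer) pairs have already been sampled per unlabeled question with CoT prompting, treats the share of samples agreeing with the majority answer as the confidence, and keeps only the rationales that reach that answer. The function names, the 0.6 threshold, and the "The answer is ..." target format are hypothetical choices made for this example.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# One sampled reasoning path: (rationale text, final answer string).
Sample = Tuple[str, str]


def select_high_confidence(
    samples: List[Sample], threshold: float = 0.6
) -> Tuple[Optional[str], float, List[str]]:
    """Majority-vote over sampled answers (self-consistency).

    Returns (majority answer or None, agreement ratio, kept rationales).
    The threshold is an assumed value, not one stated in the abstract.
    """
    if not samples:
        return None, 0.0, []
    counts = Counter(answer for _, answer in samples)
    top_answer, top_count = counts.most_common(1)[0]
    confidence = top_count / len(samples)
    if confidence < threshold:
        # Low agreement: discard the question entirely.
        return None, confidence, []
    kept = [rationale for rationale, answer in samples if answer == top_answer]
    return top_answer, confidence, kept


def build_finetuning_set(
    sampled: Dict[str, List[Sample]], threshold: float = 0.6
) -> List[Dict[str, str]]:
    """Turn high-confidence self-generated solutions into training pairs."""
    examples = []
    for question, samples in sampled.items():
        answer, _, rationales = select_high_confidence(samples, threshold)
        if answer is None:
            continue
        for rationale in rationales:
            # Hypothetical target format: rationale followed by the answer.
            examples.append(
                {"input": question, "target": f"{rationale} The answer is {answer}."}
            )
    return examples
```

The resulting input/target pairs would then serve as fine-tuning data; no ground-truth label is consulted at any point, since the only filter is agreement among the model's own samples.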

Citation impact

Total citations: 186

FWCI: 30.77
Percentile: 100%
References: 60

Authors: 7

Topics & keywords

Keywords

Computer science
Consistency (knowledge bases)
Ground truth
Language model
Artificial intelligence
Resource (disambiguation)
Work (physics)
Machine learning
