Article · Jan 1, 2023 · Gold OA

Large Language Models Can Self-Improve

University of Illinois Urbana-Champaign · Google (United States)

Indexed in Crossref

Abstract

Large Language Models (LLMs) have achieved excellent performance in various tasks. However, fine-tuning an LLM requires extensive supervision. Humans, on the other hand, may improve their reasoning abilities by self-thinking without external inputs. In this work, we demonstrate that an LLM is also capable of self-improving with only unlabeled datasets. We use a pre-trained LLM to generate “high-confidence” rationale-augmented answers for unlabeled questions using Chain-of-Thought (CoT) prompting and self-consistency, and fine-tune the LLM using those self-generated solutions as target outputs. We show that without any ground truth label, our approach improves the general reasoning ability of a 540B-parameter…
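
The selection step described in the abstract (sample several chain-of-thought answers per unlabeled question, take the majority answer, keep only confident cases) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Sample` type, the `build_self_training_set` function, the `min_confidence` cutoff, and the "The answer is ..." target format are all hypothetical, and the model sampling and fine-tuning steps are assumed to happen elsewhere.

```python
from collections import Counter
from typing import NamedTuple


class Sample(NamedTuple):
    rationale: str  # chain-of-thought text sampled from the model
    answer: str     # final answer parsed from that sample


def build_self_training_set(sampled, min_confidence=0.6):
    """Keep rationales that agree with a confident majority answer.

    sampled: dict mapping each unlabeled question to a list of Sample
    min_confidence: assumed cutoff on the majority answer's vote share
    """
    examples = []
    for question, samples in sampled.items():
        if not samples:
            continue
        # Self-consistency: the most frequent final answer wins the vote.
        votes = Counter(s.answer for s in samples)
        majority_answer, count = votes.most_common(1)[0]
        confidence = count / len(samples)
        if confidence < min_confidence:
            continue  # low agreement, so the question is skipped
        # Every rationale that reached the majority answer becomes a target.
        for s in samples:
            if s.answer == majority_answer:
                examples.append({
                    "input": question,
                    "target": f"{s.rationale} The answer is {s.answer}.",
                })
    return examples
```

The returned list would then serve as fine-tuning data; no ground truth label is consulted at any point, only agreement among the model's own samples.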

Citation impact

Total citations: 186
FWCI: 30.77
Percentile: 100%
References: 60

Authors

7

Topics & keywords

Keywords
  • Computer science
  • Consistency (knowledge bases)
  • Ground truth
  • Language model
  • Artificial intelligence
  • Resource (disambiguation)
  • Work (physics)
  • Machine learning