Article | Jan 1, 2023 | Gold OA

CodeT5+: Open Code Large Language Models for Code Understanding and Generation

Indexed in Crossref

Abstract

Large language models (LLMs) pretrained on vast amounts of source code have achieved prominent progress in code intelligence. However, existing code LLMs have two main limitations. First, they often adopt a specific architecture (encoder-only or decoder-only) or rely on a unified encoder-decoder network for different downstream tasks, lacking the flexibility to operate in the optimal architecture for a specific task. Second, they often employ a limited set of pretraining objectives that might not be relevant to some tasks and hence result in substantial performance degradation. To address these limitations, we propose “CodeT5+”, a family of encoder-decoder LLMs for code in which component modules can be flexibly combined…

Citation impact

  • Total citations: 343
  • FWCI: 56.83
  • Percentile: 100%
  • References: 0

Authors

6

Topics & keywords

Keywords
  • Computer science
  • Code (set theory)
  • Flexibility (engineering)
  • Encoder
  • Language model
  • Code generation
  • Task (project management)
  • Set (abstract data type)
UN Sustainable Development Goals
  • Quality Education