UniXcoder: Unified Cross-Modal Pre-training for Code Representation
Sun Yat-sen University · Microsoft Research Asia (China)
Abstract
Pre-trained models for programming languages have recently demonstrated great success on code intelligence. To support both code-related understanding and generation tasks, recent works attempt to pre-train unified encoder-decoder models. However, such an encoder-decoder framework is sub-optimal for auto-regressive tasks, especially code completion, which requires a decoder-only manner for efficient inference. In this paper, we present UniXcoder, a unified cross-modal pre-trained model for programming languages. The model utilizes mask attention matrices with prefix adapters to control the behavior of the model and leverages cross-modal contents like AST and code comments to enhance code representation. To encode AST…
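The idea of steering one Transformer with different attention masks can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_attention_mask`, the mode strings, and the source/target split are assumptions made for illustration, and the prefix adapters mentioned in the abstract are not modeled. The sketch assumes three behaviors: fully bidirectional attention for an encoder-only mode, causal (lower-triangular) attention for a decoder-only mode, and a mixed pattern for an encoder-decoder mode.

```python
def build_attention_mask(mode, src_len, tgt_len=0):
    """Return an n x n 0/1 matrix; entry [i][j] == 1 means token i may attend to token j.

    Hypothetical helper: names and mode strings are illustrative assumptions.
    """
    n = src_len + tgt_len
    if mode == "encoder":
        # Bidirectional: every token sees every other token.
        return [[1] * n for _ in range(n)]
    if mode == "decoder":
        # Causal: token i sees only positions 0..i.
        return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]
    if mode == "encoder-decoder":
        mask = []
        for i in range(n):
            if i < src_len:
                # Source tokens attend bidirectionally, but only within the source.
                row = [1 if j < src_len else 0 for j in range(n)]
            else:
                # Target tokens see the whole source plus earlier target tokens.
                row = [1 if j <= i else 0 for j in range(n)]
            mask.append(row)
        return mask
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    for row in build_attention_mask("encoder-decoder", src_len=2, tgt_len=2):
        print(row)
```

In a real model the 0 entries would typically be turned into large negative values added to the attention scores before the softmax, so that masked positions receive near-zero weight.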
Citation impact
- FWCI: 73.10
- Percentile: 100%
- References: 28
Authors: 6
Topics & keywords
- Computer science
- Code generation
- Code (set theory)
- Representation (politics)
- Encoder
- ENCODE
- Programming language
- Encoding (memory)
- Quality Education