VeriGen: A Large Language Model for Verilog Code Generation

Thakur, Shailja; Ahmad, Baleegh; Pearce, Hammond; Tan, Benjamin; Dolan-Gavitt, Brendan; Karri, Ramesh; Garg, Siddharth

doi:10.1145/3643681

Article · ACM Transactions on Design Automation of Electronic Systems · Feb 9, 2024 · Closed access

New York University · UNSW Sydney · +1 more institution

Indexed in Crossref

Abstract

In this study, we explore the capability of Large Language Models (LLMs) to automate hardware design by completing partial code in Verilog, a common language for designing and modeling digital systems. We fine-tune pre-existing LLMs on Verilog datasets compiled from GitHub and Verilog textbooks. We evaluate the functional correctness of the generated Verilog code using a specially designed test suite, featuring a custom problem set and test benches. Here, our fine-tuned open-source CodeGen-16B model outperforms the commercial state-of-the-art GPT-3.5-turbo model with a 1.1% overall increase. Upon testing with a more diverse and complex problem set, we find that the fine-tuned model shows…

Citation impact

Total citations: 165
FWCI: 51.16
Percentile: 100%
References: 18
Authors: 7

Topics & keywords

Keywords

Computer science
Verilog
Correctness
Programming language
Set (abstract data type)
Scripting language
Code (set theory)
Hardware description language