TLDR: A new research paper introduces ‘Chain-of-Descriptions’ (CoDes), a novel framework designed to improve Large Language Models (LLMs) for VHDL code generation and summarization. Existing LLMs struggle with VHDL due to limited training data and difficulty understanding functional equivalence. CoDes addresses this by guiding LLMs through intermediate descriptive steps, leading to significant performance gains. The study highlights that detailed prompts and a multi-step execution strategy are crucial for enhancing LLM capabilities in hardware description language tasks.
Large Language Models (LLMs) have transformed many areas, from natural language processing to image understanding. In the world of Electronic Design Automation (EDA), where complex digital circuits are designed, LLMs hold great promise for tasks like generating and summarizing Register-Transfer Level (RTL) code, especially in languages like VHDL.
However, despite their widespread use in general coding tasks, there’s been a noticeable lack of research and development focused on adapting and evaluating these powerful models specifically for hardware description languages (HDLs) such as VHDL. This gap means that existing code LLMs often underperform when it comes to VHDL, struggling with its unique syntax and the concept of functional equivalence, which is crucial for hardware design.
To address this challenge, researchers at IBM have introduced a novel approach called Chain-of-Descriptions (CoDes). This framework aims to significantly improve how LLMs handle VHDL code generation and summarization. The core idea behind CoDes is to guide the LLM through a series of intermediate descriptive steps. Instead of directly generating code or a summary, the model first creates a detailed plan or explanation, which then informs its final output.
For VHDL code generation, CoDes prompts the LLM to break a problem statement down into a natural-language plan outlining the steps needed to produce the VHDL. For summarization, it encourages a step-by-step explanation of the VHDL code, based either on a line-by-line analysis or on the code's hierarchical structure (its Abstract Syntax Tree). These intermediate steps are then integrated with the original input, giving the LLM a clearer roadmap for producing more accurate and coherent results.
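The plan-then-generate flow described above can be sketched in a few lines. This is a minimal illustration of the idea, not the paper's actual implementation: `llm` is a hypothetical stand-in for any text-completion API, and the prompt wording is an assumption.

```python
from typing import Callable

def codes_generate(problem: str, llm: Callable[[str], str]) -> str:
    """Two-stage generation in the spirit of Chain-of-Descriptions:
    first elicit a descriptive plan, then condition code generation on it."""
    # Step 1: ask for an intermediate natural-language plan, not code.
    plan_prompt = (
        "Describe, step by step, how to implement the following design "
        f"in VHDL. Do not write code yet.\n\n{problem}"
    )
    plan = llm(plan_prompt)

    # Step 2: integrate the plan with the original problem statement
    # and ask for the final VHDL output.
    code_prompt = (
        f"Problem:\n{problem}\n\nPlan:\n{plan}\n\n"
        "Following the plan above, write the complete VHDL entity "
        "and architecture."
    )
    return llm(code_prompt)
```

The same shape works for summarization by swapping the prompts: first request a line-by-line explanation, then ask for a summary conditioned on that explanation.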
The researchers evaluated CoDes using two datasets: VHDL-Eval, which includes converted Verilog problems and public VHDL problems, and VHDL-Xform, an in-house dataset designed to test LLMs’ understanding of functionally equivalent code. Their findings were compelling. The CoDes framework led to significant performance improvements across various LLMs, particularly for models like Granite-Code-34b, which showed a substantial boost in its ability to generate correct VHDL code.
Key Insights from the Study:
The study revealed several important factors contributing to improved performance:
- Longer Descriptions Help: Providing more detailed problem descriptions to the LLM significantly enhanced its ability to formulate better plans and generate more accurate VHDL code.
- Multi-Step Execution is Superior: The research compared two execution strategies: Single-Step (where planning and execution happen in one prompt) and Multi-Step (where planning, refinement, and execution are separate). Multi-step execution consistently outperformed the single-step approach for both code generation and summarization, allowing the model to process and refine intermediate outputs more effectively.
- Functional Equivalence: The VHDL-Xform dataset specifically helped in gauging LLMs' understanding of functionally equivalent code, a critical aspect often overlooked in general code LLM evaluations.
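The single-step versus multi-step distinction is easiest to see side by side. The sketch below is an assumed illustration of the two strategies, not code from the study; `llm` again stands in for any completion API, and the refinement prompt is a placeholder.

```python
from typing import Callable

def single_step(problem: str, llm: Callable[[str], str]) -> str:
    # One prompt: the model must plan and write code in a single completion,
    # with no chance to inspect or refine the intermediate plan.
    return llm(f"First outline a plan, then write the VHDL for:\n{problem}")

def multi_step(problem: str, llm: Callable[[str], str]) -> str:
    # Separate calls: plan, refine the plan, then execute it. Each
    # intermediate output can be checked or edited before the next call.
    plan = llm(f"Outline a VHDL implementation plan for:\n{problem}")
    refined = llm(f"Refine this plan, fixing any gaps:\n{plan}")
    return llm(f"Problem:\n{problem}\nPlan:\n{refined}\nWrite the VHDL.")
```

The trade-off is cost: multi-step issues three model calls where single-step issues one, which is consistent with the authors' note about optimizing processing time.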
While the CoDes framework shows great promise, the researchers acknowledge limitations, such as the reliance on the quality of LLM-generated intermediate steps and the simplified nature of current datasets compared to real-world design complexities. Future work will focus on applying CoDes to more complex RTL designs and optimizing processing time.
In conclusion, the Chain-of-Descriptions framework offers a structured and effective methodology for enhancing LLMs in the EDA domain, specifically for VHDL. By guiding models through a series of descriptive steps, this approach helps bridge the performance gap observed in handling hardware description languages, paving the way for more intelligent and automated electronic design tools. You can read the full research paper here.


