TLDR: A new AI framework, Evolution of Kernels (EoK), automates the optimization of RISC-V processor kernels, which are crucial for emerging hardware platforms. Addressing the challenge of limited reference material for RISC-V, EoK learns optimization strategies from the historical development of established kernel libraries. It then guides large language models, augmented with RISC-V-specific context, to generate highly efficient kernels. EoK achieved a median 1.27x speedup, outperforming human experts and existing AI methods on 80 diverse kernel design tasks, demonstrating a significant leap in automated kernel optimization for nascent hardware ecosystems.
Computing hardware keeps evolving, and new platforms like RISC-V are emerging to democratize CPU design. RISC-V offers an open, free, and customizable instruction set architecture, allowing organizations to develop custom processors without the high costs associated with proprietary systems like ARM or x86. However, widespread adoption of RISC-V systems faces a significant hurdle: an immature software ecosystem, and in particular a lack of highly optimized kernels.
In this context, kernels are the low-level, performance-critical routines (for example, the linear algebra routines in OpenBLAS or an activation function such as Mish) that bridge hardware and higher-level software. The current gap in kernel optimization is a major impediment to RISC-V’s broader deployment. Developing efficient RISC-V kernels is a challenging, largely manual process that often relies on human experts working by trial and error. While this approach has yielded highly optimized kernels for stable architectures like x86, the rapid evolution of RISC-V extensions and hardware makes manual optimization slow and tedious, and it demands interdisciplinary expertise.
Large language models (LLMs) have shown great promise in automating kernel optimization in other domains, such as CUDA, where extensive technical documents and mature codebases are available. However, their effectiveness in reference-scarce domains like RISC-V has remained unproven until now.
Introducing Evolution of Kernels (EoK)
Researchers have introduced a novel framework called Evolution of Kernels (EoK), an LLM-based evolutionary program search system designed to automate kernel design for domains with limited reference material. EoK tackles the problem of reference scarcity by intelligently mining and formalizing reusable optimization ‘ideas’ from the development histories of established kernel libraries. These ‘ideas’ encompass general design principles combined with actionable thoughts.
EoK then uses these formalized ideas to guide parallel LLM explorations. To further enhance its capabilities, these ideas are enriched via Retrieval-Augmented Generation (RAG) with RISC-V-specific context, including ISA manuals and hardware profiles. This approach prioritizes historically effective techniques, ensuring that the LLMs focus on proven optimization strategies.
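To make the retrieval step concrete, here is a minimal sketch of how an optimization idea could be enriched with RISC-V context before it is sent to an LLM. It is not EoK's implementation: the `DOCS` snippets, the token-overlap scoring, and the `retrieve` and `build_prompt` helpers are all assumptions made for illustration, and a real system would index full ISA manuals and hardware profiles with a proper retriever.

```python
import re

# Hypothetical mini knowledge base standing in for ISA manuals and hardware profiles.
DOCS = [
    "The RISC-V V extension provides 32 vector registers named v0 to v31.",
    "The vsetvli instruction configures the vector length and element width.",
    "The Zfh extension adds half-precision (FP16) scalar floating-point instructions.",
    "Hypothetical hardware profile: 256-bit vector registers, 32 KiB L1 data cache.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, docs, k=2):
    """Return the k documents sharing the most tokens with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(query_tokens & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(idea, kernel_source, docs):
    """Assemble an LLM prompt from an idea, retrieved context, and the kernel."""
    context = "\n".join("- " + d for d in retrieve(idea, docs))
    return (
        "Optimization idea:\n" + idea + "\n\n"
        "RISC-V context:\n" + context + "\n\n"
        "Kernel to optimize:\n" + kernel_source
    )


prompt = build_prompt(
    "use FP16 half-precision instructions",
    "void mish(const float *x, float *y, int n);",
    DOCS,
)
```

The point of the design is that the idea itself stays architecture-neutral, while the retrieved snippets supply the RISC-V specifics the model would otherwise lack.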
How EoK Works
The core principle behind EoK is to learn from past kernel design experiences. It systematically extracts and formalizes general design principles from the development history of well-established kernel libraries, such as OpenBLAS. This process involves analyzing git commits to identify code changes and descriptive messages, then distilling these into actionable thoughts—each with a description, example code, and estimated effectiveness.
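The structure described above can be sketched as a small data model. This is an illustration under stated assumptions, not the paper's code: the `Thought` and `Idea` classes, the keyword table `PRINCIPLES`, the `score` field, and the commit dictionaries are hypothetical, and the keyword match stands in for whatever distillation EoK actually performs on commit messages and diffs.

```python
from dataclasses import dataclass, field


@dataclass
class Thought:
    description: str      # what the change does, taken from the commit message
    example_code: str     # the code change that illustrates it
    effectiveness: float  # hypothetical score in [0, 1]


@dataclass
class Idea:
    principle: str                      # general design principle
    thoughts: list = field(default_factory=list)


# Hypothetical keyword-to-principle table; a stand-in for real distillation.
PRINCIPLES = {
    "unroll": "Instruction-Level Parallelism",
    "vector": "Vectorization",
    "prefetch": "Memory Access Optimization",
}


def mine_ideas(commits):
    """Group commits into ideas keyed by the first matching principle."""
    ideas = {}
    for commit in commits:
        message = commit["message"].lower()
        for keyword, principle in PRINCIPLES.items():
            if keyword in message:
                thought = Thought(
                    description=commit["message"],
                    example_code=commit["diff"],
                    effectiveness=commit.get("score", 0.5),
                )
                ideas.setdefault(principle, Idea(principle)).thoughts.append(thought)
                break  # one principle per commit in this toy version
    return list(ideas.values())


# Made-up commit records for illustration only.
commits = [
    {"message": "Unroll inner loop of dot kernel", "diff": "(diff text)", "score": 0.8},
    {"message": "Fix typo in README", "diff": ""},
]
ideas = mine_ideas(commits)
```

Commits that match no principle, such as the README fix, are simply skipped, which mirrors the goal of keeping only reusable optimization knowledge.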
These learned ideas are then applied to guide kernel optimization within an evolutionary search framework. EoK creates an initial population of kernels based on these ideas, rather than relying on randomly generated ones. It uses RAG to retrieve relevant context as supplementary prompts for the LLMs, encouraging responses tailored for RISC-V kernel optimization. Multiple searches are carried out in parallel to speed up the overall process.
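The search loop can be illustrated with a toy model. Everything here is an assumption for the sake of a runnable sketch: candidates are sets of idea names rather than real kernels, the `IDEAS` gains are invented, `fitness` multiplies those gains instead of compiling and benchmarking on RISC-V hardware, and `mutate` picks a random idea where EoK would ask an LLM for a new kernel variant.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-idea speedup factors; real fitness would come from benchmarks.
IDEAS = {"unroll": 1.10, "vectorize": 1.30, "fuse_ops": 1.05, "fp16_path": 1.15}


def fitness(candidate):
    """Toy cost model: product of the gains of all applied ideas."""
    speedup = 1.0
    for idea in candidate:
        speedup *= IDEAS[idea]
    return speedup


def mutate(candidate, rng):
    """Stand-in for an LLM rewrite: apply one randomly chosen idea."""
    return candidate | {rng.choice(list(IDEAS))}


def search(seed, generations=5, pop_size=4):
    """One evolutionary search seeded from the most effective ideas."""
    rng = random.Random(seed)
    ranked = sorted(IDEAS, key=IDEAS.get, reverse=True)
    # Idea-guided initialization instead of random individuals.
    population = [frozenset([name]) for name in ranked[:pop_size]]
    for _ in range(generations):
        children = [mutate(parent, rng) for parent in population]
        pool = set(population + children)  # drop duplicates
        population = sorted(pool, key=fitness, reverse=True)[:pop_size]
    return population[0]


def parallel_search(num_searches=3):
    """Run independent searches concurrently and keep the best result."""
    with ThreadPoolExecutor(max_workers=num_searches) as pool:
        results = list(pool.map(search, range(num_searches)))
    return max(results, key=fitness)


best = parallel_search()
```

Because the best individuals always survive to the next generation, the result can never be worse than the best seeded candidate, which is the practical benefit of starting from historically effective ideas.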
Remarkable Performance
Empirically, EoK delivered a median speedup of 1.27x and surpassed human experts on all 80 evaluated kernel design tasks. It also improved on prior LLM-based automated kernel design methods by 20%. These findings suggest that mining human optimization experience is a viable strategy for emerging domains, and they point to the potential of LLM-based automated kernel optimization.
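For readers unfamiliar with the metric, a median speedup is typically computed per task as baseline time divided by optimized time, then taking the median across tasks. The sketch below shows that arithmetic with made-up timings; the numbers and the `median_speedup` helper are illustrative assumptions and are not data from the paper.

```python
from statistics import median


def median_speedup(baseline_times, optimized_times):
    """Median of per-task speedups, where speedup = baseline / optimized."""
    speedups = [b / o for b, o in zip(baseline_times, optimized_times)]
    return median(speedups)


# Made-up timings (milliseconds) for three hypothetical tasks.
baseline = [10.0, 8.0, 6.0]
optimized = [8.0, 8.0, 4.0]
result = median_speedup(baseline, optimized)  # per-task speedups: 1.25, 1.0, 1.5
```

Using the median rather than the mean keeps a few unusually large or small speedups from dominating the summary figure.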
For instance, in a case study on the Mish activation kernel, EoK applied five optimization techniques: ISA Extension Optimization (leveraging custom instructions for exponential approximation); Instruction-Level Parallelism (loop unrolling and fused operations); Arithmetic Optimization (a dedicated FP16 arithmetic path); combining multiple functions to reduce overhead; and Precision-Specific Vectorization (optimizing vector register usage for FP16 data types).
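As background for the case study, Mish is defined as x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x). The sketch below gives a scalar reference version and an algebraically equivalent form that needs only one exponential, which illustrates the kind of arithmetic simplification a kernel optimizer can exploit. It is not the kernel EoK generated; the function names are made up, and the single-exponential form is shown only for moderate inputs where e^x does not overflow.

```python
import math


def mish_reference(x):
    """Mish(x) = x * tanh(ln(1 + e^x)), written directly from the definition."""
    return x * math.tanh(math.log1p(math.exp(x)))


def mish_single_exp(x):
    """Equivalent form using one exp call.

    With n = e^x, tanh(ln(1 + n)) = (n^2 + 2n) / (n^2 + 2n + 2).
    """
    n = math.exp(x)
    t = n * (n + 2.0)
    return x * t / (t + 2.0)


# Compare the two forms on a few moderate inputs.
samples = [-5.0, -1.0, 0.0, 0.5, 3.0]
pairs = [(mish_reference(x), mish_single_exp(x)) for x in samples]
```

The second form trades a logarithm and a hyperbolic tangent for a few multiplications and one division, which is why the exponential approximation becomes the main cost worth accelerating.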
The success of EoK points to a scalable pathway for developing high-performance RISC-V kernels, helping bridge the gap between hardware innovation and deployable software. While EoK significantly accelerates kernel development, the researchers emphasize that human oversight remains indispensable, especially for safety-critical systems. More details are available in the full research paper.


