LeapFive News

RISC-V + DSA: Yuan Bohu of Yuefang Technology Unveils the Technical Path for Reshaping the Chip and Computing Landscape


 

From July 16 to 18, the 2025 RISC-V China Summit was held in Shanghai. As an innovator in RISC-V-based domain-specific computing, Yuefang Technology (LeapFive) was invited to the summit's High-Performance Computing sub-forum. Yuan Bohu, the company's Chief Operating Officer (COO), delivered a keynote speech titled “RISC-V + DSA: An Inevitable Choice for Reshaping the Chip-and-Compute Landscape,” in which he examined the technical path of combining the RISC-V architecture with domain-specific architectures (DSA) and the prospects of this combination in high-performance computing.

 

RISC-V + DSA: A New Paradigm for Unleashing Hardware Potential

In his speech, Yuan Bohu pointed out that as demand for AI and high-performance computing grows explosively, traditional chip architectures face challenges such as a reliance on simply stacking more computing power and rising energy consumption. The openness and flexibility of the RISC-V architecture, combined with the specialized design of DSA, can unleash hardware performance more efficiently.
 

He further explained that DSA significantly improves computational energy efficiency by tailoring the instruction set and hardware resource allocation to specific computing tasks. For example, in AI inference scenarios, customized instructions and cache mechanisms can reduce data movement overhead and improve computational efficiency severalfold. This “software-hardware co-design” approach lets chips deliver better AI performance without relying on ever more raw computing power.
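
The data movement point can be illustrated with a toy model. The sketch below is not from the speech: it assumes a hypothetical `CountingMemory` class standing in for off-chip memory and compares a SoftMax that streams the vector through memory on every step with one that loads it once into a local buffer, which stands in for on-chip storage.

```python
import math


class CountingMemory:
    """Toy stand-in for off-chip memory that counts element reads and writes."""

    def __init__(self, data):
        self.data = list(data)
        self.reads = 0
        self.writes = 0

    def load(self, i):
        self.reads += 1
        return self.data[i]

    def store(self, i, value):
        self.writes += 1
        self.data[i] = value


def softmax_multipass(mem, n):
    """Every step re-reads the vector from memory (4n reads, 2n writes)."""
    m = max(mem.load(i) for i in range(n))        # pass 1: find the maximum
    for i in range(n):                            # pass 2: exponentiate in place
        mem.store(i, math.exp(mem.load(i) - m))
    total = sum(mem.load(i) for i in range(n))    # pass 3: sum the exponentials
    for i in range(n):                            # pass 4: normalize in place
        mem.store(i, mem.load(i) / total)


def softmax_fused(mem, n):
    """Load once into a local buffer, compute there, store once (n reads, n writes)."""
    local = [mem.load(i) for i in range(n)]
    m = max(local)
    exps = [math.exp(x - m) for x in local]
    total = sum(exps)
    for i in range(n):
        mem.store(i, exps[i] / total)


values = [1.0, 2.0, 3.0, 4.0]
multipass = CountingMemory(values)
fused = CountingMemory(values)
softmax_multipass(multipass, len(values))
softmax_fused(fused, len(values))
print("multi-pass:", multipass.reads, "reads,", multipass.writes, "writes")
print("fused:     ", fused.reads, "reads,", fused.writes, "writes")
```

Both versions produce the same result; only the memory traffic differs. Real hardware has more levels and wider transfers, so the ratio here is illustrative rather than a performance prediction.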

 

As process scaling slows, the value of DSA becomes even more prominent. Yuan Bohu emphasized: “In the future, we need to enhance the energy efficiency ratio by refining DSA’s specialized domain architecture, while also deconstructing different inspection lines from a process perspective to achieve integrated resource utilization tailored to specific application scenarios.” Through software-hardware co-design, this architecture-level optimization can unlock more computing potential at current process nodes.

 

Technical Implementation Path for the DSA Architecture

Yuan Bohu described in detail the complete technology roadmap for the DSA architecture, from algorithms to chips. The roadmap has three key levels: instruction set extension design, chiplet design, and chip design.

 

At the instruction-set level, the core of optimization lies in deeply customizing the instruction set for specific computational patterns. Taking the Energon algorithm as an example, extending the RISC-V instruction set to provide hardware-level acceleration for operations such as matrix multiplication and SoftMax can, according to the speech, improve energy efficiency by up to 10,000 times and computational speed by 1,000 times. This optimization not only enhances computational efficiency but also makes fuller use of the hardware.
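
To make "extending the RISC-V instruction set" concrete, here is a minimal sketch of how a custom instruction could be encoded. It uses the standard R-type field layout and the custom-0 major opcode that the RISC-V opcode map sets aside for non-standard extensions. The funct7 selectors and the idea of one instruction per operation are hypothetical and chosen only for illustration; the speech did not give actual encodings.

```python
# Major opcode "custom-0" from the RISC-V base opcode map,
# set aside for non-standard (vendor) extensions.
CUSTOM_0 = 0b0001011

# Hypothetical funct7 selectors, for illustration only (not from the speech).
FUNCT7_MATMUL = 0b0000000
FUNCT7_SOFTMAX = 0b0000001

# (name, bit offset, width) for the standard 32-bit R-type layout.
_FIELDS = (
    ("opcode", 0, 7),
    ("rd", 7, 5),
    ("funct3", 12, 3),
    ("rs1", 15, 5),
    ("rs2", 20, 5),
    ("funct7", 25, 7),
)


def encode_r_type(**fields):
    """Pack the six R-type fields into one 32-bit instruction word."""
    word = 0
    for name, offset, width in _FIELDS:
        value = fields[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << offset
    return word


def decode_r_type(word):
    """Split a 32-bit instruction word back into its R-type fields."""
    return {
        name: (word >> offset) & ((1 << width) - 1)
        for name, offset, width in _FIELDS
    }


# A hypothetical "softmax x10, x11, x12" in the custom-0 opcode space.
softmax_word = encode_r_type(
    opcode=CUSTOM_0, rd=10, funct3=0, rs1=11, rs2=12, funct7=FUNCT7_SOFTMAX
)
print(f"{softmax_word:#010x}", decode_r_type(softmax_word))
```

A decoder in the core or in an emulator would dispatch on the opcode and funct7 fields to reach the accelerator, while standard instructions keep their existing encodings.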
 

At the chiplet design level, the focus is on enhancing the execution efficiency within the chip itself. Yuan Bohu pointed out that by optimizing single-issue/multi-issue mechanisms, improving cache access strategies, and reducing data movement overhead, chips can achieve significant performance gains under the same process conditions. Moreover, the introduction of a software-shared memory mechanism further boosts multi-core collaboration efficiency, providing a more efficient execution environment for complex computing tasks.
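
One common way to improve cache access and cut data movement is tiling (blocking). The sketch below is an illustration under simplifying assumptions, not the design described in the speech: it counts only operand loads of A and B from main memory, ignores accesses to C, and treats each tile as fitting entirely in a local buffer.

```python
def matmul_naive(A, B, n):
    """Textbook triple loop; every multiply loads both operands from memory."""
    loads = 0
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for k in range(n):
                acc += A[i][k] * B[k][j]
                loads += 2  # one element of A and one of B
            C[i][j] = acc
    return C, loads


def matmul_tiled(A, B, n, t):
    """Blocked version: each t x t tile is loaded once, then reused locally."""
    if n % t != 0:
        raise ValueError("tile size must divide the matrix size")
    loads = 0
    C = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, t):
        for j0 in range(0, n, t):
            for k0 in range(0, n, t):
                # Copy one tile of A and one of B into local buffers.
                a = [[A[i0 + i][k0 + k] for k in range(t)] for i in range(t)]
                b = [[B[k0 + k][j0 + j] for j in range(t)] for k in range(t)]
                loads += 2 * t * t
                # All arithmetic below touches only the local buffers.
                for i in range(t):
                    for j in range(t):
                        acc = C[i0 + i][j0 + j]
                        for k in range(t):
                            acc += a[i][k] * b[k][j]
                        C[i0 + i][j0 + j] = acc
    return C, loads


n = 8
A = [[float(i + j) for j in range(n)] for i in range(n)]
B = [[float(i * j % 5) for j in range(n)] for i in range(n)]
C_naive, naive_loads = matmul_naive(A, B, n)
C_tiled, tiled_loads = matmul_tiled(A, B, n, 4)
print("naive loads:", naive_loads, "tiled loads:", tiled_loads)
```

In this model the naive loop performs 2n³ operand loads and the tiled loop 2n³/t, so larger tiles that still fit in local storage reduce traffic proportionally.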
 

In chip design, the DSA architecture emphasizes the development of specialized acceleration engines tailored to different computational workloads. Yuan Bohu specifically pointed out, “We need to build computing engines that are specifically designed for the objects corresponding to a particular type of computation.” Starting from the scalability of those computational objects, this concept drives the creation of scalable combinations of computing units and efficient memory subsystems, so that the chip can be optimized aggressively for specific workloads.

 

Key Challenges and Solutions for Industrialization

Although the DSA architecture demonstrates significant technical advantages, its industrialization still faces numerous challenges. Yuan Bohu analyzed these issues from three perspectives—cross-vendor collaboration, infrastructure development, and software ecosystem—and proposed corresponding solutions.

 

Cross-vendor collaboration standards are the first issue to address in industrializing DSA. Yuan Bohu believes that “it’s definitely difficult for a single vendor to accomplish something like this on its own,” so unified IP interface specifications and interconnection standards are needed to ensure that DSA modules from different vendors work together seamlessly. Support for multiple process nodes is also crucial, so that the architecture can operate efficiently across processes such as 12nm, 16nm, and 28nm.

 

In terms of infrastructure development, Yuan Bohu proposed solutions such as a centralized data exchange architecture and a reliable interconnection mechanism. These technologies not only optimize the efficiency of data transmission among multiple cores but also enhance system stability. Moreover, a comprehensive simulation toolchain is crucial for accelerating chip design verification.

 

The development of a software ecosystem is equally crucial. Yuan Bohu emphasized, “We need to build an open ecosystem that allows different companies to contribute their own accelerator architecture designs.” Achieving this goal requires high-performance emulators and open software stacks that lower the barrier for developers and accelerate the deployment of applications. Yuefang Technology has already developed a high-performance emulator that supports the full RVA23 instruction set, giving early-stage software development a solid foundation.

 

Finally, Yuan Bohu noted that Yuefang Technology is taking active steps to build the RISC-V + DSA ecosystem. As a core initiator, it has established the RDSA Industry Alliance, billed as the world’s first industry innovation collaboration organization based on RISC-V DSA, which has already attracted more than 50 partners from China and abroad. Together, these partners are exploring key technologies such as heterogeneous computing interconnects and protocol-layer optimization, and jointly promoting the development of the industry ecosystem.

 

The RISC-V + DSA technology roadmap proposed by Yuefang Technology offers a new approach to the development of the chip industry. By jointly optimizing algorithms, instruction sets, and hardware, the DSA architecture can achieve breakthroughs in both energy efficiency and performance, and an open ecosystem will further accelerate the industrial adoption of these technologies. This technological path is expected to unlock even greater value in fields such as AI and high-performance computing.