High-Performance Network-on-Chip Design for Many-Core Processors
Time: Mon 2020-11-02 13.00
Location: zoom link for online defence (English)
Subject area: Information and Communication Technology
Doctoral student: Boqian Wang , Elektronik och inbyggda system, Network-on-Chip
Opponent: Associate Professor Kun-Chih Chen, Department of Computer Science and Engineering, National Sun Yat-Sen University, Taiwan
Supervisor: Professor Zhonghai Lu, Elektronik och inbyggda system
Abstract
With the development of on-chip manufacturing technologies and the requirements of high-performance computing, the core count is growing quickly in Chip Multi/Many-core Processors (CMPs) and Multiprocessor System-on-Chip (MPSoC) to support larger scale parallel execution. Network-on-Chip (NoC) has become the de facto solution for CMPs and MPSoCs in addressing the communication challenge. In the thesis, we tackle a few key problems facing high-performance NoC designs.
For general-purpose CMPs, we encompass a full system perspective to design high-performance NoC for multi-threaded programs. By exploring the cache coherence under the whole system scenario, we present a smart communication service called Advance Virtual Channel Reservation (AVCR) to provide a highway to target packets, which can greatly reduce their contention delay in NoC. AVCR takes advantage of the fact that we can know or predict the destination of some packets ahead of their arrival at the Network Interface (NI). Exploiting the time interval before a packet is ready, AVCR establishes an end-to-end highway from the source NI to the destination NI. This highway is built up by reserving the Virtual Channel (VC) resources ahead of the target packet transmission and offering priority service to flits in the reserved VC in the wormhole router, which can avoid the target packets’ VC allocation and switch arbitration delay. Besides, we also propose an admission control method in NoC with a centralized Artificial Neural Network (ANN) admission controller, which can improve system performance by predicting the most appropriate injection rate of each node using the network performance information. In the online control process, a data preprocessing unit is applied to simplify the ANN architecture and make the prediction results more accurate. Based on the preprocessed information, the ANN predictor determines the control strategy and broadcasts it to each node where the admission control will be applied.
For application-specific MPSoCs, we focus on developing high-performance NoC and NI compatible with the common AMBA AXI4 interconnect protocol. To offer the possibility of utilizing the AXI4 based processors and peripherals in the on-chip network based system, we propose a whole system architecture solution to make the AXI4 protocol compatible with the NoC based communication interconnect in the many-core system. Due to possible out-of-order transmission in the NoC interconnect, which conflicts with the ordering requirements specified by the AXI4 protocol, in the first place, we especially focus on the design of the transaction ordering units, realizing a high-performance and low cost solution to the ordering requirements. The microarchitectures and the functionalities of the transaction ordering units are also described and explained in detail for ease of implementation. Then, we focus on the NI and the Quality of Service (QoS) support in NoC. In our design, the NI is proposed to make the NoC architecture independent from the AXI4 protocol via message format conversion between the AXI4 signal format and the packet format, offering high flexibility to the NoC design. The NoC based communication architecture is designed to support high-performance multiple QoS schemes. The NoC system contains Time Division Multiplexing (TDM) and VC subnetworks to apply multiple QoS schemes to AXI4 signals with different QoS tags and the NI is responsible for traffic distribution between two subnetworks. Besides, a QoS inheritance mechanism is applied in the slave-side NI to support QoS during packets’ round-trip transfer in NoC.