Dardel Updates

Gert Svensson, PDC

A number of updates have been made to Dardel this year. The interconnect has been upgraded from 100 Gbit/s to 200 Gbit/s and is now called Slingshot 11, which meant replacing all the network adaptors with a new version. PDC has also upgraded the disk system, adding 50% more capacity for both data storage and metadata. The water used for cooling the first phase of the Dardel system (that is, the CPU partition) has been replaced with a glycol-water mix to prevent biological growth in the system without using any dangerous pesticides.

An HPE technician wearing protective gear while overseeing the process of changing the cooling water

Since the first phase of Dardel was installed, the number of CPU nodes has increased from 518 to 1270. PDC recently performed a new run of the High-Performance Linpack (HPL) benchmark on the upgraded system to check its performance. The measured performance of the Dardel CPU partition has increased from 2.28 to 4.08 PFLOPS. In the latest TOP500 list from May 2023, the Dardel CPU partition has risen to place 153 (see top500.org/system/180013) from place 345 in the previous (November 2022) list. The Dardel graphics processing unit (GPU) partition has an unchanged performance of 8.26 PFLOPS and is now in place 77 globally (see top500.org/system/180126) and in place 25 among European systems. The GPU partition is still the fifth most energy-efficient system in the world on the Green500 list (see top500.org/lists/green500/2023/06).
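To put these figures side by side, the short Python sketch below computes the growth factors from the numbers quoted above. The variable and function names are illustrative only. The article does not state how many nodes took part in each HPL run, so the ratios should not be read as a per-node efficiency measurement.

```python
# Figures quoted in the article for the Dardel CPU partition.
nodes_before, nodes_after = 518, 1270
hpl_before_pflops, hpl_after_pflops = 2.28, 4.08
rank_before, rank_after = 345, 153  # TOP500 places, Nov 2022 and May 2023 lists


def growth_factor(before, after):
    """Return how many times larger `after` is than `before`."""
    return after / before


node_growth = growth_factor(nodes_before, nodes_after)
hpl_growth = growth_factor(hpl_before_pflops, hpl_after_pflops)
places_gained = rank_before - rank_after

print(f"CPU nodes grew by a factor of {node_growth:.2f}")
print(f"HPL result grew by a factor of {hpl_growth:.2f}")
print(f"TOP500 places gained: {places_gained}")
```

With the quoted numbers, the node count grows by a factor of about 2.45, the HPL result by about 1.79, and the CPU partition moves up 192 places.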

What remains to be done in the near future is an update of the software for the Slingshot interconnect and the disk system, which will probably take place before the summer.

More generally, all the software in the system needs an update. PDC is discussing with HPE how to do this with minimal impact on users. One idea is to install a new small test system at PDC with the latest software pre-installed. This would make it possible to install and test the latest versions without any major impact on the current system. It would also simplify the installation and testing of future releases.