Dardel is now working as it should
Dardel was upgraded to a different cluster management system called HPCM some time ago. The upgraded system also has a newer version of the Cray programming environment (cpe/23.12), as well as newer versions of most other system software.
The upgrade caused some problems, which now appear to be resolved. One of them was that jobs for certain applications and large use cases would hang. HPE identified the cause and suggested a workaround, and affected users have been informed about it. If you experience hanging jobs, please contact PDC support to test the workaround.
In general, most software that runs on more than one node needs to be recompiled after the upgrade. PDC has recompiled or reconfigured most of the standard applications for the new software stack. If you develop your own software or maintain a special version of a standard package, you may need to recompile it yourself.
The general GPU partition currently supports only ROCm 5.7.0, since this is the only version HPE supports on this software release. Please contact PDC support if you need ROCm 6.0 for your research.
To access the new software, you need to load the “PDC/23.12” (or default PDC) module. For example:
    ml PDC
    ml av openfoam
    --------------- /pdc/software/PDC/23.12/other ---------------
    openfoam/v2312    openfoam/6    openfoam/9    openfoam/11 (D)
Note: The modules PDC/23.03 and PDC/22.06 contain software that cannot be used on the current system, so they have been renamed to PDCOLD.
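Job scripts written before the upgrade may still load PDC/23.03 or PDC/22.06 by name. The following is a minimal sketch of one way to find such lines; the function name find_old_pdc_modules and the example path are hypothetical and not part of any PDC tooling, and the search relies only on standard grep.

```shell
#!/bin/sh
# Hypothetical helper (not a PDC tool): list lines in the given files
# that still reference the retired PDC/23.03 or PDC/22.06 modules.
find_old_pdc_modules() {
    # -n prints line numbers, -E enables extended regular expressions.
    # The extra /dev/null argument makes grep print the file name
    # even when only one file is passed.
    grep -nE 'PDC/(23\.03|22\.06)' "$@" /dev/null
}
```

It could be called as find_old_pdc_modules ~/jobs/*.sh (the path is only an illustration). Each match is printed as file:line:text, and any line it reports would need to be changed to load PDC/23.12 or the default PDC module instead.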
Please contact PDC support if you notice any applications that are missing from PDC/23.12 or if you encounter any other issues.