TREX: CoE for Quantum Monte Carlo Applications
Dirk Pleiter, PDC
Almost a century after Erwin Schrödinger postulated the equation bearing his name, solving it in the many-body case remains a major challenge. Such solutions are nevertheless needed to describe the quantum mechanical electron problems that occur in quantum chemistry and condensed matter physics. Solving this equation is therefore of practical relevance in applied research, for example, for designing novel materials with specific properties. Numerical methods allow computers to help address this challenge, but the complexity of the problem is extremely high. To cope with it, either approximate deterministic methods or stochastic techniques are used. Quantum Monte Carlo (QMC) methods belong to the latter. With the help of QMC methods, properties of molecular and extended systems at the nanoscale can be computed with high accuracy. These types of simulations are highly computationally intensive and require supercomputers like those at PDC.
QMC methods have in common that they use Monte Carlo algorithms to evaluate integrals over a very large number of dimensions. Within such algorithms, numerical results are obtained by computing a very large number of random samples that are drawn from a defined statistical distribution. While this approach makes it possible to avoid approximations with difficult-to-control systematic errors, Monte Carlo techniques have two disadvantages: they converge slowly (the statistical error shrinks only with the inverse square root of the number of samples), and they therefore need large amounts of computational resources to keep statistical errors small. Fortunately, QMC methods are embarrassingly parallel and therefore scale very well on the largest available supercomputers. As a consequence, applications based on QMC methods are likely to become killer applications for upcoming exascale systems and already play an important role in different initiatives to prepare for the exascale era.
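The basic idea can be illustrated with a minimal sketch in Python. It is not taken from any TREX application or from QMCkl; the function names (mc_integrate, gaussian, combine), the test integrand, and the choice of six dimensions are assumptions made purely for illustration. The sketch estimates an integral over a unit hypercube from uniformly distributed random samples, which is the simplest possible sampling distribution, and reports the statistical error alongside the estimate. Because independent runs with different seeds can simply be averaged, the same sketch also hints at why such methods parallelize so easily.

```python
import math
import random


def mc_integrate(f, dim, n_samples, seed):
    """Estimate the integral of f over the unit hypercube [0, 1]^dim.

    Returns the estimate and its statistical (standard) error.
    """
    rng = random.Random(seed)  # private generator, so runs are independent
    total = 0.0
    total_sq = 0.0
    for _ in range(n_samples):
        x = [rng.random() for _ in range(dim)]  # uniform sample point
        value = f(x)
        total += value
        total_sq += value * value
    mean = total / n_samples
    variance = max(total_sq / n_samples - mean * mean, 0.0)
    std_error = math.sqrt(variance / n_samples)  # shrinks like 1/sqrt(N)
    return mean, std_error


def gaussian(x):
    """Illustrative integrand (an assumption): exp(-|x|^2)."""
    return math.exp(-sum(xi * xi for xi in x))


DIM = 6
# Exact value for comparison: the integral factorizes into 1D integrals.
EXACT = (0.5 * math.sqrt(math.pi) * math.erf(1.0)) ** DIM


def combine(results):
    """Average independent runs of equal size (mean, standard error)."""
    n = len(results)
    mean = sum(m for m, _ in results) / n
    std_error = math.sqrt(sum(e * e for _, e in results)) / n
    return mean, std_error


if __name__ == "__main__":
    print(f"exact value: {EXACT:.6f}")
    for n in (1_000, 10_000, 100_000):
        est, err = mc_integrate(gaussian, DIM, n, seed=1)
        print(f"N = {n:>7}: {est:.6f} +/- {err:.6f}")

    # Four independent runs, as they might execute on four processors.
    parts = [mc_integrate(gaussian, DIM, 25_000, seed=s) for s in range(4)]
    est, err = combine(parts)
    print(f"combined  : {est:.6f} +/- {err:.6f}")
```

Increasing the number of samples by a factor of 100 reduces the statistical error only by about a factor of 10, which is the slow convergence mentioned above. The independent runs in the last step need no communication until their results are averaged, which is what makes this style of computation embarrassingly parallel.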
In Europe, such an initiative is driven by the European Commission (EC) through its H2020 program and, more recently, through the EuroHPC Joint Undertaking (eurohpc-ju.europa.eu). Through this initiative, various Centres of Excellence (CoEs) have been created, several of them with the involvement of PDC. Most recently, PDC became part of the TREX CoE, where the acronym stands for “Targeting Real chemical accuracy at the EXascale”. It is coordinated by the University of Twente in the Netherlands and also involves partners from Austria, France, Germany, Italy, Poland, Slovakia, and the UK. Most of the partners are developers of QMC applications, while partners like PDC contribute their expertise in high-performance computing.
The most important goal of TREX is to prepare applications for future supercomputers, which is challenging for a variety of reasons. Supercomputer architectures have become significantly more complex and often also heterogeneous. For example, supercomputers like the new Dardel system at PDC leverage not only the compute capabilities of processors but also those of compute accelerators, such as graphics processing units (GPUs), which deliver most of the compute performance. Additionally, the technologies in modern supercomputers have become more diverse. To make investments in complex scientific software sustainable, not only code portability but also performance portability have become key concerns. An application is considered performance-portable if it is able to run efficiently on different architectures. Achieving this goal is particularly challenging because many of the codes currently in use were first developed decades ago, without any knowledge of today’s computer architectures and without the more recent software engineering concepts that make it easier to achieve portability and performance portability.
TREX brings together the developers of various QMC applications that are developed in Europe. This allows for a strategy where performance-critical routines, which several of these applications have in common, can be identified and moved into a library called QMCkl. This new library will not only be developed jointly but will also be developed with performance portability in mind. The ambition of TREX is to deliver applications that can run efficiently on a diverse set of future supercomputers, where the architectures may not even be known yet. These range from supercomputers like Dardel at PDC (with its new AMD GPUs) to future supercomputers with ARM-based processors such as those currently being developed by the European Processor Initiative.
For more information about the TREX CoE, visit trex-coe.eu or contact Dirk Pleiter (pleiter@kth.se).