Dardel Cloud
Dirk Pleiter, PDC
In the past, the primary role of high-performance computing (HPC) centres was to operate and provide access to one or more supercomputers. However, as HPC systems grow in complexity and capability, and as the ways in which they are used become more demanding, the role of HPC centres is changing. In the future, these centres will become service providers whose services are based on several different types of underlying computing and storage resources, rather than on a single supercomputer. While the resources on offer will continue to include HPC systems, adding on-premises private cloud instances will become the default. (For anyone unfamiliar with the terminology, a “private cloud” is a set of computational and data storage resources based on the same technologies as those used by big public cloud providers, such as Amazon Web Services (AWS). AWS provides on-demand computing capacity, data storage and software on a pay-as-you-go basis to companies, government organisations and individuals through a global network of “farms”, each containing many computers. A private cloud, in contrast, uses a different business model, and its infrastructure is located on the premises of the organisation that runs it.)
With these architectural changes, HPC centres are responding to the changing needs of existing users, as well as to those of researchers in emerging science and engineering domains who need HPC resources for their work. One trend is the effort by HPC centres to establish domain-specific platform services. An example is the brain research community, which is establishing a pan-European research infrastructure called EBRAINS. The services offered by this platform need to be deployed in an environment that is both flexible and stable, which is best realised through the use of virtual servers. Amongst other things, EBRAINS will allow end users to run HPC workflows through open portal services. Another trend is the emergence of workflows that extend beyond a single HPC centre. Cloud instances allow the deployment of domain-specific services that effectively extend a data centre and make it possible, for example, to connect HPC model simulations or data processing to external data sources.
PDC is in the process of deploying such a private cloud instance, which will be known as the Dardel Cloud. It will feature more than a thousand physical CPU cores, which can be offered to users as an even larger number of virtual CPUs. This not only adds a significant amount of compute capability, but also enables researchers and other service providers to deploy their services in a robust environment. These services could range from Jupyter Notebook services for training, through database services, to data analytics workflow services.
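To illustrate how a fixed number of physical cores can be offered as a larger number of virtual CPUs, here is a minimal Python sketch of the underlying arithmetic. The core count of 1024, the overcommit ratios and the instance size are purely hypothetical values chosen for illustration. They are not taken from the Dardel Cloud configuration, and the function names are likewise made up for this example.

```python
import math


def virtual_cpus(physical_cores: int, overcommit_ratio: float) -> int:
    """Return how many virtual CPUs can be offered for a given overcommit ratio.

    An overcommit ratio of 2.0 means that, on average, two virtual CPUs
    share one physical core.
    """
    if physical_cores < 0:
        raise ValueError("physical_cores must not be negative")
    if overcommit_ratio < 1:
        raise ValueError("overcommit_ratio must be at least 1")
    return math.floor(physical_cores * overcommit_ratio)


def max_instances(physical_cores: int, overcommit_ratio: float,
                  vcpus_per_instance: int) -> int:
    """Return how many virtual servers of a given size fit into the vCPU budget."""
    if vcpus_per_instance <= 0:
        raise ValueError("vcpus_per_instance must be positive")
    return virtual_cpus(physical_cores, overcommit_ratio) // vcpus_per_instance


if __name__ == "__main__":
    cores = 1024  # hypothetical number of physical cores
    for ratio in (1.0, 2.0, 4.0):  # hypothetical overcommit ratios
        print(f"ratio {ratio}: {virtual_cpus(cores, ratio)} vCPUs, "
              f"{max_instances(cores, ratio, 8)} instances with 8 vCPUs each")
```

Overcommitting works well for services that are idle much of the time, such as portals or notebook servers, because they rarely need all of their virtual CPUs at once. Compute-intensive workloads generally benefit from a ratio close to one.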
Another part of the Dardel Cloud system will be a large-scale storage system based on a technology called Ceph. Ceph comes with interfaces that are widely used by the big public cloud providers and that, unlike the technology used for Dardel's parallel file system, Klemming, are designed to support the sharing of data. One reason for using Ceph is that it supports authentication mechanisms which make it easy to integrate with providers of virtual identities, so there is no need to create local accounts for new users. For example, SUNET (the Swedish University Computer Network, which provides high-speed internet access to academic institutions in Sweden) offers a service known as SWAMID through which any Swedish researcher can obtain such a virtual identity. While the Klemming-based storage for Dardel is optimised for performance, the Ceph storage will be optimised for capacity and can be easily extended. PDC is thus laying the foundations for coping with the increasing demands for flexible data management as well as for storage capacity.
Researchers who are interested in obtaining early access to the Ceph system are warmly encouraged to contact PDC. See www.pdc.kth.se/support/documents/contact/contact_support.html for information about contacting PDC.