About the Dardel Cloud
The Dardel Cloud compute and storage platform is currently being established. Pilot testing of initial stages of the cloud is underway.
Why build a Dardel Cloud?
The role of high-performance computing (HPC) centres, like PDC, is changing as the complexity and capabilities of HPC systems increase and as our utilisation of them becomes more demanding. To date, the primary role of HPC centres has been to operate and provide access to one or more supercomputers. However, in the future, these centres will transform into service providers where the services they offer are based on different types of underlying computing and storage resources, rather than a single supercomputer. Consequently, HPC centres will also provide services based on cloud technologies. By establishing the Dardel Cloud, PDC is beginning this transition.
What is the Dardel Cloud?
The Dardel Cloud is a secure computation and storage platform that can be used for research, either on its own or together with the Dardel HPC system.
The Dardel Cloud is being established in several stages. The first stage of the cloud offers Infrastructure as a Service (IaaS) in the form of virtual machines (VMs) of different sizes and operating environments. It will also include a large-scale storage system with user interfaces that will make it easier to share research data in accordance with the FAIR principles, which aim to ensure that data is “Findable, Accessible, Interoperable and Reusable”. (This is in contrast to Lustre, the technology used for Dardel’s parallel file system, which is designed for fast data access to optimise computational performance.) The plan is for more types of services, such as container runtime environments, to be added to the Dardel Cloud in the future, and for the cloud storage to be extended.
Services offered via the Dardel Cloud
Computation
The Dardel Cloud will feature 8 TB of memory and over a thousand physical CPU cores, which can be exposed through virtual machines as an even larger number of virtual CPUs. The VMs in the Dardel Cloud are hosted on servers that have AMD EPYC 7713 processors. These servers run the hypervisor software that creates and runs the VMs. The hypervisor machines are interconnected via a network with a top speed of 100 Gbps and are connected to the KTH campus network at the same speed.
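To give a feel for these figures, the following minimal Python sketch estimates how many virtual CPUs a pool of physical cores could provide and how long a data transfer would take at the stated 100 Gbps top speed. The overcommit ratio, the efficiency factor and the function names are illustrative assumptions; they are not published Dardel Cloud settings, and 1000 cores is used only as a lower bound for "over a thousand".

```python
def virtual_cpus(physical_cores: int, overcommit_ratio: float) -> int:
    """Estimate the number of vCPUs available for a given overcommit ratio.

    The overcommit ratio (vCPUs per physical core) is an assumed,
    hypothetical value; the real ratio is a deployment choice.
    """
    if physical_cores <= 0 or overcommit_ratio <= 0:
        raise ValueError("cores and ratio must be positive")
    return int(physical_cores * overcommit_ratio)


def transfer_seconds(size_bytes: float, link_gbps: float,
                     efficiency: float = 1.0) -> float:
    """Estimate the time to move size_bytes over a link of link_gbps.

    efficiency (0 < efficiency <= 1) is an assumed fraction of the top
    speed that is actually achieved; 1.0 gives the best case.
    """
    if size_bytes < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid size, link speed or efficiency")
    bits = size_bytes * 8
    return bits / (link_gbps * 1e9 * efficiency)


if __name__ == "__main__":
    # Lower bound of 1000 cores with an assumed 2:1 overcommit ratio.
    print(virtual_cpus(1000, 2.0))        # 2000
    # Best case for 1 TB (10**12 bytes) at the 100 Gbps top speed.
    print(transfer_seconds(1e12, 100))    # 80.0
```

In practice, sustained throughput is usually below the top speed of the link, so the second figure should be read as a lower bound on the transfer time.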
Researchers will be able to choose between various types of VMs that feature different numbers of CPU cores and different memory capacities to suit a range of computational needs. This will enable researchers and other service providers to deploy their services in a robust environment. (Some examples of such services include Jupyter Notebook services for training, database services and data analytics workflow services.) People who use the Dardel Cloud will also be able to use Floating IPs (a type of virtual IP address) to make their portals and web services publicly available to other users.
Data storage
- The Dardel Cloud platform includes a large-scale object store based on Ceph (a highly scalable open-source software-defined storage solution designed to address today’s high-growth storage requirements).
- The initial gross capacity of this storage is slightly above 1 PB, and the capacity is likely to be extended in the future.
- This object store will initially be made available via an S3 interface.
- It is easy to access the Dardel Cloud as new users do not have to create an account on the cloud if they already have a virtual identity, such as a SWAMID identity provided by the Swedish University Computer Network (SUNET). The underlying Ceph technology supports identity authentication mechanisms which can be integrated with various virtual identity providers.
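Because the object store is exposed via an S3 interface, standard S3 client libraries should in principle be able to talk to it. The sketch below uses the third-party boto3 library to upload a file and list the objects stored under a project prefix. The endpoint URL, bucket name, credential handling and the "project/file" key layout are all hypothetical placeholders, not actual Dardel Cloud values.

```python
import os


def object_key(project: str, local_path: str) -> str:
    """Build an object key of the form '<project>/<file name>'.

    This key layout is an illustrative convention, not a requirement.
    """
    return f"{project}/{os.path.basename(local_path)}"


def make_s3_client(endpoint_url: str, access_key: str, secret_key: str):
    """Create an S3 client that talks to a custom (non-AWS) endpoint."""
    import boto3  # third-party package: pip install boto3

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def upload_and_list(client, bucket: str, project: str,
                    local_path: str) -> list[str]:
    """Upload one file, then return the keys stored under the project."""
    key = object_key(project, local_path)
    client.upload_file(local_path, bucket, key)
    response = client.list_objects_v2(Bucket=bucket, Prefix=f"{project}/")
    return [item["Key"] for item in response.get("Contents", [])]
```

A call might look like `upload_and_list(make_s3_client("https://s3.example.org", access_key, secret_key), "my-bucket", "climate-run-01", "result.nc")`, where every argument is a placeholder to be replaced with the values issued for a real project.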
Other services currently available for or through the Dardel Cloud
- user support
- information about project quotas
Note that there is a wide range of possible services that may be included in the Dardel Cloud services catalogue in the future, such as a container orchestration platform, high availability services, load balancing and big data analytics.
What research is the Dardel Cloud best suited to?
- Research projects needing cloud computing and/or cloud storage resources, particularly where scientific computing assistance from PDC’s application experts/research software engineers would be beneficial
- Research projects whose workflows require cloud computing and/or cloud storage services to be used in conjunction with HPC resources that are in close proximity to the cloud services so there are very fast connections between the cloud services and the HPC resources (like the CPU and/or GPU partitions of the main Dardel HPC system at PDC)
- Research projects with workflows that extend beyond a single HPC centre, such as where cloud instances are used to connect HPC model simulations or data processing to external data sources
  - As an example, the Dardel Cloud will enable researchers to run their HPC or AI workloads on resources at any National Academic Infrastructure for Supercomputing in Sweden (NAISS) site and then transfer the resulting data to the Dardel Cloud’s Ceph storage. Then the data can be used in tasks like post-processing, data visualisation (that is, creating images, graphs or maps to help people visualise the “meaning” of the data), or displaying aspects of the data on a website using a web application (for instance, for making weather forecast maps available to the general public).
What type of researcher is the Dardel Cloud for?
The Dardel Cloud can be used by any researchers who are using HPC systems or data storage for their research. However, it is particularly suitable for researchers who wish to build and run their own application stacks in dedicated virtual environments, either directly on virtual machines or in a containerised environment.
What can researchers do with the Dardel Cloud?
People using the Dardel Cloud will be able to
- deploy domain-specific portal services,
- host research-related websites and web applications,
- connect web portals to the Dardel Ceph storage backend via the S3 storage gateway,
- run containerised applications using container technologies such as Docker and Singularity,
- run workflows that are less compute-intensive than those suited to the Dardel HPC system,
- set up test and development environments,
- utilise virtual machines for hosting databases,
- implement data curation pipelines and data sharing services, and
- undertake interactive and collaborative work using frameworks like Jupyter notebooks.
Using the Dardel Cloud
When will the Dardel Cloud be available?
The first stage of the Dardel Cloud is expected to be available for general use in the autumn of 2024. For the time being, the service is in a pilot mode. Any researchers who are interested in obtaining early access to it should contact PDC Support (see www.pdc.kth.se/support/documents/contact/contact_support.html).
Information about how to use the Dardel Cloud
Information about using the Dardel Cloud will be available on the PDC Support pages later in 2024. Please check back here later for the link.