Seminar: To the exascale and beyond
You can attend in person at KTH or via Zoom
Dr. Scott Klasky, an esteemed scientist from Oak Ridge National Laboratory in the USA, will talk about handling the data deluge now and as we progress beyond exascale.
Time: Mon 2023-09-25 13.00 - 14.00
Video link: https://kth-se.zoom.us/j/62718827021
Dr. Scott A. Klasky is a distinguished scientist and the group leader for Workflow Systems in the Computer Science and Mathematics Division at Oak Ridge National Laboratory. He also holds appointments at the University of Tennessee and the Georgia Institute of Technology. He obtained his Ph.D. in Physics from the University of Texas at Austin (1994). Dr. Klasky is a world expert in scientific computing, scientific data reduction, and scientific data management; he has co-authored over 300 papers and is a key leader of more than a dozen Department of Energy projects.
As the exascale era progresses, the community has already begun to ask what life after exascale will look like. HPC companies such as NVIDIA, with its Omniverse platform, envision a future in which teams work collaboratively across institutions on digital twins and navigate an ocean of data. Department of Energy laboratories aim to integrate research infrastructures, from experimental and observational instruments to computational and data facilities, into "superfacilities". The overarching vision behind the research and development that Dr. Klasky's group has pursued for the past 25 years is a software infrastructure that enables fast data movement, prioritizes information over data, and automates scientific processes for near-real-time scientific knowledge discovery.
The rapid growth in technology is providing unprecedented opportunities for science and putting data at the centre of scientific research. However, dealing with the deluge of data produced by scientific instruments has resulted in a crisis. Computer speeds are increasing much faster than network and storage capacities and I/O rates. As a result, processing data is becoming easy and cheap, while storing and moving data efficiently is becoming ever more challenging. The situation is worsening as the data production rates of experimental and observational facilities explode. For instance, the Vera Rubin Observatory (VRO) will collect more than 20 TB per night, while the Square Kilometre Array (SKA) will generate over 2 PB per night in 2028. This reality makes it critical for our community to
- create efficient mechanisms to move and store data in a Findable, Accessible, Interoperable, and Reusable (FAIR) fashion;
- create abstractions so that scientists can perform both online and offline analysis in an efficient way;
- create new algorithms that not only reduce/compress data but are also trusted by the scientific community for later post-processing, while reducing the memory footprint and the overall time spent in analysis; and
- create next-generation Workflow Management Systems (WMS) that utilize these technologies to enable a collaborative command-and-control framework for large-scale experiments and simulations such as ITER and XGC.
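The "reduce data yet remain trusted" requirement can be made concrete with a toy sketch. The code below is not MGARD or any production method; it is a minimal illustration of the contract such reducers provide: every reconstructed sample is guaranteed to lie within a user-specified absolute error bound. (All names here are illustrative.)

```python
import numpy as np

def quantize(data, error_bound):
    """Toy uniform scalar quantization: a bin width of 2*error_bound
    guarantees |x - dequantize(quantize(x))| <= error_bound per sample."""
    return np.round(data / (2.0 * error_bound)).astype(np.int64)

def dequantize(codes, error_bound):
    """Reconstruct an approximation from the integer codes."""
    return codes * (2.0 * error_bound)

rng = np.random.default_rng(0)
field = rng.normal(size=100_000)              # stand-in for simulation output
codes = quantize(field, error_bound=1e-3)     # compact integer representation
recon = dequantize(codes, error_bound=1e-3)

# The trust guarantee: pointwise error never exceeds the requested bound.
assert np.max(np.abs(field - recon)) <= 1e-3
```

Real reducers such as MGARD use multilevel decompositions rather than plain quantization, but the scientist-facing promise, a mathematically guaranteed error tolerance on the reconstructed data, is the same.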
This talk will focus on five of the group's software technologies, which have been integrated into many applications:
- ADIOS – an exascale I/O framework,
- SST – a data streaming technology integrated into ADIOS,
- MGARD – an error-controlled lossy compression framework for scientific data,
- EFFIS – a framework for composing and running coupled simulation workflows, and
- eSimMon – a collaborative, intuitive web application that allows real-time analysis of results produced by high-performance simulations.
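The streaming idea behind technologies like SST can be sketched in a few lines of plain Python. This is a conceptual illustration only, not the ADIOS/SST API: a producer yields one timestep at a time, and a consumer reduces each timestep to summary statistics in transit, so the full time series never has to be written to storage.

```python
import numpy as np

def simulation(steps, n, seed=0):
    """Producer (illustrative): yields one timestep of data at a time."""
    rng = np.random.default_rng(seed)
    for step in range(steps):
        yield step, rng.normal(loc=step, size=n)

def monitor(stream):
    """Consumer (illustrative): reduces each timestep immediately,
    keeping only a few statistics instead of the raw data."""
    summary = []
    for step, data in stream:
        summary.append((step, float(data.mean()), float(data.std())))
    return summary

stats = monitor(simulation(steps=5, n=10_000))
assert len(stats) == 5   # one summary record per timestep
```

In a real coupled workflow the producer and consumer are separate programs on separate resources, and the streaming layer handles data movement between them; the pattern of near-real-time, in-transit analysis is the same.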
Dr. Klasky will also discuss several R&D efforts his group is currently pursuing, with the aim of gaining key collaborators in these efforts.