
Homology and machine learning for materials informatics

Time: Fri 2023-03-24 15.00

Location: Hörsal 4, Hus 2, Albanovägen 18

Language: English

Subject area: Physics, Theoretical Physics

Doctoral student: Bart Olsthoorn, Nordic Institute for Theoretical Physics NORDITA, Condensed Matter Theory

Opponent: Professor O. Anatole von Lilienfeld, University of Toronto, Canada

Supervisors: Jens H. Bardarson, Condensed Matter Theory; Alexander V. Balatsky, Nordic Institute for Theoretical Physics NORDITA


QC 230227


Materials informatics is the field of study that combines materials science with modern data science. This data-driven approach is powered by the growing availability of computational power and storage capacity. The development and application of these methods accelerate materials science and offer an effective way to study and model material properties. This thesis is a compilation of theoretical and computational works that can be divided into three key areas: materials databases, machine learning for materials, and homology for materials.

Machine learning and data mining rely on the availability of materials databases to test methods and models. The Organic Materials Database (OMDB), for example, contains a large number of organic crystals and their corresponding electronic structures. The electronic properties of the organic crystals are computed using atomic-scale materials modelling, which is computationally expensive because organic crystals typically contain many atoms in the unit cell. Once computed, however, the resulting data can be used in a variety of materials informatics applications. We demonstrate data mining for dark matter sensors as an example application.

Accurate machine learning models can capture the structure-property relationship of materials and accelerate the discovery of new materials with desired properties. This is explored by investigating the properties of the organic crystals in the OMDB. For example, we employ supervised learning on the electronic band gap, an important material property for technological applications. Unsupervised learning is used to construct a dimensionality-reduced chemical space that reveals interesting clusters of materials.

Finally, persistent homology is a relatively new method from the field of algebraic topology that studies the shapes that are present in data at different length scales. In this thesis, the method is used to study magnetic materials and their phase transitions. More specifically, in the case of classical models, we use persistent homology to detect the phase transition directly from sampled spin configurations. For quantum spin models, the shapes in the entanglement structure are captured and a sudden change reveals a quantum phase transition.
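To give a feel for what "shapes at different length scales" means, the following is a minimal sketch of the simplest case, zero-dimensional persistent homology (connected components) of a small point cloud. It is a toy illustration and not the pipeline used in the thesis: the function name `h0_persistence`, the example points, and the convention that an edge enters the filtration at the full pairwise distance are all assumptions made here.

```python
from itertools import combinations
import math


def h0_persistence(points):
    """Death scales of connected components in a Vietoris-Rips filtration.

    Every point is born as its own component at scale 0. As the scale
    grows, edges appear in order of length; whenever an edge joins two
    distinct components, one of them dies at that length.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # merge the two components
            deaths.append(length)    # one component dies at this scale
    return deaths  # n - 1 finite deaths; the last component never dies


# Hypothetical example: two well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1)]
deaths = h0_persistence(points)
print(deaths)
```

In this example most components merge at a short scale, while one survives until a much larger scale; that long-lived component reflects the two separated groups. A sudden change in such persistence values is the kind of signal the abstract refers to when it speaks of detecting a phase transition.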

In summary, these three topics provide an overview of how to study material properties with modern data science methods. The tools can be used in combination with traditional methods in materials science to accelerate materials design.