
Bayesian Causal Discovery and Object-Centric Representations

Challenges and Insights in Structured Learning

Time: Fri 2025-03-07 10.00

Location: E3, Rum 1563, Osquars backe 18, Campus

Video link: https://kth-se.zoom.us/j/68284213723

Language: English

Subject area: Computer Science

Doctoral student: Amir Mohammad Karimi Mamaghan, Automatic Control (Reglerteknik)

Opponent: Assistant Professor Francesco Locatello, Institute of Science and Technology (ISTA), Klosterneuburg, Austria

Supervisor: Professor Karl H. Johansson, Automatic Control (Reglerteknik); Professor Stefan Bauer, Technical University of Munich & Helmholtz AI, Munich, Germany



Abstract

Causality and Representation Learning are foundational to advancing AI systems capable of reasoning, generalizing, and understanding the complex structure of the world. Causality provides tools to uncover the underlying causal structure of a system, understand cause-effect relationships, and reason about interventions. Representation Learning, on the other hand, transforms raw data into structured abstractions essential for modeling the underlying system and for decision-making. Causal Representation Learning bridges these paradigms: it uses representation learning to extract high-level abstractions and entities, and integrates causal reasoning principles to uncover cause-effect relationships between those entities. This combination is crucial for real-world systems, where causal relationships are typically defined between high-level entities, such as objects or interactions, rather than low-level sensory inputs like pixels. This thesis, presented as a collection of two papers, explores two key questions: the challenges in evaluating Bayesian Causal Discovery, and the effectiveness of structured representations, with a focus on object-centric representations in visual reasoning.

In the first paper, we study the challenges in evaluating Bayesian Causal Discovery methods. Analyzing existing metrics on linear additive noise models, we find that they often fail to correlate with the true posterior in high-entropy settings, such as those with limited data or non-identifiable causal models. We highlight the importance of accounting for posterior entropy and recommend evaluating Bayesian Causal Discovery methods on downstream tasks, such as causal effect estimation, for more meaningful assessment in these scenarios.
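To make the high-entropy regime concrete, the following minimal sketch (not taken from the paper; the two-variable setup, sample size, and coefficient are illustrative assumptions) shows why linear models with Gaussian noise are not direction-identifiable from observational data: both DAGs, X → Y and Y → X, attain the same maximized likelihood, so a Bayesian posterior over the two graphs is close to uniform and its entropy is near log 2.

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear additive noise model X -> Y with Gaussian noise:
#   X = N_x,  Y = b * X + N_y
# Linear-Gaussian models are not direction-identifiable: the reverse
# factorization Y -> X fits the same bivariate Gaussian equally well.
n = 5000
b = 0.8
x = rng.normal(0.0, 1.0, n)
y = b * x + rng.normal(0.0, 1.0, n)

def marginal_loglik(v):
    """Max log-likelihood of a zero-mean Gaussian marginal."""
    return -0.5 * n * (np.log(2 * np.pi * v.var()) + 1.0)

def conditional_loglik(child, parent):
    """Max log-likelihood of a linear-Gaussian model parent -> child."""
    coef = np.dot(parent, child) / np.dot(parent, parent)
    resid = child - coef * parent
    return -0.5 * n * (np.log(2 * np.pi * resid.var()) + 1.0)

# Total log-likelihood of each DAG: marginal of the root plus
# conditional of the child.
ll_xy = marginal_loglik(x) + conditional_loglik(y, x)  # X -> Y
ll_yx = marginal_loglik(y) + conditional_loglik(x, y)  # Y -> X

# Posterior over the two DAGs under a uniform prior.
logp = np.array([ll_xy, ll_yx])
post = np.exp(logp - logp.max())
post /= post.sum()

# Entropy of the posterior (nats); log(2) ~ 0.693 means maximal
# uncertainty about the causal direction.
entropy = -np.sum(post * np.log(post))
print(post.round(3), round(entropy, 3))
```

In this regime a point-estimate metric such as structural Hamming distance to the true graph is uninformative, since even the exact posterior assigns the true DAG only probability one half.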

In the second paper, we investigate the effectiveness of object-centric representations in visual reasoning tasks, such as Visual Question Answering. We show that while large foundation models often match or surpass object-centric models in task performance, their less explicit representations require larger downstream models and more compute. Object-centric models, in contrast, provide more interpretable representations but struggle on more complex datasets. Combining object-centric representations with foundation models emerges as a promising solution, reducing computational cost while maintaining high performance. We also report further insights, such as the relationship between segmentation quality and downstream performance, and the effect of factors such as dataset size and question type, to deepen our understanding of these models.
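A back-of-the-envelope sketch of the compute argument above (all sizes here are hypothetical assumptions, not numbers from the paper): an object-centric encoder summarizes a scene as a handful of slot vectors, whereas a foundation model emits a dense patch grid, and a transformer-style downstream head pays attention cost quadratic in the number of tokens it consumes.

```python
def attn_ops(num_tokens: int) -> int:
    # Self-attention over t tokens scores every pair, so the cost of
    # one attention layer scales as t * t (hidden width factored out).
    return num_tokens * num_tokens

slots = 7         # assumed slot count for an object-centric encoder
patches = 16 * 16 # assumed ViT-style patch grid from a foundation model

ratio = attn_ops(patches) / attn_ops(slots)
print(f"dense features cost ~{ratio:.0f}x more attention ops per layer")
```

Under these assumed sizes the dense representation is three orders of magnitude more expensive per attention layer, which is one intuition for why foundation-model features tend to need larger downstream models than slot-based ones.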

urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-359733