Variational Mixtures and Multi-Marginal Flow Matching
Advancing Statistical Inference with Biological Applications
Time: Fri 2026-05-22 14.00
Location: Kollegiesalen, Brinellvägen 8, Stockholm
Language: English
Subject area: Computer Science
Doctoral student: Oskar Kviman, Beräkningsvetenskap och beräkningsteknik
Opponent: Professor Ben Raphael, Princeton University, Princeton, NJ, USA
Supervisor: Professor Jens Lagergren, Beräkningsvetenskap och beräkningsteknik
QC 20260427
Abstract
In this thesis I develop methods for statistical inference when the distributions arising from complex biological systems are multi-modal, geometrically structured, and sometimes defined only up to a normalizing constant. I start from variational inference and, when analytic update equations are unavailable, move to black-box variational inference. To build intuition about the inference challenges and the proposed methodologies, I introduce a novel unnormalized target density (the CoLN distribution) and reuse it as a controlled test case throughout the kappa (the thesis's introductory summary chapter). I then trace a trajectory of increasingly expressive approximations: ensembles evaluated with the multiple importance sampling ELBO (Paper A) and variational mixtures that automate component cooperation and exploration (Paper B). Because expressivity comes at a computational cost, I develop techniques for efficient mixture learning, including Monte Carlo estimators of the objective that allow mixture learning to scale (Paper C). As a new result in the kappa, I overturn a three-decade-long misconception regarding the potential performance benefits of using mixtures in variational inference. Finally, I move from variational inference to flow matching, where I address the need for specialized treatment of interpolant learning in multi-marginal settings (Paper D). By combining insights from Papers A–D, I derive in Section 5.5 a new method: multi-marginal flow matching with mixtures of variational interpolants. I connect these methodological developments to biological applications, with special emphasis on three-dimensional spatial transcriptomics, where stacked tissue slices induce multi-modal dynamics across space.