Automated Optimizations for Inference in Probabilistic Programming Languages
Time: Mon 2024-10-14 13.00
Location: Room Ada, Electrum, Kistagangen 16
Video link: https://kth-se.zoom.us/j/66846620144
Language: English
Subject area: Computer Science
Doctoral student: Gizem Çaylak , Programvaruteknik och datorsystem, SCS
Opponent: Senior Lecturer Christoph Reichenbach, Lund University, Lund, Sweden
Supervisor: Professor David Broman, Programvaruteknik och datorsystem, SCS; Professor Magnus Boman, Karolinska Institutet, Stockholm, Sweden; Emma Granqvist, Naturhistoriska Riksmuseet, Stockholm, Sweden
QC 20240918
Abstract
Probabilistic programming languages (PPLs) provide users with an interface to write probabilistic models and leave the statistical inference to the compiler and runtime systems. Ideally, the purpose of PPLs is to provide a seamless inference interface that makes the model and the inference parts modular. Yet, the structure of the user-defined model affects inference accuracy and execution time. Users can utilize certain structures, such as conjugate prior relations or tree structures within a given model, to improve inference efficiency; however, manually utilizing these structures in a model is time-consuming and error-prone. Especially as the PPL's expressiveness increases, the models may become too complex to rewrite. This thesis tackles the problems related to the automation of optimizations, improving the execution time and the accuracy of the inference based on the structure of the model. Further, it aims to make these optimizations expressive enough to handle complex models written in universal PPLs.
The scope of this thesis is to optimize Bayesian inference in universal PPLs. The three contributions presented propose automated optimization approaches to improving the inference accuracy and execution time while preserving the expressiveness of the optimized program. The first contribution considers dynamic delayed sampling, a runtime algorithm to reduce inference variance through analytical relationships between random variables. We develop a compile-time version of the delayed sampling algorithm to reduce execution overhead. The second contribution regards the design and implementation of dynamic delayed sampling in a statically typed universal PPL, Miking CorePPL. Integrating delayed sampling into a statically typed PPL requires introducing additional constructs to the language. The third contribution is related to the forward pass of belief propagation algorithm, an efficient inference algorithm on tree graphs. We propose an approach automating this forward pass of belief propagation in statically typed universal PPLs to optimize likelihood calculations by utilizing the tree structures in a model. We evaluate the proposed approaches on real-world problems, such as topic modeling and phylogenetic problems.