Inter and intra-tumor models of somatic evolution in cancer
Time: Thu 2023-02-23 14.00
Location: Air & Fire, Scilifelab, Tomtebodavägen 23A, 171 65 Solna
Subject area: Applied and Computational Mathematics
Doctoral student: Mohammadreza Mohaghegh Neyshabouri, Beräkningsvetenskap och beräkningsteknik (CST)
Opponent: Professor Ben Raphael, Princeton University, Princeton, NJ, USA
Supervisor: Professor Jens Lagergren, Beräkningsvetenskap och beräkningsteknik (CST)
Cancer is a disease caused by the accumulation of somatic mutations in an evolutionary process. Mutations in so-called cancer driver genes give the cells that harbor them selective advantages and drive cancer progression. Identifying the driver genes and their interrelations is critical for a wide range of research and clinical applications. This thesis investigates the problem of modeling the dynamics of cancer evolution using probabilistic cancer progression models. Such models aim to explain how mutations accumulate in tumor cells and how specific mutations may promote or inhibit one another. We introduce a set of computational methods that analyze cross-sectional data from a cohort of tumors and infer the interrelations among cancer driver genes, represented by a graphical structure over those genes.
In our first two papers, following the typical setting in studies of cancer progression models, we use a simple representation in which each tumor is modeled by a single genotype vector. We introduce a pathway linear progression model in the first paper and a generalized tree-structured model in the second. Using novel dynamic programming procedures for calculating the likelihoods, we build Markov chain Monte Carlo (MCMC) inference algorithms for both models. These fast and efficient MCMC algorithms enable us to study massive datasets that previously introduced methods could not feasibly analyze.
In our third paper, we introduce a framework that takes a finer representation of the tumors into account when inferring progression models. With the rapid improvements in the amount and quality of available data, we can now work with vast numbers of reliably reconstructed tumor clonal trees. Our method takes such clonal trees from cohorts of tumors as its input and identifies the interrelations among the driver genes within a single tumor or across different tumors. We propose an MCMC algorithm with guided proposal distributions, which substantially increase the algorithm's efficiency in exploring high-probability regions. Together, the rich input data and the computationally efficient algorithm yield promising results in experiments on synthetic and biological data.