Practical and Efficient Transfer Learning with Foundation Models for Medical Image Analysis

Time: Mon 2026-06-01 13.00

Location: Kollegiesalen, Brinellvägen 8, Stockholm

Video link: https://kth-se.zoom.us/s/61380111041

Language: English

Subject area: Computer Science

Doctoral student: Moein Sorkhei, Beräkningsvetenskap och beräkningsteknik

Opponent: Karim Lekadir, Universitat de Barcelona

Supervisors: Professor Kevin Smith, Beräkningsvetenskap och beräkningsteknik; Hossein Azizpour, SeRC - Swedish e-Science Research Centre

QC 20260508

Abstract

Foundation models – large-scale AI models trained on vast and diverse data – have reshaped modern artificial intelligence by enabling powerful knowledge transfer across tasks and domains. By learning broad, general-purpose representations, foundation models make it possible to adapt a single pre-trained model to a wide range of downstream applications. Despite this progress, translating these models into practical real-world impact remains challenging. Key challenges involve determining when knowledge transfer genuinely provides benefits, adapting ever-larger models within practical computational limits, and deploying them effectively in high-stakes, data-scarce environments such as clinical settings.

This thesis investigates methods for making transfer learning with modern foundation models more reliable, efficient, and applicable in real-world settings. It first addresses the challenge of determining when transfer learning is likely to be beneficial. Through a large-scale empirical study across diverse domains, tasks, and model architectures, the results show that existing transferability estimation metrics often fail to provide reliable guidance under realistic deployment conditions. To address this limitation, a simple and robust estimator is introduced to better predict the expected gains from transfer learning before costly adaptation is undertaken.

Despite the potential advantages of large foundation models, their practical use is often constrained by their scale. With billions of parameters, adapting these models to a specific downstream domain is computationally expensive and frequently infeasible under typical resource limitations. This thesis addresses this challenge by proposing an efficient adaptation strategy that substantially reduces computational cost while preserving performance. Furthermore, to broaden applicability in real-world settings, this efficient adaptation framework is extended to domains characterized by limited labeled data. In domains such as medical imaging, expert annotation is costly and scarce, while unlabeled data is more readily available. An adaptation approach is therefore developed that enables foundation models to be effectively aligned with medical domains in settings with limited labeled data.

Finally, the thesis focuses on a concrete clinical application in mammography, a domain characterized by limited public datasets and high annotation costs. A large-scale, expert-annotated dataset is introduced and a clinically motivated prediction task is defined, aimed at estimating cancer masking potential – an important factor affecting diagnostic reliability in breast cancer screening. Deep learning models are shown to effectively address this task, highlighting its relevance for supporting risk-informed clinical decision-making and real-world deployment.