Optimizing Neural Network Models for Healthcare and Federated Learning
Time: Wed 2024-05-29 09.30
Location: Webinar, Sal C (Sven-Olof Öhrvik) at Electrum, Kungliga Tekniska högskolan, Kistagången 16, Kista
Video link: https://kth-se.zoom.us/j/64067570049
Language: English
Subject area: Computer Science
Doctoral student: PhD Student Giacomo Verardo, Network Systems Laboratory (NS Lab)
Opponent: PhD, Research Scientist Maxime Sermesant, INRIA, French National Institute for Research in Digital Science and Technology
Supervisor: Dejan Kostic, Network Systems Laboratory (NS Lab); Marco Chiesa, Network Systems Laboratory (NS Lab)
The research leading to this thesis is based upon work supported by the King Abdullah University of Science and Technology (KAUST) Office of Research Administration (ORA) under Award No. ORA-CRG2021-4699.
Abstract
Neural networks (NNs) have demonstrated considerable capabilities in tackling tasks in a diverse set of fields, including natural language processing, image classification, and regression. In recent years, the amount of data available to train Deep Learning (DL) models has increased tremendously, requiring ever larger models to learn the underlying patterns in the data. Inference time, communication cost in the distributed case, required storage resources, and computational demands grow with model size, making NNs less suitable for two cases: i) tasks requiring low inference time (e.g., real-time monitoring) and ii) training on low-powered devices. These two cases, which have become crucial in the last decade due to the pervasiveness of low-powered devices and NN models, are addressed in this licentiate thesis.
As the first contribution, we analyze the distributed case with multiple low-powered devices in a federated scenario. Cross-device Federated Learning (FL) is a branch of Machine Learning (ML) in which multiple participants train a common global model without sharing data in a centralized location. In this thesis, a novel technique named Coded Federated Dropout (CFD) is proposed to carefully split the global model into sub-models, thus increasing communication efficiency and reducing the burden on the devices with only a slight increase in training time. We showcase our results on an example image classification task.
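To make the sub-model idea concrete, the sketch below shows how a plain federated-dropout style split of a dense network could look: each client keeps only a subset of the hidden units, so its weight matrices (and the update it later sends back) shrink accordingly. This is only an illustrative sketch; the function name, the layer layout, and the random choice of kept units are assumptions, and the "coded" selection of masks across clients that gives CFD its name is not shown here.

import numpy as np

def extract_submodel(weights, keep_fracs, rng):
    """Slice a list of dense-layer weights into a smaller sub-model.

    weights    : list of (W, b) pairs, W of shape (n_in, n_out)
    keep_fracs : fraction of units to keep in each hidden layer
    rng        : numpy Generator drawing the (per-client) unit mask
    Returns the sliced weights plus the kept-unit indices, so a server
    could map the client's update back into the global model.
    """
    sub, masks = [], []
    in_idx = np.arange(weights[0][0].shape[0])        # keep all input features
    for layer, (W, b) in enumerate(weights):
        if layer < len(weights) - 1:                  # hidden layer: subsample units
            n_out = W.shape[1]
            keep = rng.choice(n_out, size=int(keep_fracs[layer] * n_out), replace=False)
        else:                                         # output layer: keep all classes
            keep = np.arange(W.shape[1])
        sub.append((W[np.ix_(in_idx, keep)], b[keep]))
        masks.append((in_idx, keep))
        in_idx = keep                                 # next layer sees only the kept units
    return sub, masks

Because each client trains on a smaller slice of the global weights, both the download and the upload in each federated round shrink roughly in proportion to the kept fractions.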
As the second contribution, we consider the anomaly detection task on Electrocardiogram (ECG) recordings and show that including prior knowledge in NN models drastically reduces model size, inference time, and storage requirements for multiple state-of-the-art NNs. In particular, this thesis focuses on autoencoders (AEs), a subclass of NNs well suited to anomaly detection. We propose a novel approach, called FMM-Head, which incorporates basic knowledge of the ECG waveform shape into an AE. The evaluation shows that we improve the AUROC of baseline models while guaranteeing sub-100 ms inference time, thus enabling real-time monitoring of ECG recordings from hospitalized patients.
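As a rough illustration of the prior-knowledge idea, the sketch below wires an encoder to a small head that predicts per-wave parameters (amplitude, centre, width) for the five ECG waves and reconstructs the beat analytically from them, using the reconstruction error as the anomaly score. It is a toy model under stated assumptions: the class and function names are hypothetical, and a sum of Gaussians stands in for the actual FMM waveform parametrization used by FMM-Head.

import torch
import torch.nn as nn

class WaveShapeAE(nn.Module):
    """Toy autoencoder whose decoder is an analytic waveform model.

    The encoder compresses an ECG beat; a small head predicts, for each
    of the five ECG waves (P, Q, R, S, T), an amplitude, a centre and a
    width; the decoder is a fixed sum of Gaussians built from those
    parameters, so every latent code maps to a physiologically shaped beat.
    """
    def __init__(self, beat_len=256, n_waves=5, latent=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(beat_len, 128), nn.ReLU(),
            nn.Linear(128, latent), nn.ReLU(),
        )
        self.head = nn.Linear(latent, n_waves * 3)    # (amplitude, centre, width) per wave
        self.register_buffer("t", torch.linspace(0.0, 1.0, beat_len))

    def forward(self, x):                              # x: (batch, beat_len)
        p = self.head(self.encoder(x)).view(x.shape[0], -1, 3)
        amp = p[..., 0:1]
        mu = torch.sigmoid(p[..., 1:2])                # wave centre in [0, 1]
        sig = nn.functional.softplus(p[..., 2:3]) + 1e-3
        waves = amp * torch.exp(-0.5 * ((self.t - mu) / sig) ** 2)
        return waves.sum(dim=1)                        # reconstructed beat

def anomaly_score(model, x):
    """Per-beat reconstruction error used as the anomaly score."""
    with torch.no_grad():
        return ((model(x) - x) ** 2).mean(dim=1)

In this toy setup, beats whose shape cannot be explained by the constrained wave parameters reconstruct poorly and therefore receive high scores, mirroring how shape priors are exploited for detection; the constrained decoder also keeps the model small and fast at inference time.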
Finally, several directions for future work are presented. The inclusion of prior knowledge can be further exploited in the ECG Imaging (ECGI) case, where hundreds of ECG sensors are used to reconstruct the 3D electrical activity of the heart. For ECGI, reducing the number of sensors employed (i.e., the input space) is also beneficial for reducing model size. Moreover, this thesis advocates additional techniques to integrate ECG anomaly detection into a distributed, federated setting.