Correcting Biases in Multi-Module Neural Networks, Through Efficient Hyperparameter Optimization and Statistically Meaningful Uncertainty Quantification, with Applications to Neurological Disorders
Abstract
Deep learning is a branch of machine learning that employs artificial neural networks to produce inferences from data. These networks have been successfully applied to a wide range of clinical and biological problems, including prognosis and diagnosis for a broad spectrum of diseases. However, many applications have focused on single tasks and/or single-modality data, and thus use networks consisting of a single module. Modules are sections of a complete network, each of which carries out a specific task. Multi-module artificial neural networks take inspiration from mammalian brains, which process different types of input via dedicated brain regions (e.g., optical information by the visual cortex; sound and language via Broca's and Wernicke's areas) and subsequently integrate them into a unified representation of our surroundings. While much work has been done on networks with a single module, significantly less work has focused on multi-module networks. Such networks are especially important in clinical problems that require integrating multi-modal information and/or extracting multiple representations from the same input to achieve high predictive performance. This dissertation corrects biases in multi-module networks for clinically relevant neurological problems. It develops novel hyperparameter search strategies that significantly improve the performance of multi-module networks for diagnosis and prognosis, and it adds uncertainty quantification so that multi-module mixed effects deep learning models can produce statistically meaningful measures of covariate significance and principled probabilistic prediction confidence. Concretely, in this dissertation, I demonstrate the ability of multi-module deep learning networks to integrate spatial and temporal information and automate the detection of artifacts in magnetoencephalography (MEG) brain recordings of subjects, including control subjects and those with a head injury.
Recognizing how important optimization of the network architecture was to achieving high predictive performance motivated the development of a novel module adaptive (MA) hyperparameter optimization framework, which increases the efficiency of architecture optimization for multi-module networks. This approach is demonstrated to identify better architectures than other search strategies and to significantly increase the predictive performance of Alzheimer's and Parkinson's disease prognoses. Finally, I empower mixed effects deep learning (MEDL) models, which explicitly use multi-module networks, with uncertainty quantification, allowing for the calculation of fundamental statistical metrics of model fit, covariate coefficient estimation, and prediction confidence. This model is then applied to predict which subjects will convert from mild cognitive impairment to Alzheimer's disease.