Evaluating Imputation Strategies for Complex Longitudinal Cohort Studies: Accounting for Multi-Level Sampling and Cross-Domain Variable Relationships.
Longitudinal cohort studies are a cornerstone of population research, enabling the systematic investigation of temporal processes and developmental trajectories. These studies generate comprehensive datasets that span multiple domains, including demographic characteristics, social determinants, and health outcomes, with data collection protocols that can vary in both frequency and depth across the observation period. They also face persistent missing-data challenges arising from participant attrition, intermittent non-response, and death. Handling missing data appropriately is made more complex by multi-level sampling designs and the intricate relationships between variables across different domains. Despite the many imputation methods available, there is limited guidance on optimal approaches that simultaneously account for complex sampling designs and the diverse relationships between variables measured across different domains and time points.
Supervision – Sinéad Moylett (Primary), Blair Robertson (Secondary).
Research Interests – Applied Statistics, The Intersection of Music and Machine Learning, Longitudinal Health Studies, Missing Data, NLP, Logistic Regression.
Academic History – Bachelor of Data Science, Majoring in Data Science, University of Canterbury.
Professional History – TA, Course Facilitator for UC Online's DATA401, Course Development Work for UCIC’s new pre-master program, Secondary Supervision for DATA60X, Data Science Consultancy.