
Quantifying Initial-Sample Sensitivity in Active Learning

22 May 2026

Host Faculty: Engineering

General Subject Area: Machine Learning

Project Level: Master's


Overall Goal:

-----------------

To investigate how the initial choice of training data affects the performance and reliability of active learning methods, and to identify which methods are more robust when early data are biased or unrepresentative.

Details:

---------

Machine-learning models often start with only a small amount of labeled data, which is then expanded gradually as new data are collected. If the new data are collected strategically to best inform the model, the process is called “active learning”. In practice, the very first samples used for training can strongly influence both the success of active learning methods and the resulting model. If these early samples are biased, incomplete, or unrepresentative, the model may perform poorly even after more data are added.

This project studies how sensitive different data-efficient learning strategies are to their initial training data. The student will systematically construct different types of “problematic” initial datasets (for example, datasets that miss certain regions of the feature space, over-represent common outcomes, or exclude rare but important cases) and measure how these choices affect learning performance.
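As a rough illustration of what such “problematic” initial datasets could look like, the following Python sketch draws three kinds of initial samples from a synthetic pool: a uniform random baseline, a sample that misses one region of the feature space, and a sample that over-represents one class. The function names, the synthetic data, and the parameters (`threshold`, `majority_fraction`) are assumptions made for this example and are not prescribed by the project.

```python
import numpy as np


def random_initial(n_pool, size, rng):
    """Baseline: draw the initial sample uniformly at random from the pool."""
    return rng.choice(n_pool, size=size, replace=False)


def region_excluded_initial(X, size, rng, feature=0, threshold=0.0):
    """Draw only from points whose chosen feature lies below the threshold,
    so the region at or above the threshold is missing from the sample."""
    eligible = np.flatnonzero(X[:, feature] < threshold)
    return rng.choice(eligible, size=size, replace=False)


def majority_skewed_initial(y, size, rng, majority_label, majority_fraction=0.9):
    """Over-represent one label: a fixed fraction of the sample has it."""
    n_major = int(round(size * majority_fraction))
    major = np.flatnonzero(y == majority_label)
    minor = np.flatnonzero(y != majority_label)
    return np.concatenate([
        rng.choice(major, size=n_major, replace=False),
        rng.choice(minor, size=size - n_major, replace=False),
    ])


# Synthetic two-feature pool with a binary label (illustrative data only).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

idx_random = random_initial(len(X), 20, rng)
idx_region = region_excluded_initial(X, 20, rng, feature=0, threshold=0.0)
idx_skewed = majority_skewed_initial(y, 20, rng, majority_label=0,
                                     majority_fraction=0.9)
```

Each function returns indices into the pool, so the same downstream experiment can be run unchanged on every type of initial sample.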

Depending on the student’s interests, the project can focus on classification tasks, regression tasks (predicting continuous values), or both. We will start with a small number of sufficiently large but manageable datasets and use them to simulate active learning cycles under different strategies for drawing the initial sample.
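One way such a simulation could be set up is sketched below in Python. It is a minimal sketch under several assumptions: the pool is fully labeled and labels are revealed only when queried, the model is a small logistic regression written in NumPy, and the query rule is uncertainty sampling (querying the point whose predicted probability is closest to 0.5). None of these choices is fixed by the project; they only serve to show how learning curves from a random and a biased initial sample can be compared.

```python
import numpy as np


def sigmoid(z):
    # Clip to avoid overflow in exp for very large |z|.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def fit_logistic(X, y, steps=300, lr=0.5):
    """Fit logistic regression with plain gradient descent (bias included)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        w -= lr * Xb.T @ (sigmoid(Xb @ w) - y) / len(y)
    return w


def predict_proba(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return sigmoid(Xb @ w)


def simulate_active_learning(X_pool, y_pool, X_test, y_test,
                             initial_idx, n_queries):
    """Return test accuracy after the initial fit and after each query."""
    labeled = list(initial_idx)
    unlabeled = np.setdiff1d(np.arange(len(X_pool)), labeled)
    accuracies = []
    for step in range(n_queries + 1):
        w = fit_logistic(X_pool[labeled], y_pool[labeled])
        accuracies.append(np.mean((predict_proba(w, X_test) >= 0.5) == y_test))
        if step == n_queries or len(unlabeled) == 0:
            break
        # Uncertainty sampling: query the point closest to p = 0.5.
        p = predict_proba(w, X_pool[unlabeled])
        pick = int(np.argmin(np.abs(p - 0.5)))
        labeled.append(unlabeled[pick])
        unlabeled = np.delete(unlabeled, pick)
    return accuracies


# Synthetic data (illustrative only), split into a pool and a test set.
rng = np.random.default_rng(0)
X = rng.normal(size=(700, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X_pool, y_pool, X_test, y_test = X[:500], y[:500], X[500:], y[500:]

# Random initial sample versus one restricted to the region x0 < 0.
init_random = rng.choice(len(X_pool), size=10, replace=False)
init_biased = rng.choice(np.flatnonzero(X_pool[:, 0] < 0), size=10,
                         replace=False)

curve_random = simulate_active_learning(X_pool, y_pool, X_test, y_test,
                                        init_random, n_queries=30)
curve_biased = simulate_active_learning(X_pool, y_pool, X_test, y_test,
                                        init_biased, n_queries=30)
```

A single run is noisy, so in an actual study the curves would typically be averaged over many random seeds before drawing conclusions about sensitivity to the initial sample.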

 

Supervisors

Primary Supervisor: Katharina Dost

 
Key qualifications and skills

- Background in data science, statistics, or machine learning

- Comfortable working in Python

- Prior knowledge of advanced machine-learning research topics is not required

- Passionate about research and data

 
Does the project come with funding?

No - Student must be self-funded

 

Final date for receiving applications

Ongoing

 
How to apply

Apply by email to the primary supervisor with a CV, transcript, and motivation letter.

 

Keywords

machine learning; AI; active learning
