Topic

Quantifying Initial-Sample Sensitivity in Active Learning

22 May 2026

Host Faculty: Engineering

General Subject Area: Machine Learning

Project Level: Master's

Overall Goal:

-----------------

To investigate how the initial choice of training data affects the performance and reliability of active learning methods, and to identify which methods are more robust when early data are biased or unrepresentative.

Details:

---------

Machine-learning models often start with only a small amount of labeled data, which is then expanded gradually as new data are collected. When the new data are selected strategically so that each addition informs the model as much as possible, the process is called “active learning”. In practice, the very first samples used for training can strongly influence both the active learning method and the resulting model. If these early samples are biased, incomplete, or unrepresentative, the model may perform poorly even after more data are added.

This project studies how sensitive different data-efficient learning strategies are to their initial training data. The student will systematically construct different types of “problematic” initial datasets (for example, datasets that miss certain regions of the feature space, over-represent common outcomes, or exclude rare but important cases) and measure how these choices affect learning performance.
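As a rough illustration of the three kinds of “problematic” initial datasets mentioned above, the following minimal sketch draws biased initial samples from a synthetic pool. The function names, the synthetic data, and all parameter values (`cutoff`, `share`, the pool size) are assumptions made for this example only; they are not part of the project description.

```python
import random

def uniform_initial(n_pool, k, rng):
    """Unbiased baseline: k pool indices drawn uniformly at random."""
    return rng.sample(range(n_pool), k)

def region_gap_initial(features, k, rng, dim=0, cutoff=0.5):
    """Miss a region of the feature space: only sample points whose
    feature `dim` lies below `cutoff`."""
    eligible = [i for i, x in enumerate(features) if x[dim] < cutoff]
    return rng.sample(eligible, k)

def skewed_initial(labels, k, rng, common_label=0, share=0.95):
    """Over-represent the common outcome: roughly `share` of the k
    samples carry `common_label`, the rest come from other labels."""
    common = [i for i, y in enumerate(labels) if y == common_label]
    other = [i for i, y in enumerate(labels) if y != common_label]
    n_common = round(k * share)
    return rng.sample(common, n_common) + rng.sample(other, k - n_common)

def rare_excluded_initial(labels, k, rng, rare_label=1):
    """Exclude rare cases entirely: never sample `rare_label`."""
    eligible = [i for i, y in enumerate(labels) if y != rare_label]
    return rng.sample(eligible, k)

# Hypothetical synthetic pool: 2-D features in the unit square,
# with a rare positive class in the top strip (about 10% of points).
rng = random.Random(0)
features = [(rng.random(), rng.random()) for _ in range(500)]
labels = [1 if x[1] > 0.9 else 0 for x in features]

baseline = uniform_initial(len(features), 20, rng)
region_gap = region_gap_initial(features, 20, rng)
skewed = skewed_initial(labels, 20, rng)
no_rare = rare_excluded_initial(labels, 20, rng)
```

Each function returns pool indices, so the same pool can be reused to compare a representative starting set against each biased one.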

Depending on the student’s interests, the project can focus on classification tasks, regression tasks (predicting continuous values), or both. We will start with a small number of datasets that are large enough to be informative but still manageable, and use them to simulate active learning cycles under different strategies for drawing the initial sample.
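A simulated active learning cycle of the kind described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the k-nearest-neighbour classifier, the uncertainty-sampling query rule, the synthetic data, and every name and parameter value are choices made for this example, not the methods or datasets the project will necessarily use.

```python
import random

def positive_fraction(x, labeled, k=5):
    """Fraction of positive labels among the k nearest labeled points."""
    nearest = sorted(
        labeled,
        key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2,
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)

def accuracy(labeled, test):
    """Share of test points classified correctly by the k-NN vote."""
    hits = sum(
        (positive_fraction(x, labeled) >= 0.5) == bool(y) for x, y in test
    )
    return hits / len(test)

def run_active_learning(pool, test, initial_idx, n_queries=20):
    """Simulate query cycles and return the accuracy after each one."""
    chosen = set(initial_idx)
    labeled = [pool[i] for i in initial_idx]
    unlabeled = [i for i in range(len(pool)) if i not in chosen]
    curve = [accuracy(labeled, test)]
    for _ in range(n_queries):
        # Uncertainty sampling: query the pool point whose vote
        # fraction is closest to 0.5 under the current model.
        query = min(
            unlabeled,
            key=lambda i: abs(positive_fraction(pool[i][0], labeled) - 0.5),
        )
        unlabeled.remove(query)
        labeled.append(pool[query])  # the stored label plays the oracle
        curve.append(accuracy(labeled, test))
    return curve

def make_data(n, rng):
    """Hypothetical synthetic task: label 1 above the diagonal x0 + x1 = 1."""
    points = [(rng.random(), rng.random()) for _ in range(n)]
    return [(x, int(x[0] + x[1] > 1.0)) for x in points]

rng = random.Random(0)
pool, test = make_data(300, rng), make_data(200, rng)

# Representative start versus a start that misses the region x0 >= 0.3.
representative = rng.sample(range(len(pool)), 10)
left_only = [i for i, (x, _) in enumerate(pool) if x[0] < 0.3]
biased = rng.sample(left_only, 10)

curve_representative = run_active_learning(pool, test, representative)
curve_biased = run_active_learning(pool, test, biased)
```

Comparing the two accuracy curves over many random seeds, rather than a single run, would give a first impression of how strongly a query strategy depends on its starting sample.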

Supervisors

Primary Supervisor: Katharina Dost

Key qualifications and skills

- Background in data science, statistics, or machine learning

- Comfortable working in Python

- Prior knowledge of advanced machine-learning research topics is not required

- Passionate about research and data

Does the project come with funding

No - Student must be self-funded

Final date for receiving applications

Ongoing

How to apply

Apply by email to the primary supervisor with a CV, transcript, and motivation letter.

Keywords

machine learning; AI; active learning

Katharina Dost

05 November 2025
