Master Theses Topics – 2025/26 – Machine Learning Group

The Machine Learning Group (MLG) proposes the following MA thesis topics for this academic year. NB: the number of topics is limited. If you are interested, please contact the supervisor as soon as possible.

Partial Domain Adaptation for Hand Gesture Recognition

Supervisors: Gianluca Bontempi, Martin Colot

This thesis focuses on implementing machine learning methods for recognizing hand gestures from electromyographic (EMG) signals in a cross-subject configuration. Unsupervised domain adaptation can mitigate the effects of subject-specific variations in EMG by transferring knowledge from labeled source domains to an unlabeled target domain (the test user). However, when the test user does not perform all the expected gestures, global domain adaptation can fail, because it tries to align the target data with source classes that are absent from it. This work will explore partial domain adaptation (PDA), a technique that improves adaptation when the target samples cover only a subset of the label space.
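
One ingredient found in several partial domain adaptation methods is to estimate which source classes are actually present in the target data and to down-weight the others during adaptation. The following is a minimal sketch of that idea, assuming a classifier trained on the source subjects already produces softmax probabilities for the unlabeled target samples. The function name and the toy numbers are hypothetical and are not part of the thesis specification.

```python
import numpy as np

def estimate_class_weights(target_probs):
    """Average the target softmax outputs per class and scale so the largest weight is 1."""
    target_probs = np.asarray(target_probs, dtype=float)
    mean_probs = target_probs.mean(axis=0)
    return mean_probs / mean_probs.max()

# Toy example (made-up values): 3 gestures, the test user never performs gesture 2.
target_probs = np.array([
    [0.90, 0.08, 0.02],
    [0.10, 0.85, 0.05],
    [0.80, 0.15, 0.05],
    [0.20, 0.75, 0.05],
])
class_weights = estimate_class_weights(target_probs)

# Each labeled source sample receives the weight of its class, so source
# samples from gestures absent in the target contribute little to adaptation.
source_labels = np.array([0, 1, 2, 2, 0])
sample_weights = class_weights[source_labels]
```

In this toy case the weight of gesture 2 is much smaller than the weights of gestures 0 and 1, so source samples of the missing gesture have little influence on the alignment.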

The student should be an expert in Python programming, registered in the MA module on Computational Intelligence, proficient in Machine Learning, and passionate about interdisciplinary applied research.

Fast variable selection without shrinkage

Supervisor: Maarten Jansen

Selecting an optimal model from a broad spectrum of non-nested models can be driven by a criterion that balances prediction accuracy on the training set against model complexity. Optimizing such a criterion over the number of selected variables is a combinatorial problem and is not feasible for high-dimensional data. The problem can be approximated by replacing the count of selected variables with the sum of the magnitudes of the estimates, which turns the combinatorial problem into a convex quadratic programming problem.

This thesis applies variable selection in sparse inverse problems, or in deblurring and denoising images, using gradient projection or iterative thresholding.
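
As an illustration of the iterative thresholding mentioned above, here is a minimal sketch of plain iterative soft thresholding (ISTA) for the relaxed problem, in which the sum of magnitudes replaces the count of selected variables. Note that soft thresholding does shrink the selected coefficients; how to select without shrinkage is the subject of the thesis and is not shown here. The simulated data, the penalty value `lam`, and the function names are assumptions made for the example.

```python
import numpy as np

def soft_threshold(x, t):
    """Move every entry of x towards zero by t; entries smaller than t become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, y, lam, n_iter=500):
    """Approximately minimize 0.5*||y - A x||^2 + lam*||x||_1 by iterative soft thresholding."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient of the quadratic term
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)                    # gradient of the data-fit term
        x = soft_threshold(x - grad / L, lam / L)   # gradient step followed by thresholding
    return x

# Simulated sparse problem (made-up data): 100 candidate variables, 50 observations.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100))
x_true = np.zeros(100)
x_true[[3, 17, 42]] = [2.0, -3.0, 1.5]
y = A @ x_true + 0.01 * rng.standard_normal(50)

x_hat = ista(A, y, lam=1.0)
selected = np.flatnonzero(np.abs(x_hat) > 1e-8)  # indices of the selected variables
```

The nonzero entries of `x_hat` define the selected model; the penalty `lam` controls how many variables survive the thresholding.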

Methods for omics data clustering

Supervisor: Matthieu Defrance

Clustering analysis is routinely performed on omics data to explore or discover underlying cell identities. The high dimensionality and significant sparsity of these data (with false zero count observations) make clustering computationally challenging. This project studies state-of-the-art techniques for omics data clustering, emphasizing neural network approaches for initial data embedding.

Reference: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-021-04210-8

Contact: Matthieu Defrance (matthieu.defrance@ulb.be)

⛔️NO LONGER AVAILABLE⛔️ Cloud-Native MLOps for Real-Time Digital Twins

Supervisors: Gianluca Bontempi, Gian Marco Paldino

Digital Twins (DTs) are virtual replicas of physical systems that are becoming essential in Industry 4.0 for monitoring, simulation, and optimization. Machine Learning Operations (MLOps) bridges the gap between laboratory ML models and robust, scalable, and maintainable models required for live DT environments. To achieve the scalability and real-time responsiveness required by modern DTs, MLOps is best implemented on a cloud platform.

This thesis will address the practical challenges of operationalizing ML models for Digital Twins using a cloud-native approach on a platform such as Amazon Web Services (AWS). The student will explore, design, and implement a complete, automated MLOps pipeline on AWS for a DT in either the renewable energy or the traffic simulation domain. The core of this thesis is not just to build a model, but to build the industrial-grade, cloud-native infrastructure around it, covering data ingestion (e.g. Kinesis, S3), CI/CD (e.g. CodePipeline), model training and deployment (e.g. Amazon SageMaker), and monitoring (e.g. CloudWatch).

The ideal candidate will be passionate about bridging the gap between academic research and real-world application. Required skills include expert proficiency in Python, a strong foundation in Machine Learning, and a keen interest in software engineering, automation, and cloud technologies (experience with Docker or AWS is a major plus). Registration in the MA computational intelligence module is required.

⛔️NO LONGER AVAILABLE⛔️ Machine Learning for Causal Discovery

Supervisors: Gianluca Bontempi and Gian Marco Paldino

This thesis focuses on designing and implementing machine learning methods for probability distribution classification to discover causal directionality from data.

The student should be an expert in R and Python programming, registered in the MA module on computational intelligence, proficient in Machine Learning, and passionate about interdisciplinary applied research.

References:

https://link.springer.com/article/10.1007/s10115-021-01621-0
CauseMe platform

⛔️NO LONGER AVAILABLE⛔️ Frugal Machine Learning for Sustainable Renewable Energy Systems

Supervisors: Gianluca Bontempi, Gian Marco Paldino

The increasing complexity of machine learning (ML) models has led to a significant rise in their computational and energy demands. This “Red AI” trend poses a challenge to the sustainable development of artificial intelligence. In response, the field of “Frugal Machine Learning” or “Green AI” has emerged, focusing on the creation of ML models that are not only accurate but also efficient in terms of computational resources, energy consumption, and carbon footprint. This is particularly relevant in the renewable energy sector, where ML is crucial for tasks like forecasting energy production and demand to ensure grid stability.

This thesis aims to explore the intersection of frugal machine learning and renewable energy. The core objective is to investigate and apply state-of-the-art techniques for measuring and reducing the carbon footprint of ML models used in the context of renewable energy forecasting. The student will conduct a comprehensive review of frugal ML methodologies and will practically apply and benchmark tools designed to estimate the CO2 emissions of computation, such as the Python package `CodeCarbon` and the `Green Algorithms calculator`. A key innovative aspect of this thesis will be to design and implement a scheduler that intelligently times the training of ML models to coincide with periods of peak renewable energy production, thereby minimizing the reliance on fossil fuel-based energy sources.
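
The scheduling idea can be illustrated with a small sketch: given an hourly forecast of the renewable share of electricity production and an estimated training duration, pick the start hour whose window has the highest average renewable share. The forecast values, the function name, and the assumption that training runs without interruption for a known number of hours are all hypothetical; a real scheduler would need actual forecast data and a way to launch the training job.

```python
def best_start_hour(renewable_share, duration_hours):
    """Return (start index, mean share) of the contiguous window with the highest renewable share."""
    if not 0 < duration_hours <= len(renewable_share):
        raise ValueError("duration must fit inside the forecast horizon")
    window_sum = sum(renewable_share[:duration_hours])
    best_start, best_sum = 0, window_sum
    for start in range(1, len(renewable_share) - duration_hours + 1):
        # Slide the window by one hour: add the newest hour, drop the oldest.
        window_sum += renewable_share[start + duration_hours - 1] - renewable_share[start - 1]
        if window_sum > best_sum:
            best_start, best_sum = start, window_sum
    return best_start, best_sum / duration_hours

# Made-up 24-hour forecast of the renewable share (fraction between 0 and 1), one value per hour.
forecast = [0.20, 0.18, 0.15, 0.15, 0.20, 0.30, 0.45, 0.60, 0.72, 0.80, 0.85, 0.88,
            0.90, 0.86, 0.78, 0.65, 0.50, 0.38, 0.30, 0.25, 0.22, 0.20, 0.20, 0.19]

# Training is assumed to take 3 hours.
start_hour, mean_share = best_start_hour(forecast, duration_hours=3)
```

The returned start hour could then be handed to any job scheduler, and a tool such as CodeCarbon could be used to compare the estimated emissions of scheduled and unscheduled runs.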

The student should have a strong background in Machine Learning and Python programming, an interest in interdisciplinary research, particularly in the application of AI to sustainability and energy systems, and be proactive and capable of working independently. Registration in the MA module on computational intelligence is preferred.
