CARLA is a simulator for self-driving cars. The (ambitious) goal of the MA thesis is to learn the utility function of a driver in order to inject it into a self-driving agent. This is to be done by observing the decisions of a driver and combining them with current and expected future scenarios. Notions of utility theory, prediction and decision making will be required, together with very good programming skills.
The student should be “highly motivated”, an expert in Python programming, interested in Data Science and the statistical aspects of Machine Learning, and registered at the Computational Intelligence module of the Master. The student is also expected to be familiar (or at least motivated to rapidly become familiar) with the Python Gym environment and the CARLA simulator.
TinyML is about the use of machine learning in embedded systems to implement intelligent functionalities in tiny devices.
The student should be an expert in Python programming, interested in Data Science and the statistical aspects of Machine Learning, and registered at the Computational Intelligence module of the Master. The student is also expected to be familiar (or at least motivated to rapidly become familiar) with the Arduino platform.
The student should be expert in Python programming, interested in Machine Learning and registered at the Computational Intelligence module of the Master. The student is expected to interact with MLG researchers working on applied machine learning projects in collaboration with companies.
Learning from imbalanced datasets is an important issue in a lot of practical classification tasks where one class of interest (e.g. fraud, anomaly) occurs at a much lower rate than the other (e.g. normal behaviour). This MA thesis will focus on the adoption of Generative Deep Learning to deal with pattern classification in business data (fraud detection, churn detection).
The student should be expert in Python programming, interested in Deep Learning, statistical aspects of Machine Learning and registered at the Computational Intelligence module of the Master. The student is expected to interact with MLG researchers working on applied machine learning projects in collaboration with companies.
This MA thesis will take place in the context of a collaboration between MLG and the Laboratory of Neurophysiology and Movement Biomechanics (LNMB). An electroencephalogram (EEG) uses multiple electrodes to measure the electrical activity of post-synaptic potentials of cortical neurons located at specific parts of the brain. LNMB is composed of several researchers who have developed solid expertise in EEG signal acquisition and analysis. Over the years they have acquired a large amount of EEG data from different domains (NASA astronauts in the ISS, hockey players from the national Belgian hockey team, tennis players from the Justine Henin Academy, children and adults with hyperactivity disorder…) and for various applications (brain-computer interfaces, human performance enhancement, diagnostic tools…).
The objective of the MA thesis is to work with cutting-edge technology and use state-of-the-art signal processing and Machine Learning techniques on EEG data.
The work will focus on i) exploring different EEG datasets, ii) extracting relevant features from the brain state that may not be directly visible with standard EEG analysis, and iii) deploying different classification models to reach or improve state-of-the-art results.
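As an illustration of the feature-extraction step, here is a minimal sketch of one common kind of EEG feature, spectral band power, computed with NumPy on a synthetic single-channel signal. The sampling rate, the band boundaries and the helper name `band_power` are assumptions chosen for the example; they are not taken from the LNMB datasets or from any specific toolbox.

```python
import numpy as np

def band_power(signal, fs, f_low, f_high):
    """Periodogram-based power of `signal` in the band [f_low, f_high) Hz.

    Hypothetical helper for illustration. The one-sided factor-2
    correction is omitted, so only relative values are meaningful.
    """
    n = signal.size
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Remove the mean so the DC component does not leak into low bands.
    psd = np.abs(np.fft.rfft(signal - signal.mean())) ** 2 / (fs * n)
    mask = (freqs >= f_low) & (freqs < f_high)
    return psd[mask].sum() * (freqs[1] - freqs[0])

# Synthetic "EEG": a 10 Hz oscillation plus white noise (assumed values).
fs = 256                      # sampling rate in Hz (assumption)
t = np.arange(0, 4, 1 / fs)   # 4 seconds of signal
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)

# Conventional frequency bands; exact boundaries vary between studies.
bands = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}
features = {name: band_power(x, fs, lo, hi) for name, (lo, hi) in bands.items()}
dominant_band = max(features, key=features.get)
```

On real recordings, such per-band values (one per channel and per band) could be stacked into a feature vector and passed to a classifier; artifact removal and epoching would typically come first.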
The student should be an expert in Python programming, registered at the MA module on computational intelligence, have a passion for interdisciplinary research and be available to visit the Erasme lab frequently.
– MNE: A. Gramfort, M. Luessi, E. Larson, D. Engemann, D. Strohmeier, C. Brodbeck, L. Parkkonen, M. Hämäläinen, MNE software for processing MEG and EEG data, NeuroImage, Volume 86, 1 February 2014, Pages 446-460, ISSN 1053-8119
– EEGLAB: A. Delorme & S. Makeig (2004) EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics, Journal of Neuroscience Methods 134:9-21
PS. For students who are interested, the LNMB also offers the possibility of an internship compatible with the TRAN-F-501 course available in the MA Computer Science curriculum.
The selection of an optimal model from a broad spectrum of non-nested models can be driven by a criterion that balances a good prediction of the training set against the complexity of the model, that is, the number of selected variables. Optimization over the number of variables, or even comparison of models with a given number of variables, is a problem of combinatorial complexity, and thus not feasible in the context of high-dimensional data. Part of the problem can be well approximated by replacing the number of selected variables in the criterion with the sum of the absolute values of the estimates of these variables within the selected model. The counting measure is replaced by a sum of magnitudes, thus changing a combinatorial problem into a convex quadratic programming problem. This problem can be solved by a wide range of algorithms, including direct methods, such as least angle regression, or iterative methods, such as iterative thresholding or gradient projection. Moreover, for a fixed value of model complexity, the relaxed problem selects approximately the same model as the original combinatorial one. This is no longer the case when the model complexity is part of the optimization problem, but a correction for the divergence between the combinatorial and quadratic problems can be established. The thesis is about the application of variable selection in sparse inverse problems, or in deblurring and denoising images, using gradient projection or iterative thresholding.
Feature selection is a crucial step in any machine learning pipeline. However, most feature selection methods do not attempt to uncover causal relationships between features and target and focus instead on making the best predictions. The MA thesis will focus on:
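The relaxation described above, where the count of selected variables is replaced by the sum of their absolute values, can be illustrated with a minimal sketch of iterative soft thresholding (often called ISTA) on a small synthetic sparse inverse problem. The names `A`, `y` and `lam`, the problem sizes and the synthetic data are assumptions made for this example only; the sketch is not the specific method or correction studied in the thesis.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v towards zero by t (proximal map of t*||.||_1)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(A, y, x, lam):
    """Relaxed criterion: 0.5*||A x - y||^2 + lam*||x||_1."""
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))

def ista(A, y, lam, n_iter=2000):
    """Iterative soft thresholding with constant step size 1/L."""
    L = np.linalg.norm(A, 2) ** 2   # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)    # gradient of the quadratic data-fit term
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Synthetic problem (assumed sizes): 50 observations, 200 variables, 5 active.
rng = np.random.default_rng(0)
n, p = 50, 200
A = rng.standard_normal((n, p)) / np.sqrt(n)
x_true = np.zeros(p)
x_true[:5] = [3.0, -2.0, 4.0, -3.0, 2.5]
y = A @ x_true + 0.01 * rng.standard_normal(n)

lam = 0.1
x_hat = ista(A, y, lam)
selected = np.flatnonzero(x_hat)    # indices of the selected variables
```

The thresholding step sets small coefficients exactly to zero, which is what turns the estimate into a variable selection. Larger values of `lam` give sparser models; choosing `lam` is the model-complexity question the paragraph above refers to.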
⋅ A review and comparative assessment of existing causal feature selection algorithms, including the methods developed at MLG
⋅ The design of a validation strategy of those techniques on real datasets (e.g. ChaLearn competition datasets, other datasets)
The student should be interested in statistical aspects of Machine Learning and registered at the Computational Intelligence module of the Master.
Biomarkers (e.g. epigenetic, expression) can be used to monitor alterations that are occurring at the cellular level in a given organism. One challenging task is to identify a restricted set of markers (e.g. genes) that allow an accurate estimation of the monitored properties. The main objective of this project is to evaluate the influence of noise and missing measurements on the prediction accuracy. To that aim, next generation sequencing data (RNA-seq, RRBS) will be used to explore real case settings.
High-throughput sequencing and genome-wide analyses have profoundly impacted the genetic diagnosis of rare diseases. Besides classical genetic variant calling, which targets alterations of the DNA sequence itself, a new field of methods based on epigenetic (at the DNA level) or transcriptomic (at the RNA level) alterations has emerged. The objective of the project is to develop and evaluate supervised classification methods applied to the classification of rare diseases.
Reference: Erfan Aref-Eshghi et al. Evaluation of DNA Methylation Episignatures for Diagnosis and Phenotype Correlations in 42 Mendelian Neurodevelopmental Disorders. The American Journal of Human Genetics, Volume 106, Issue 3, 2020.