The Machine Learning Group offers about ten thesis topics for master's students for the 2016/2017 academic year. Application domains include high-performance computing, bioinformatics, sensor networks, artificial evolution, computer-assisted medicine, artificial proteins, and network dynamics.

**NB: The number of topics is limited. Interested students are asked to come forward as early as possible.**

**1. Machine learning on big data (Gianluca Bontempi, Catharina Olsen, Yann-Aël Le Borgne)**

**4. Big data stream management and performance optimization (Gianluca Bontempi, Fabrizio Carcillo, Yann-Aël Le Borgne)**

**5. Machine learning on multi-omics data integration (Antonio Colaprico, Gianluca Bontempi)**

**6. Modeling strategic behavior in multi-agent systems using Deep Learning (Tom Lenaerts, Elias Fernandez)**

**7. Data mining of geographical mobility data (Gianluca Bontempi, Yann-Aël Le Borgne)**

**8. Learning Dynamics in Social Dilemmas that arise in Multi-Agent Computer Games (Tom Lenaerts, Elias Fernandez)**

The collection of gigantic datasets in several domains (e.g. social networks, finance, internet) and the need to extract useful information from them call for the development of new and effective techniques to store and mine very large data structures. The Master thesis will focus on methods to scale up and parallelize machine learning algorithms in order to deal effectively with very large and distributed databases (e.g. Hadoop, Spark MLlib). The objective of the thesis is to design and set up a running distributed system (based on existing open-source solutions) to store and analyze huge datasets.
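As a rough illustration of what "scaling up" a learning algorithm can mean, the sketch below fits a straight line in a map/reduce style: each data partition is summarised independently, and only the small summaries are combined. It is plain Python rather than Hadoop or Spark code, and the function names (`map_partition`, `reduce_stats`, `fit_line`) and the toy data are assumptions made for the example.

```python
from functools import reduce


def map_partition(rows):
    """Map step: summarise one partition by its sufficient statistics."""
    n = sx = sy = sxx = sxy = 0.0
    for x, y in rows:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    return (n, sx, sy, sxx, sxy)


def reduce_stats(a, b):
    """Reduce step: statistics from two partitions simply add up."""
    return tuple(u + v for u, v in zip(a, b))


def fit_line(partitions):
    """Least-squares slope and intercept from the combined statistics."""
    n, sx, sy, sxx, sxy = reduce(reduce_stats, map(map_partition, partitions))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Toy data on the line y = 2x + 1, split across three hypothetical partitions.
data = [(float(x), 2.0 * x + 1.0) for x in range(12)]
partitions = [data[0:4], data[4:8], data[8:12]]
slope, intercept = fit_line(partitions)
```

Because the map step never needs to see more than one partition, the same decomposition could in principle be run on separate machines; a real system would additionally have to handle storage, scheduling and fault tolerance.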

Required skills: machine learning, computational statistics, programming.

Useful links:

- Mahout and Apache
- Machine learning and Hadoop
- Machine learning and Hadoop guide
- R and Hadoop
- R and Hadoop tutorial
- R, Apache and Hadoop
- Spark

Multiscale or multiresolution analysis is a technique for the analysis and processing of data in a telescopic way. That means that the data is decomposed into a representation that separates global, large-scale features from small-scale details, with a broad spectrum in between. In that sense, multiscale analysis is related to a frequency (Fourier) analysis (with slowly and rapidly oscillating components), but, unlike a Fourier transform, a multiscale analysis keeps information on the location in the original time or space domain.

The best-known example of a multiscale analysis is a wavelet decomposition. Wavelets are particularly popular in image processing, for instance in the JPEG compression standard. This thesis investigates the use of another algorithm for multiresolution analysis, known as a Laplacian pyramid. This Laplacian pyramid is a slightly overcomplete transform, meaning that it maps n data onto 2n coefficients in the multiscale representation. It can be implemented as an overcomplete version of a lifting scheme, which is a fast implementation of the wavelet transform.

In this thesis, the Laplacian pyramid is equipped with a local polynomial smoothing technique, popular in statistics. The objective is to investigate the properties of a Laplacian pyramid with local polynomial smoothing in applications of image processing (denoising, compression).
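To make the overcompleteness concrete, here is a minimal one-dimensional sketch of a Laplacian pyramid. It makes several simplifying assumptions: pairwise averaging stands in for the local polynomial smoother, the prediction step just repeats each coarse value, and the signal length is assumed divisible by 2 for every level. All function names are illustrative.

```python
def reduce_level(signal):
    """Coarsen by averaging neighbouring pairs (stand-in for a smoother)."""
    return [(signal[i] + signal[i + 1]) / 2.0
            for i in range(0, len(signal) - 1, 2)]


def expand_level(coarse, length):
    """Predict the fine scale by repeating each coarse value twice."""
    fine = []
    for value in coarse:
        fine.extend([value, value])
    return fine[:length]


def laplacian_pyramid(signal, levels):
    """Return the detail coefficients per level and the final coarse scale."""
    details = []
    current = list(signal)
    for _ in range(levels):
        coarse = reduce_level(current)
        prediction = expand_level(coarse, len(current))
        # Details are the prediction errors at the current resolution.
        details.append([s - p for s, p in zip(current, prediction)])
        current = coarse
    return details, current


def reconstruct(details, coarse):
    """Invert the pyramid: add each detail level back onto its prediction."""
    current = list(coarse)
    for detail in reversed(details):
        prediction = expand_level(current, len(detail))
        current = [d + p for d, p in zip(detail, prediction)]
    return current


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
details, coarse = laplacian_pyramid(signal, levels=3)
restored = reconstruct(details, coarse)
n_coeffs = sum(len(d) for d in details) + len(coarse)
```

For this signal of n = 8 samples the pyramid stores 8 + 4 + 2 detail coefficients plus 1 coarse value, so just under 2n numbers in total, and reconstruction is exact whatever smoother is used in `reduce_level`.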

The selection of an optimal model from a broad spectrum of non-nested models can be driven by a criterion that balances a good prediction of the training set against the complexity of the model, that is, the number of selected variables. Optimization over the number of variables, or even comparison of models with a given number of variables, is a problem of combinatorial complexity, and thus not feasible in the context of high-dimensional data. Part of the problem can be well approximated by replacing the number of selected variables in the criterion with the sum of absolute values of the estimators of these variables within the selected model. The counting measure is replaced by a sum of magnitudes, thus changing a combinatorial problem into a convex quadratic programming problem. This problem can be solved by a wide range of algorithms, including direct methods, such as least angle regression, or iterative methods, such as iterative thresholding or gradient projection. Moreover, for a fixed value of model complexity, the relaxed problem selects approximately the same model as the original combinatorial one. This is no longer the case when the model complexity is part of the optimization problem, but a correction for the divergence between the combinatorial and quadratic problem can be established. The thesis is about the application of variable selection in sparse inverse problems, or in deblurring and denoising images, using gradient projection or iterative thresholding.
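The iterative thresholding mentioned above can be sketched in a few lines. The example below assumes the relaxed criterion takes the usual form 0.5·‖y − Xβ‖² + λ·Σ|βⱼ|; the symbols X, y, β and λ (`lam`), the step-size choice and the toy data are assumptions made for illustration, not the thesis's own formulation.

```python
def soft_threshold(value, threshold):
    """Shrink a value towards zero; small values become exactly zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(X, y, lam, iterations=500):
    """Iterative soft-thresholding for 0.5*||y - X b||^2 + lam*sum|b_j|."""
    n_rows, n_cols = len(X), len(X[0])
    # The squared Frobenius norm bounds the largest eigenvalue of X^T X,
    # so its inverse is a safe (if conservative) step size.
    step = 1.0 / sum(v * v for row in X for v in row)
    beta = [0.0] * n_cols
    for _ in range(iterations):
        residual = [sum(X[i][j] * beta[j] for j in range(n_cols)) - y[i]
                    for i in range(n_rows)]
        gradient = [sum(X[i][j] * residual[i] for i in range(n_rows))
                    for j in range(n_cols)]
        beta = [soft_threshold(beta[j] - step * gradient[j], step * lam)
                for j in range(n_cols)]
    return beta


# With an orthonormal design the solution is known in closed form:
# each observation is soft-thresholded at lam, which lets us check the result.
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, 0.2, -2.0]
beta = ista(X, y, lam=0.5)
```

The small observation (0.2) is set exactly to zero while the large ones are shrunk by λ, which is how the sum of magnitudes acts as a proxy for counting selected variables.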

Within the scope of the BruFence project (http://mlg.ulb.ac.be/BruFence), we are looking for a master student to work for their thesis on the topic of “Transactional stream management and performance optimization”. The aim of the BruFence project is to design systems based on machine learning and big data mining techniques that allow sensitive and secure systems to automatically detect fraud in large volumes of transactions (research sponsored by INNOVIRIS in collaboration with Worldline and Nviso). The student will be involved in the development of the stream processing system and will be required to develop the application in Flume, Spark Streaming and Cassandra. Knowledge of Unix systems and basic notions of computer programming are required; knowledge of Hadoop, Spark, Java or Scala is a definite plus. Please note that more and more companies are recruiting people skilled in Big Data technologies.

Contacts: Gianluca Bontempi (gbonte@ulb.ac.be) and Fabrizio Carcillo

Useful links:

- https://flume.apache.org/
- http://cassandra.apache.org/
- https://spark.apache.org/streaming/
- https://www.edx.org/course/introduction-big-data-apache-spark-uc-berkeleyx-cs100-1x
- http://shop.oreilly.com/product/0636920028512.do

Recent results obtained in multi-omics data integration strongly suggest that combining different levels of molecular information is a powerful tool for characterizing cancer. Among the already proposed integrative methodologies, networks proved to be very effective given their system-level modelling of disease mechanisms. Nevertheless, further effort is still needed to find the optimal combination of data types, able to maximize our knowledge about disease onset and progression without being redundant.

The objective of the thesis is to design and set up an R/Bioconductor package for the integration of multi-omics cancer data, able to analyze huge datasets by means of machine learning techniques.

The master student can re-use and improve the vignette of our new R/Bioconductor package TCGAbiolinks.

See the link below for the vignette, which includes some case studies and examples.

TCGAbiolinks offers bioinformatics solutions by using a guided workflow to allow users to query, download and perform integrative analyses of TCGA data. It combines methods from computer science and statistics into the pipeline and incorporates methodologies developed in previous TCGA marker studies and in our own group.

TCGAbiolinks downstream analysis can be divided into 1) supervised analysis, comprising differential expression analysis, enrichment analysis, and master regulator analysis, or 2) unsupervised analysis, comprising inference of gene regulatory networks, clustering, classification, ROC, AUC, feature selection, and survival analysis.

Required skills: Machine learning, statistical analysis, programming skills (R is an advantage), passion for interdisciplinary research.

Useful links:

https://www.bioconductor.org/packages/release/bioc/vignettes/TCGAbiolinks/inst/doc/tcgaBiolinks.html

“Game theory provides a powerful framework for the design and analysis of multi-agent systems that involve strategic interactions” [1]. However, the traditional game theoretic approach that assumes perfect rationality and selfish individuals doesn’t perform well in many situations (e.g. when there are time constraints, the environment is not fully observable or agents are less sophisticated). Behavioral game theory has proposed several ways to overcome these constraints by substituting rationality with predictive models of human behavior that incorporate insights from cognitive psychology [2]. Yet, these models also suffer from a lack of flexibility and fail to make accurate predictions of human behavior in many scenarios. Nonetheless, the recent advances in Machine Learning, and in particular Deep Learning (DL) [3]–[5], highlight their importance and validity as strong predictive modeling tools. Moreover, the combination of DL with Reinforcement Learning (RL) has already proven effective in strategic gaming [6], as well as in the prediction of human behavior from experiments [1], improving over the previous state of the art.

The flexibility and power of neural networks, and their capacity for abstraction and for creating novel representations within the different layers of a deep network, call for optimism that this approach will help advance Game Theory and vice versa. Nevertheless, there are still several open questions. For instance, in [1], even though the model achieves very good predictive capabilities, it seems to overfit the data once it includes more complex iterative reasoning. This type of reasoning is associated with humans’ capacity to anticipate and to reflect on the consequences of their actions, which is why it is important to study how it can be integrated into a DL approach.

Goal: In order to specifically analyze the effects of iterative reasoning on the prediction of human behavior, we propose that, as a first step, the student apply the state of the art presented in [1] to the Anticipation Game [7], a modified version of the Dictator Game, particularly designed to reflect the influence of reputation and anticipation on human behavior. In this gift-giving game, two groups of players are matched pairwise: dictators and receivers. At the beginning of the game, receivers must decide whether to play with a given dictator based on her history of actions. In case of acceptance, the dictator receives a certain endowment and has to share it with the receiver however she wants. Otherwise, both players receive zero payoff. Here, the capacity to anticipate the effects of her actions is crucial for the dictator.
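The rules of the game described above can be written down as a small simulation. This is only a sketch of the payoff structure: the receiver's threshold rule, the dictator's fixed-fraction rule, the endowment of 10 and the number of rounds are all hypothetical choices made for illustration and are not taken from [7].

```python
def play_round(endowment, history, accept_rule, share_rule):
    """One round: returns (dictator payoff, receiver payoff, new history)."""
    if not accept_rule(history):
        return 0.0, 0.0, history  # rejection: both players get zero
    donation = share_rule(endowment, history)
    # The history records the fraction of the endowment that was shared.
    return endowment - donation, donation, history + [donation / endowment]


def threshold_receiver(min_share):
    """Hypothetical receiver: accepts if past average share is high enough."""
    def accept(history):
        return not history or sum(history) / len(history) >= min_share
    return accept


def fixed_dictator(fraction):
    """Hypothetical dictator: always shares the same fraction."""
    def share(endowment, history):
        return fraction * endowment
    return share


def total_payoffs(rounds, endowment, accept_rule, share_rule):
    """Accumulate both players' payoffs over repeated rounds."""
    history = []
    dictator_total = receiver_total = 0.0
    for _ in range(rounds):
        d, r, history = play_round(endowment, history, accept_rule, share_rule)
        dictator_total += d
        receiver_total += r
    return dictator_total, receiver_total


receiver = threshold_receiver(min_share=0.3)
generous = total_payoffs(5, 10.0, receiver, fixed_dictator(0.5))
selfish = total_payoffs(5, 10.0, receiver, fixed_dictator(0.0))
```

Under these assumed rules the selfish dictator keeps everything once and is then rejected, while the generous one keeps being accepted and ends up with the larger total, which is the kind of anticipation effect the game is designed to expose.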

Keywords: Deep Learning; Predictive Modelling; Game Theory; Behavioral Game Theory; Anticipation Game; Iterative Reasoning.

References:

[1] J. Hartford, J. R. Wright, and K. Leyton-Brown, “Deep Learning for Predicting Human Strategic Behavior,” in *Advances in Neural Information Processing Systems (NIPS)*, pp. 1–9, 2016.

[2] C. Camerer, “Behavioral Game Theory: Experiments in Strategic Interaction,” *Insights Decis. Mak. A Tribut. to Hillel J. Einhorn*, p. 544, 2003.

[3] M. Hausknecht and P. Stone, “Deep Recurrent Q-Learning for Partially Observable MDPs,” *arXiv preprint arXiv:1507.06527*, 2015.

[4] A. A. Rusu, S. Gomez Colmenarejo, C. Gulcehre, G. Desjardins, J. Kirkpatrick, R. Pascanu, V. Mnih, K. Kavukcuoglu, and R. Hadsell, “Policy Distillation,” *arXiv*, pp. 1–12, 2015.

[5] H. van Hasselt, A. Guez, and D. Silver, “Deep Reinforcement Learning with Double Q-learning,” *arXiv preprint arXiv:1509.06461 [cs]*, 2015.

[6] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of Go with deep neural networks and tree search,” *Nature*, vol. 529, no. 7587, pp. 484–489, 2016.

[7] I. Zisis, S. Di Guida, T. A. Han, G. Kirchsteiger, and T. Lenaerts, “Generosity motivated by acceptance - evolutionary analysis of an anticipation game,” *Sci. Rep.*, vol. 5, p. 18076, 2015.

Mobility is an aspect of growing relevance in our daily lives. It acts as 'the economy's backbone' by supporting other sectors throughout the economic system. Several studies, e.g. the IBM Smarter Cities study in Brussels, have shown that Brussels is lagging behind other capital cities [1]. In this context, the goals of this Master thesis will be (i) the study of spatio-temporal statistical and machine learning techniques for the analysis of geographical data [3][5], (ii) the survey (in collaboration with a BA3 student) of mobility data and mobility indicators available as Open Data in the Brussels Region (for example from the SNCB, STIB, Villo, etc.) and (iii) a comparison with Amsterdam, which is recognised as having one of the best Open Data programmes for transport and mobility [2]. Through this topic, the student will have the opportunity to gain skills in data processing, machine learning, and interactive Web development using Shiny/R. The student will also have opportunities to collaborate with the MOBI team of the VUB in the context of the MOBI-AID (Brussels MOBIlity Advanced Indicators Dashboard [4]) research project.

References:

[1] Brussels-Capital Region, Belgium. Smarter Cities Challenge Report - IBM, 2015. https://www.ibm.com/multimedia/portal/V837502Y37964J52/50224_SCC_Brussels_Report_LR.pdf

[2] European Data Portal. Analytical Report 4: Open Data in Cities, 2016. http://www.europeandataportal.eu/sites/default/files/edp_analytical_report_n4_-_open_data_in_cities_v1.0_final.pdf

[3] http://www.sciencedirect.com/science/article/pii/S1877042811014388

[4] http://mlg.ulb.ac.be/node/810

Cooperation, collaboration and trust are some of the elements that characterize social relations. Their presence in human societies goes as far back as the appearance of the first groups of humans: cooperation and coordination were essential to hunt animals that were bigger and stronger than any single human. However, discord and dishonesty very often arise in social interactions, as the individual temptation to cheat (defect) might seem to exceed the social benefit of cooperation. In game theory, social dilemmas are used to represent this conflict between individual and collective interests.

One of the most common examples of social dilemmas is the Prisoner’s Dilemma (PD), which can be characterized as a mixed-motive two-person game with two choices, defect or cooperate [1], in which the payoff of each player can be defined as a 2x2 matrix. One of the main particularities of this game is the fact that its only pure Nash equilibrium is mutual defection (cite). For this reason, extensive literature has focused on trying to find and understand under which conditions and mechanisms cooperation may emerge [2]–[7]. In this project, however, we shift the focus towards the equilibria that arise from imperfect adaptive players that have the capacity to learn. In other words, we want to understand how learning dynamics affect the outcome of social dilemmas.
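The statement about the Nash equilibrium can be checked mechanically by enumerating the four action pairs. The payoff values below (5, 3, 1, 0) are conventional textbook numbers chosen for illustration; any values with the same ordering would give the same result.

```python
from itertools import product

C, D = 0, 1  # cooperate, defect

# (row action, column action) -> (row payoff, column payoff)
PAYOFFS = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}


def pure_nash_equilibria(payoffs, actions=(C, D)):
    """List action pairs where neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        row_ok = all(payoffs[(a, b)][0] >= payoffs[(alt, b)][0]
                     for alt in actions)
        col_ok = all(payoffs[(a, b)][1] >= payoffs[(a, alt)][1]
                     for alt in actions)
        if row_ok and col_ok:
            equilibria.append((a, b))
    return equilibria


equilibria = pure_nash_equilibria(PAYOFFS)
```

Here `equilibria` contains only `(D, D)`, even though both players would earn more under `(C, C)`; that gap between the individually stable outcome and the collectively better one is what makes the game a dilemma.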

Up to now, most studies of the dynamics of learning in social dilemmas have analyzed basic reinforcement learning models [1], [8]–[11]. However, the recent achievements of reinforcement learning used in combination with Deep Learning [12], [13] invite new studies to analyze the effects of these new algorithms on social interactions. Moreover, the capacity of these new models to perform correctly in complex environments, like multi-player computer games, allows us to place our study in more realistic scenarios. In [14], the authors propose a framework that maps temporally-extended Markov games to social dilemmas, so that situations of collective conflict in multi-agent games can be analyzed from the perspective of matrix games like the PD. Afterwards, they study the learning dynamics of agents that use deep reinforcement learning to learn and make decisions. However, the agents don’t implement any type of recursive reasoning about one another’s learning [15]–[17], a characteristic that has been commonly attributed to humans. Hence, we here raise the question of how iterative reasoning and anticipation might affect the different equilibria that arise from the agents’ interactions.

Goal: The main goal of this project is to understand what social consequences emerge when artificial agents make decisions based on particular learning rules [14]. Therefore, we propose that the student first reproduce the results of [14] and then compare how they differ from a model that uses iterative reasoning or anticipation to plan its actions based on an inferred knowledge of the opponent.
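As a point of reference for the "basic reinforcement learning models" mentioned above, the sketch below lets two stateless, epsilon-greedy Q-learners play a repeated Prisoner's Dilemma. It is a deliberately simple stand-in, not the deep RL setup of [14]; the payoff values, learning rate, exploration rate and seed are assumptions chosen for illustration.

```python
import random

# (action of player 1, action of player 2) -> (payoff 1, payoff 2)
PAYOFF = {
    ("C", "C"): (3.0, 3.0),
    ("C", "D"): (0.0, 5.0),
    ("D", "C"): (5.0, 0.0),
    ("D", "D"): (1.0, 1.0),
}


def choose(q, epsilon, rng):
    """Epsilon-greedy choice; ties are broken in favour of defection."""
    if rng.random() < epsilon:
        return rng.choice(["C", "D"])
    return "C" if q["C"] > q["D"] else "D"


def simulate(rounds=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Run two independent learners and report their defection rate."""
    rng = random.Random(seed)
    q1 = {"C": 0.0, "D": 0.0}
    q2 = {"C": 0.0, "D": 0.0}
    defections = 0
    for _ in range(rounds):
        a1 = choose(q1, epsilon, rng)
        a2 = choose(q2, epsilon, rng)
        r1, r2 = PAYOFF[(a1, a2)]
        # Stateless Q-update: move the estimate towards the observed payoff.
        q1[a1] += alpha * (r1 - q1[a1])
        q2[a2] += alpha * (r2 - q2[a2])
        defections += (a1 == "D") + (a2 == "D")
    return q1, q2, defections / (2.0 * rounds)


q1, q2, defection_rate = simulate()
```

Neither learner models the other's learning here, which is precisely the limitation the project proposes to address; comparing `defection_rate` across learning rules is one simple way to quantify how the rule shapes the outcome.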

Keywords: Game Theory; Social Dilemmas; Prisoner’s Dilemma; Multi-agent Games; Deep Reinforcement Learning; Predictive Modelling; Iterative reasoning; Anticipation.

References:

[1] M. W. Macy and A. Flache, “Learning dynamics in social dilemmas.,” *Proc. Natl. Acad. Sci. U. S. A.*, vol. 99 Suppl 3, pp. 7229–36, 2002.

[2] F. C. Santos and J. M. Pacheco, “Scale-free networks provide a unifying framework for the emergence of cooperation,” *Phys. Rev. Lett.*, vol. 95, no. 9, pp. 1–4, 2005.

[3] F. C. Santos, F. L. Pinheiro, T. Lenaerts, and J. M. Pacheco, “The role of diversity in the evolution of cooperation,” *J. Theor. Biol.*, vol. 299, pp. 88–96, 2012.

[4] J. H. Fowler, “Altruistic punishment and the origin of cooperation.,” *Proc. Natl. Acad. Sci. U. S. A.*, vol. 102, no. 19, pp. 7047–9, 2005.

[5] Z. G. Epstein, A. Peysakhovich, and D. G. Rand, “The good, the bad, and the unflinchingly selfish: Cooperative decision-making can be predicted with high accuracy using only three behavioral types,” pp. 1–11, 2016.

[6] F. C. Santos, J. M. Pacheco, and T. Lenaerts, “Evolutionary dynamics of social dilemmas in structured heterogeneous populations,” *Proc. Natl. Acad. Sci. U. S. A.*, vol. 103, no. 9, pp. 3490–3494, 2006.

[7] W. H. Press and F. J. Dyson, “Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent.,” *Proc. Natl. Acad. Sci. U. S. A.*, vol. 109, no. 26, pp. 10409–13, 2012.

[8] L. R. Izquierdo and S. S. Izquierdo, “Dynamics of the Bush-Mosteller Learning Algorithm in 2x2 Games,” *Reinf. Learn. Theory Appl.*, vol. 5, no. January, pp. 199–224, 2008.

[9] R. Bush and F. Mosteller, “A stochastic model with applications to learning,” *Ann. Math. Stat.*, pp. 559–585, 1953.

[10] G. Cimini and A. Sánchez, “Learning dynamics explains human behaviour in prisoner’s dilemma on networks.,” *J. R. Soc. Interface*, vol. 11, no. 94, p. 20131186, 2014.

[11] T. Ezaki, Y. Horita, M. Takezawa, and N. Masuda, “Reinforcement Learning Explains Cooperation and Its Moody Cousin,” *PLoS Comput. Biol.*, no. July 20, pp. 1–13, 2016.

[12] D. Silver *et al.*, “Mastering the game of Go with deep neural networks and tree search,” *Nature*, vol. 529, no. 7587, pp. 484–489, 2016.

[13] V. Mnih *et al.*, “Playing Atari with Deep Reinforcement Learning,” *arXiv Prepr. arXiv …*, pp. 1–9, 2013.

[14] J. Z. Leibo, V. Zambaldi, M. Lanctot, J. Marecki, and T. Graepel (DeepMind), “Multi-agent Reinforcement Learning in Sequential Social Dilemmas,” pp. 372–387, 2017.

[15] E. F. Domingos, J. C. Burguillo-Rial, and T. Lenaerts, “Reactive Versus Anticipative Decision-Making in a Novel Gift-Giving Game,” in *31st AAAI Conference on Artificial Intelligence*, 2017.

[16] J. Heinrich, M. Lanctot, and D. Silver, “Fictitious Self-Play in Extensive-Form Games,” vol. 37, 2015.

[17] R. Rosen, “Anticipatory Systems,” *IFSR Int. Ser. Syst. Sci. Eng.*, vol. 1, pp. 313–370, 2012.
