All publications
An overview from Di-fusion. Find our current PhD thesis subjects here.
2018 |
Carcillo, Fabrizio Beyond Supervised Learning in Credit Card Fraud Detection: A Dive into Semi-supervised and Distributed Learning PhD Thesis 2018, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/272119b, title = {Beyond Supervised Learning in Credit Card Fraud Detection: A Dive into Semi-supervised and Distributed Learning}, author = {Fabrizio Carcillo}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/272119/5/ContratDiCarcillo.pdf}, year = {2018}, date = {2018-01-01}, abstract = {The expansion of electronic commerce, together with customers' increasing confidence in electronic payments, makes fraud detection a critical issue. The design of a prompt and accurate Fraud Detection System is a priority for many organizations in the credit card business. In this thesis we present a series of studies to increase the precision and speed of fraud detection systems. The thesis has three main contributions. The first concerns the integration of unsupervised techniques and supervised classifiers. We proposed several approaches to integrate outlier scores in the detection process and found that the accuracy of a conventional classifier may be improved when information about the input distribution is used to augment the training set. The second contribution concerns the role of active learning in Fraud Detection. We extensively compared several state-of-the-art techniques and found that Stochastic Semi-supervised Learning is a convenient approach to tackle the Selection Bias problem in the active learning process. The third contribution of the thesis is the design, implementation and assessment of SCARFF, an original framework for near real-time Streaming Fraud Detection. This framework integrates Big Data technology (notably tools like Kafka, Spark and Cassandra) with a machine learning approach to deal with imbalance, non-stationarity and feedback latency in a scalable manner. 
Experimental results on a massive dataset of real credit card transactions have shown that our framework is scalable, efficient and accurate over a big stream of transactions.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Mon Père, Nathaniel; Lenaerts, Tom; Pacheco, Jorge M J M; Dingli, David Evolutionary Dynamics of Paroxysmal Nocturnal Hemoglobinuria Journal Article In: PLoS computational biology, 2018, (Language of publication: en). @article{info:hdl:2013/267360, title = {Evolutionary Dynamics of Paroxysmal Nocturnal Hemoglobinuria}, author = {Nathaniel Mon P{\`e}re and Tom Lenaerts and Jorge M J M Pacheco and David Dingli}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/267360/3/MonPereEtAlPLoSCB.docx}, year = {2018}, date = {2018-01-01}, journal = {PLoS computational biology}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Bizet, Martin Bioinformatic inference of a prognostic epigenetic signature of immunity in breast cancers PhD Thesis 2018, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/265092b, title = {Bioinformatic inference of a prognostic epigenetic signature of immunity in breast cancers}, author = {Martin Bizet}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/265092/7/ContratDiBizet.pdf}, year = {2018}, date = {2018-01-01}, abstract = {Alteration of epigenetic marks is increasingly recognized as a fundamental characteristic of cancers. In this thesis, we used DNA methylation profiles to improve the classification of breast cancer patients through a machine learning approach. The long-term objective is the development of clinical tools for personalized medicine. The DNA methylation data were acquired with a DNA microarray dedicated to methylation, called Infinium. This technology is recent compared with, for example, gene expression microarrays, and its preprocessing is not yet standardized. The first part of this thesis was therefore devoted to evaluating normalization methods by comparing the normalized data with other technologies (pyrosequencing and RRBS) for the two most recent Infinium technologies (450k and 850k). We also evaluated the coverage of biologically relevant regions (promoters and enhancers) by the two technologies. Next, we used the (correctly preprocessed) Infinium data to develop a score, called the MeTIL score, which has prognostic and predictive value in breast cancers. We took advantage of the ability of DNA methylation to reflect cell composition to extract a methylation signature (that is, a set of DNA positions where methylation varies) that reflects the presence of lymphocytes in the tumor sample. After selecting sites with lymphocyte-specific methylation, we developed a machine learning approach to obtain a signature of optimal size, reduced to five sites, potentially allowing clinical use. After converting this signature into a score, we showed its specificity for lymphocytes using external data and computer simulations. We then showed the ability of the MeTIL score to predict the response to chemotherapy, as well as its prognostic power in independent breast cancer cohorts and even in other cancers.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Raimondi, Daniele; Orlando, Gabriele; Tabaro, Francesco; Lenaerts, Tom; Rooman, Marianne; Moreau, Yves; Vranken, Wim F Large-scale in-silico statistical mutagenesis analysis sheds light on the deleteriousness landscape of the human proteome Journal Article In: Scientific reports, 2018, (Language of publication: en). @article{info:hdl:2013/273192, title = {Large-scale in-silico statistical mutagenesis analysis sheds light on the deleteriousness landscape of the human proteome}, author = {Daniele Raimondi and Gabriele Orlando and Francesco Tabaro and Tom Lenaerts and Marianne Rooman and Yves Moreau and Wim F Vranken}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/273192}, year = {2018}, date = {2018-01-01}, journal = {Scientific reports}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Skiba, Grażyna; Starzec, Mateusz; Byrski, Aleksander; Rycerz, Katarzyna; Kisiel-Dorohinicki, Marek; Turek, Wojciech; Krzywicki, Daniel; Lenaerts, Tom; Burguillo, Juan Carlos Flexible asynchronous simulation of iterated prisoner's dilemma based on actor model Journal Article In: Simulation modelling practice and theory, 83 , pp. 75-92, 2018, (DOI: 10.1016/j.simpat.2017.12.010). @article{info:hdl:2013/272556, title = {Flexible asynchronous simulation of iterated prisoner's dilemma based on actor model}, author = {Grażyna Skiba and Mateusz Starzec and Aleksander Byrski and Katarzyna Rycerz and Marek Kisiel-Dorohinicki and Wojciech Turek and Daniel Krzywicki and Tom Lenaerts and Juan Carlos Burguillo}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/272556/1/Elsevier_256183.pdf}, year = {2018}, date = {2018-01-01}, journal = {Simulation modelling practice and theory}, volume = {83}, pages = {75-92}, abstract = {The wide range of applications of the Iterated prisoner's dilemma (IPD) game made it a popular subject of study for the research community. As a consequence, numerous experiments have been conducted by researchers along the last decades. However, topics related with scaling simulation leveraging existing HPC infrastructure in the field of IPD did not always play a relevant role in such experimental work. The main contribution of this paper is a new simulation framework, based on asynchronous communication and its implementation oriented to distributed environments. Such framework is based on the modern Akka actor platform, that supports concurrent, distributed and resilient message-driven simulations; which are exemplified over the IPD game as a case study. We also present several interesting results regarding the introduction of asynchrony into the IPD simulation in order to obtain an efficient framework, so the whole simulation becomes scalable when using HPC facilities. 
The influence of asynchrony on the algorithm itself is also discussed, and the results show that it does not hamper the simulation.}, note = {DOI: 10.1016/j.simpat.2017.12.010}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Kieken, Fabien; Loth, Karine; van Nuland, Nico N A J; Tompa, Peter; Lenaerts, Tom Chemical shift assignments of the partially deuterated Fyn SH2–SH3 domain Journal Article In: Biomolecular NMR Assignments, 12 (1), pp. 117-122, 2018, (DOI: 10.1007/s12104-017-9792-1). @article{info:hdl:2013/272541, title = {Chemical shift assignments of the partially deuterated Fyn SH2–SH3 domain}, author = {Fabien Kieken and Karine Loth and Nico N A J van Nuland and Peter Tompa and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/272541}, year = {2018}, date = {2018-01-01}, journal = {Biomolecular NMR Assignments}, volume = {12}, number = {1}, pages = {117-122}, abstract = {Src Homology 2 and 3 (SH2 and SH3) are two key protein interaction modules involved in regulating the activity of many proteins such as tyrosine kinases and phosphatases by respective recognition of phosphotyrosine and proline-rich regions. In the Src family kinases, the inactive state of the protein is the direct result of the interaction of the SH2 and the SH3 domain with intra-molecular regions, leading to a closed structure incompetent with substrate modification. Here, we report the 1H, 15N and 13C backbone- and side-chain chemical shift assignments of the partially deuterated Fyn SH3–SH2 domain and structural differences between tandem and single domains. The BMRB accession number is 27165.}, note = {DOI: 10.1007/s12104-017-9792-1}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Byrski, Aleksander; Świderska, Ewelina; Łasisz, Jakub; Kisiel-Dorohinicki, Marek; Lenaerts, Tom; Samson, Dana; Indurkhya, Bipin Emergence of population structure in socio-cognitively inspired ant colony optimization Journal Article In: Computer Science, 19 (1), pp. 81-98, 2018, (DOI: 10.7494/csci.2018.19.1.2594). @article{info:hdl:2013/270586, title = {Emergence of population structure in socio-cognitively inspired ant colony optimization}, author = {Aleksander Byrski and Ewelina Świderska and Jakub Łasisz and Marek Kisiel-Dorohinicki and Tom Lenaerts and Dana Samson and Bipin Indurkhya}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/270586}, year = {2018}, date = {2018-01-01}, journal = {Computer Science}, volume = {19}, number = {1}, pages = {81-98}, abstract = {A metaheuristic proposed by us recently, Ant Colony Optimization (ACO) hybridized with socio-cognitive inspirations, turned out to generate interesting results when compared to classic ACO. Even though it does not always find better solutions to the considered problems, it usually finds sub-optimal solutions. Moreover, instead of a trial-and-error approach to configure the parameters of the ant species in the population, the actual structure of the population emerges from predefined species-to-species ant migration strategies in our approach. Experimental results of our approach are compared to classic ACO and selected socio-cognitive versions of this algorithm.}, note = {DOI: 10.7494/csci.2018.19.1.2594}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
de Bony, Eric James; Bizet, Martin; Grembergen, Olivier Van; Hassabi, Bouchra; Calonne, Emilie; Putmans, Pascale; Bontempi, Gianluca; Fuks, François Comprehensive identification of long noncoding RNAs in colorectal cancer Journal Article In: Oncotarget, 9 (45), pp. 27605-27629, 2018, (DOI: 10.18632/oncotarget.25218). @article{info:hdl:2013/278063, title = {Comprehensive identification of long noncoding RNAs in colorectal cancer}, author = {Eric James de Bony and Martin Bizet and Olivier Van Grembergen and Bouchra Hassabi and Emilie Calonne and Pascale Putmans and Gianluca Bontempi and Fran{\c{c}}ois Fuks}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/278063}, year = {2018}, date = {2018-01-01}, journal = {Oncotarget}, volume = {9}, number = {45}, pages = {27605-27629}, abstract = {Colorectal cancer (CRC) is one of the most common cancers in humans and a leading cause of cancer-related deaths worldwide. As in the case of other cancers, CRC heterogeneity leads to a wide range of clinical outcomes and complicates therapy. Over the years, multiple factors have emerged as markers of CRC heterogeneity, improving tumor classification and selection of therapeutic strategies. Understanding the molecular mechanisms underlying this heterogeneity remains a major challenge. A considerable research effort is therefore devoted to identifying additional features of colorectal tumors, in order to better understand CRC etiology and to multiply therapeutic avenues. Recently, long noncoding RNAs (lncRNAs) have emerged as important players in physiological and pathological processes, including CRC. Here we looked for lncRNAs that might contribute to the various colorectal tumor phenotypes. We thus monitored the expression of 4898 lncRNA genes across 566 CRC samples and identified 282 lncRNAs reflecting CRC heterogeneity. We then inferred potential functions of these lncRNAs. 
Our results highlight lncRNAs that may participate in the major processes altered in distinct CRC cases, such as WNT/β-catenin and TGF-β signaling, immunity, the epithelial-to-mesenchymal transition (EMT), and angiogenesis. For several candidates, we provide experimental evidence supporting our functional predictions that they may be involved in the cell cycle or the EMT. Overall, our work identifies lncRNAs associated with key CRC characteristics and provides insights into their respective functions. Our findings constitute a further step towards understanding the contribution of lncRNAs to CRC heterogeneity. They may open new therapeutic opportunities.}, note = {DOI: 10.18632/oncotarget.25218}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Ioannidis, J P A; Bhattacharya, S; Evers, J L H; Veen, Der F V; Somigliana, E; Barratt, C L R; Bontempi, Gianluca; Baird, D T; Crosignani, P; Devroey, P; Diedrich, Klaus; Farquharson, R G; Fraser, L R; Geraedts, Joep Pm M; Gianaroli, Luca; Vecchia, La C; Magli, C; Negri, E; Sunde, A; Tapanainen, J S; Tarlatzis, Basil; Steirteghem, A V; Veiga, A Protect us from poor-quality medical research Journal Article In: Human reproduction, 33 (5), pp. 770-776, 2018, (DOI: 10.1093/humrep/dey056). @article{info:hdl:2013/272828, title = {Protect us from poor-quality medical research}, author = {J P A Ioannidis and S Bhattacharya and J L H Evers and F V Der Veen and E Somigliana and C L R Barratt and Gianluca Bontempi and D T Baird and P Crosignani and P Devroey and Klaus Diedrich and R G Farquharson and L R Fraser and Joep Pm M Geraedts and Luca Gianaroli and C La Vecchia and C Magli and E Negri and A Sunde and J S Tapanainen and Basil Tarlatzis and A V Steirteghem and A Veiga}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/272828}, year = {2018}, date = {2018-01-01}, journal = {Human reproduction}, volume = {33}, number = {5}, pages = {770-776}, abstract = {Much of the published medical research is apparently flawed, cannot be replicated and/or has limited or no utility. This article presents an overview of the current landscape of biomedical research, identifies problems associated with common study designs and considers potential solutions. Randomized clinical trials, observational studies, systematic reviews and meta-analyses are discussed in terms of their inherent limitations and potential ways of improving their conduct, analysis and reporting. 
The current emphasis on statistical significance needs to be replaced by sound design, transparency and willingness to share data with a clear commitment towards improving the quality and utility of clinical research.}, note = {DOI: 10.1093/humrep/dey056}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Trepo, Eric; Goossens, Nicolas; Fujiwara, Naoto; Song, Won-Min; Colaprico, Antonio; Marot, Astrid; Spahr, Laurent; Demetter, Pieter; Sempoux, Christine; Im, Gene Y; Saldarriaga, Joan; Gustot, Thierry; Devière, Jacques; Thung, Swan SN; Minsart, Charlotte; Serste, Thomas; Bontempi, Gianluca; Abdelrahman, Karim; Henrion, Jean; Degré, Delphine; Lucidi, Valerio; Rubbia-Brandt, Laura; Nair, Venugopalan D; Moreno, Christophe; Deltenre, Pierre; Hoshida, Yujin; Franchimont, Denis Combination of Gene Expression Signature and Model for End-Stage Liver Disease Score Predicts Survival of Patients With Severe Alcoholic Hepatitis Journal Article In: Gastroenterology, 154 (4), pp. 965-975, 2018, (DOI: 10.1053/j.gastro.2017.10.048). @article{info:hdl:2013/269084, title = {Combination of Gene Expression Signature and Model for End-Stage Liver Disease Score Predicts Survival of Patients With Severe Alcoholic Hepatitis}, author = {Eric Trepo and Nicolas Goossens and Naoto Fujiwara and Won-Min Song and Antonio Colaprico and Astrid Marot and Laurent Spahr and Pieter Demetter and Christine Sempoux and Gene Y Im and Joan Saldarriaga and Thierry Gustot and Jacques Devi{\`e}re and Swan SN Thung and Charlotte Minsart and Thomas Serste and Gianluca Bontempi and Karim Abdelrahman and Jean Henrion and Delphine Degré and Valerio Lucidi and Laura Rubbia-Brandt and Venugopalan D Nair and Christophe Moreno and Pierre Deltenre and Yujin Hoshida and Denis Franchimont}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/269084}, year = {2018}, date = {2018-01-01}, journal = {Gastroenterology}, volume = {154}, number = {4}, pages = {965-975}, abstract = {Background & Aims: Patients with severe alcoholic hepatitis (AH) have a high risk of death within 90 days. Corticosteroids, which can cause severe adverse events, are the only treatment that increases short-term survival. It is a challenge to predict outcomes of patients with severe AH. Therefore, we developed a scoring system to predict patient survival, integrating baseline molecular and clinical variables. 
Methods: We obtained fixed liver biopsy samples from 71 consecutive patients diagnosed with severe AH and treated with corticosteroids from July 2006 through December 2013 in Brussels, Belgium (derivation cohort). Gene expression patterns were analyzed by microarrays and clinical data were collected for 180 days. We identified gene expression signatures and clinical data that are associated with survival without liver transplantation at 90 and 180 days after initiation of corticosteroid therapy. Findings were validated using liver biopsies from 48 consecutive patients with severe AH treated with corticosteroids, collected from March 2010 through February 2015 at hospitals in Belgium and Switzerland (validation cohort 1) and in liver biopsies from 20 patients (9 received corticosteroid treatment), collected from January 2012 through May 2015 in the United States (validation cohort 2). Results: We integrated data on expression patterns of 123 genes and the model for end-stage liver disease (MELD) scores to assign patients to groups with poor survival (29% survived 90 days and 26% survived 180 days) and good survival (76% survived 90 days and 65% survived 180 days) (P <.001) in the derivation cohort. We named this assignment system the gene signature–MELD (gs-MELD) score. In validation cohort 1, the gs-MELD score discriminated patients with poor survival (43% survived 90 days) from those with good survival (96% survived 90 days) (P <.001). The gs-MELD score also discriminated between patients with a poor survival at 180 days (34% survived) and a good survival at 180 days (84% survived) (P <.001). The time-dependent area under the receiver operator characteristic curve for the score was 0.86 (95% confidence interval 0.73–0.99) for survival at 90 days, and 0.83 (95% confidence interval 0.71–0.96) for survival at 180 days. This score outperformed other clinical models to predict survival of patients with severe AH in validation cohort 1. 
In validation cohort 2, the gs-MELD discriminated patients with a poor survival at 90 days (12% survived) from those with a good survival at 90 days (100%) (P <.001). Conclusions: We integrated data on baseline liver gene expression pattern and the MELD score to create the gs-MELD scoring system, which identifies patients with severe AH, treated or not with corticosteroids, most and least likely to survive for 90 and 180 days.}, note = {DOI: 10.1053/j.gastro.2017.10.048}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Cava, Claudia; Bertoli, Gloria; Colaprico, Antonio; Bontempi, Gianluca; Mauri, Giancarlo; Castiglioni, Isabella In-silico integration approach to identify a key miRNA regulating a gene network in aggressive prostate cancer Journal Article In: International journal of molecular sciences, 19 (3), 2018, (DOI: 10.3390/ijms19030910). @article{info:hdl:2013/270452, title = {In-silico integration approach to identify a key miRNA regulating a gene network in aggressive prostate cancer}, author = {Claudia Cava and Gloria Bertoli and Antonio Colaprico and Gianluca Bontempi and Giancarlo Mauri and Isabella Castiglioni}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/270452}, year = {2018}, date = {2018-01-01}, journal = {International journal of molecular sciences}, volume = {19}, number = {3}, abstract = {Like other cancer diseases, prostate cancer (PC) is caused by the accumulation of genetic alterations in the cells that drives malignant growth. These alterations are revealed by gene profiling and copy number alteration (CNA) analysis. Moreover, recent evidence suggests that also microRNAs have an important role in PC development. Despite efforts to profile PC, the alterations (gene, CNA, and miRNA) and biological processes that correlate with disease development and progression remain partially elusive. Many gene signatures proposed as diagnostic or prognostic tools in cancer poorly overlap. The identification of co-expressed genes, that are functionally related, can identify a core network of genes associated with PC with a better reproducibility. By combining different approaches, including the integration of mRNA expression profiles, CNAs, and miRNA expression levels, we identified a gene signature of four genes overlapping with other published gene signatures and able to distinguish, in silico, high Gleason-scored PC from normal human tissue, which was further enriched to 19 genes by gene co-expression analysis. 
From the analysis of miRNAs possibly regulating this network, we found that hsa-miR-153 was highly connected to the genes in the network. Our results identify a four-gene signature with diagnostic and prognostic value in PC and suggest an interesting gene network that could play a key regulatory role in PC development and progression. Furthermore, hsa-miR-153, controlling this network, could be a potential biomarker for theranostics in high Gleason-scored PC.}, note = {DOI: 10.3390/ijms19030910}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Cava, Claudia; Bertoli, Gloria; Colaprico, Antonio; Olsen, Catharina; Bontempi, Gianluca; Castiglioni, Isabella Integration of multiple networks and pathways identifies cancer driver genes in pan-cancer analysis Journal Article In: BMC genomics, 19 (1), 2018, (DOI: 10.1186/s12864-017-4423-x). @article{info:hdl:2013/268430, title = {Integration of multiple networks and pathways identifies cancer driver genes in pan-cancer analysis}, author = {Claudia Cava and Gloria Bertoli and Antonio Colaprico and Catharina Olsen and Gianluca Bontempi and Isabella Castiglioni}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/268430}, year = {2018}, date = {2018-01-01}, journal = {BMC genomics}, volume = {19}, number = {1}, abstract = {Background: Modern high-throughput genomic technologies represent a comprehensive hallmark of molecular changes in pan-cancer studies. Although different cancer gene signatures have been revealed, the mechanism of tumourigenesis has yet to be completely understood. Pathways and networks are important tools to explain the role of genes in functional genomic studies. However, few methods consider the functional non-equal roles of genes in pathways and the complex gene-gene interactions in a network. Results: We present a novel method in pan-cancer analysis that identifies de-regulated genes with a functional role by integrating pathway and network data. A pan-cancer analysis of 7158 tumour/normal samples from 16 cancer types identified 895 genes with a central role in pathways and de-regulated in cancer. Comparing our approach with 15 current tools that identify cancer driver genes, we found that 35.6% of the 895 genes identified by our method have been found as cancer driver genes with at least 2/15 tools. Finally, we applied a machine learning algorithm on 16 independent GEO cancer datasets to validate the diagnostic role of cancer driver genes for each cancer. 
We obtained a list of the top-ten cancer driver genes for each cancer considered in this study. Conclusions: Our analysis 1) confirmed that there are several known cancer driver genes in common among different types of cancer, 2) highlighted that cancer driver genes are able to regulate crucial pathways.}, note = {DOI: 10.1186/s12864-017-4423-x}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Reggiani, Claudio; Le Borgne, Yann-Aël; Bontempi, Gianluca Feature selection in high-dimensional dataset using MapReduce Journal Article In: Communications in computer and information science, 823, pp. 101-115, 2018, (DOI: 10.1007/978-3-319-76892-2_8). @article{info:hdl:2013/269455, title = {Feature selection in high-dimensional dataset using MapReduce}, author = {Claudio Reggiani and Yann-A{\"e}l Le Borgne and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/269455}, year = {2018}, date = {2018-01-01}, journal = {Communications in computer and information science}, volume = {823}, pages = {101-115}, abstract = {This paper describes a distributed MapReduce implementation of the minimum Redundancy Maximum Relevance algorithm, a popular feature selection method in bioinformatics and network inference problems. The proposed approach handles both tall/narrow and wide/short datasets. We further provide an open source implementation based on Hadoop/Spark, and illustrate its scalability on datasets involving millions of observations or features.}, note = {DOI: 10.1007/978-3-319-76892-2_8}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Gazzo, Andrea Beyond monogenic diseases: a first collection and analysis of digenic diseases PhD Thesis 2018, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/272617b, title = {Beyond monogenic diseases: a first collection and analysis of digenic diseases}, author = {Andrea Gazzo}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/272617/5/ContratDiGazzo.pdf}, year = {2018}, date = {2018-01-01}, abstract = {In the next generation sequencing era many bioinformatics tools have been developed for assisting scientists in their studies on the molecular basis of genetic diseases, often with the aim of identifying the pathogenic variants. As a consequence, in the last decades more than one hundred new disease-gene associations have been discovered. Nevertheless, the genetic basis of many genetic diseases yet remains undisclosed. It has been shown that many diseases considered as monogenic with an imperfect genotype-phenotype correlation or incomplete penetrance are, on the contrary, caused or modulated by more than one mutated gene, meaning that they are in fact oligogenic. Current bioinformatics methods used for identifying pathogenic variants are trained and fine-tuned for identifying a single variant responsible for a disease. This monogenic-oriented approach cannot be used to explore the impact of combinations of variants in different genes on the complexity and genetic heterogeneity of rare diseases. Digenic diseases are the simplest form of oligogenic disease and thus they can provide a conceptual bridge between monogenic and the poorly understood polygenic diseases. The ambition of this thesis is to collect and analyse digenic data, introducing this topic in the bioinformatics field where digenic diseases are still an unexplored branch.
This can be divided in two steps: the first consists in the creation of a central repository containing detailed information on digenic diseases; the second is an analysis of their peculiarities, using machine learning methods for studying subclasses of digenic effects. In the first step we developed DIDA (DIgenic diseases DAtabase), a novel database that provides for the first time a curated collection of genes and associated variants involved in digenic diseases. Detailed information related to the digenic mechanism has been manually mined from the medical literature. All instances in DIDA were also assigned to two sub classes of digenic effects, annotated as true digenic (both genes are required for developing the disease) and composite classes (one gene is sufficient to produce the disease phenotype, the second one alters it or changes significantly the age of onset). In the second step, we hypothesized that the digenic effect may be related to some biological properties characterizing digenic combinations. Using machine learning methods, we show that a set of variant, gene and higher-level features can differentiate between the true digenic and composite classes with high accuracy. Moreover, we show that a digenic effect decision profile, extracted from the predictive model, motivates why an instance is assigned to either of the two classes. Together, our results show that digenic disease data generates novel insights, providing a glimpse into the oligogenic realm.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Reggiani, Claudio Bioinformatic discovery of novel exons expressed in human brain and their association with neurodevelopmental disorders PhD Thesis 2018, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/270994b, title = {Bioinformatic discovery of novel exons expressed in human brain and their association with neurodevelopmental disorders}, author = {Claudio Reggiani}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/270994/5/ContratDiReggiani.pdf}, year = {2018}, date = {2018-01-01}, abstract = {An important quest in genomics since the publication of the first complete human genome in 2003 has been its functional annotation. DNA holds the instructions to the production of the components necessary for the life of cells and organisms. A complete functional catalog of genomic regions will help the understanding of the cell body and its dynamics, thus creating links between genotype and phenotypic traits. The need for annotations prompted the development of several bioinformatic methods. In the context of promoter and first exon predictors, the majority of models relies principally on structural and chemical properties of the DNA sequence. Some of them integrate information from epigenomic and transcriptomic data as secondary features. Current genomic research asserts that reference genome annotations are far from being fully annotated (human organism included). Physicians rely on reference genome annotations and functional databases to understand disorders with genetic basis, and missing annotations may lead to unresolved cases. Because of their complexity, neurodevelopmental disorders are under study to figure out all genetic regions that are involved. Besides functional validation on model organisms, the search for genotype-phenotype association is supported by statistical analysis, which is typically biased towards known functional regions. This thesis addresses the use of an in-silico integrative analysis to improve reference genome annotations and discover novel functional regions associated with neurodevelopmental disorders.
The contributions outlined in this document have practical applications in clinical settings. The presented bioinformatic method is based on epigenomic and transcriptomic data, thus excluding features from DNA sequence. Such an integrative approach applied on brain data allowed the discovery of two novel promoters and coding first exons in the human DLG2 gene, which were also found to be statistically associated with neurodevelopmental disorders and intellectual disability in particular. The application of the same methodology to the whole genome resulted in the discovery of other novel exons expressed in brain. Concerning the in-silico method itself, the research demanded a high number of functional and clinical datasets to properly support and validate our discoveries. This work describes a bioinformatic method for genome annotation, in the specific area of promoter and first exons. So far the method has been applied on brain data, and the extension to the whole body data would be a logical by-product. We will leverage distributed frameworks to tackle the even higher amount of data to analyse, a task that has already begun. Another interesting research direction that came up from this work is the temporal enrichment analysis of epigenomics data across different developmental stages, in which changes of epigenomic enrichment suggest time-specific and tissue-specific functional gene and gene isoforms regulation.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Gazzo, Andrea Beyond monogenic diseases: a first collection and analysis of digenic diseases PhD Thesis 2018, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/272617, title = {Beyond monogenic diseases: a first collection and analysis of digenic diseases}, author = {Andrea Gazzo}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/272617/5/ContratDiGazzo.pdf}, year = {2018}, date = {2018-01-01}, abstract = {In the next generation sequencing era many bioinformatics tools have been developed for assisting scientists in their studies on the molecular basis of genetic diseases, often with the aim of identifying the pathogenic variants. As a consequence, in the last decades more than one hundred new disease-gene associations have been discovered. Nevertheless, the genetic basis of many genetic diseases yet remains undisclosed. It has been shown that many diseases considered as monogenic with an imperfect genotype-phenotype correlation or incomplete penetrance are, on the contrary, caused or modulated by more than one mutated gene, meaning that they are in fact oligogenic. Current bioinformatics methods used for identifying pathogenic variants are trained and fine-tuned for identifying a single variant responsible of a disease. This monogenic-oriented approach cannot be used to explore the impact of combinations of variants in different genes on the complexity and genetic heterogeneity of rare diseases. Digenic diseases are the simplest form of oligogenic disease and thus they can provide a conceptual bridge between monogenic and the poorly understood polygenic diseases.The ambition of this thesis is to collect and analyse digenic data, introducing this topic in the bioinformatics field where digenic diseases are still an unexplored branch. 
This can be divided in two steps: the first consists in the creation of a central repository containing detailed information on digenic diseases; the second is an analysis of their peculiarities, using machine learning methods for studying subclasses of digenic effects.In the first step we developed DIDA (DIgenic diseases DAtabase), a novel database that provides for the first time a curated collection of genes and associated variants involved in digenic diseases. Detailed information related to the digenic mechanism have been manually mined from the medical literature. All instances in DIDA were also assigned to two sub classes of digenic effects, annotated as true digenic (both genes are required for developing the disease) and composite classes (one gene is sufficient to produce the disease phenotype, the second one alters it or change significantly the age of onset).In the second step, we hypothesized that the digenic effect may be related to some biological properties characterizing digenic combinations. Using machine learning methods, we show that a set of variant, gene and higher-level features can differentiate between the true digenic and composite classes with high accuracy. Moreover, we show that a digenic effect decision profile, extracted from the predictive model, motivates why an instance is assigned to either of the two classes.Together, our results show that digenic disease data generates novel insights, providing a glimpse into the oligogenic realm.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } In the next generation sequencing era many bioinformatics tools have been developed for assisting scientists in their studies on the molecular basis of genetic diseases, often with the aim of identifying the pathogenic variants. As a consequence, in the last decades more than one hundred new disease-gene associations have been discovered. 
Nevertheless, the genetic basis of many genetic diseases yet remains undisclosed. It has been shown that many diseases considered as monogenic with an imperfect genotype-phenotype correlation or incomplete penetrance are, on the contrary, caused or modulated by more than one mutated gene, meaning that they are in fact oligogenic. Current bioinformatics methods used for identifying pathogenic variants are trained and fine-tuned for identifying a single variant responsible of a disease. This monogenic-oriented approach cannot be used to explore the impact of combinations of variants in different genes on the complexity and genetic heterogeneity of rare diseases. Digenic diseases are the simplest form of oligogenic disease and thus they can provide a conceptual bridge between monogenic and the poorly understood polygenic diseases.The ambition of this thesis is to collect and analyse digenic data, introducing this topic in the bioinformatics field where digenic diseases are still an unexplored branch. This can be divided in two steps: the first consists in the creation of a central repository containing detailed information on digenic diseases; the second is an analysis of their peculiarities, using machine learning methods for studying subclasses of digenic effects.In the first step we developed DIDA (DIgenic diseases DAtabase), a novel database that provides for the first time a curated collection of genes and associated variants involved in digenic diseases. Detailed information related to the digenic mechanism have been manually mined from the medical literature. 
All instances in DIDA were also assigned to two sub classes of digenic effects, annotated as true digenic (both genes are required for developing the disease) and composite classes (one gene is sufficient to produce the disease phenotype, the second one alters it or change significantly the age of onset).In the second step, we hypothesized that the digenic effect may be related to some biological properties characterizing digenic combinations. Using machine learning methods, we show that a set of variant, gene and higher-level features can differentiate between the true digenic and composite classes with high accuracy. Moreover, we show that a digenic effect decision profile, extracted from the predictive model, motivates why an instance is assigned to either of the two classes.Together, our results show that digenic disease data generates novel insights, providing a glimpse into the oligogenic realm. |
Reggiani, Claudio 2018, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/270994, title = {Bioinformatic discovery of novel exons expressed in human brain and their association with neurodevelopmental disorders}, author = {Claudio Reggiani}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/270994/5/ContratDiReggiani.pdf}, year = {2018}, date = {2018-01-01}, abstract = {An important quest in genomics since the publication of the first complete human genome in 2003 has been its functional annotation. DNA holds the instructions to the production of the components necessary for the life of cells and organisms. A complete functional catalog of genomic regions will help the understanding of the cell body and its dynamics, thus creating links between genotype and phenotypic traits. The need for annotations prompted the development of several bioinformatic methods. In the context of promoter and first exon predictors, the majority of models relies principally on structural and chemical properties of the DNA sequence. Some of them integrate information from epigenomic and transcriptomic data as secondary features. Current genomic research asserts that reference genome annotations are far from being fully annotated (human organism included).Physicians rely on reference genome annotations and functional databases to understand disorders with genetic basis, and missing annotations may lead to unresolved cases. Because of their complexity, neurodevelopmental disorders are under study to figure out all genetic regions that are involved. Besides functional validation on model organisms, the search for genotype-phenotype association is supported by statistical analysis, which is typically biased towards known functional regions.This thesis addresses the use of an in-silico integrative analysis to improve reference genome annotations and discover novel functional regions associated with neurodevelopemental disorders. 
The contributions outlined in this document have practical applications in clinical settings. The presented bioinformatic method is based on epigenomic and transcriptomic data, thus excluding features from DNA sequence. This integrative approach, applied to brain data, allowed the discovery of two novel promoters and coding first exons in the human DLG2 gene, which were also found to be statistically associated with neurodevelopmental disorders and intellectual disability in particular. The application of the same methodology to the whole genome resulted in the discovery of other novel exons expressed in brain. Concerning the in-silico method itself, the research demanded a high number of functional and clinical datasets to properly support and validate our discoveries. This work describes a bioinformatic method for genome annotation, in the specific area of promoters and first exons. So far the method has been applied to brain data, and the extension to whole-body data would be a logical by-product. We will leverage distributed frameworks to tackle the even larger amount of data to analyse, a task that has already begun. Another interesting research direction that emerged from this work is the temporal enrichment analysis of epigenomic data across different developmental stages, in which changes of epigenomic enrichment suggest time-specific and tissue-specific regulation of genes and gene isoforms.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Fimereli, Danai Computational analyses of gene fusions, viruses and parasitic genomic elements in breast cancer PhD Thesis 2018, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/263609, title = {Computational analyses of gene fusions, viruses and parasitic genomic elements in breast cancer}, author = {Danai Fimereli}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/263609/5/ContratDanaiFimereli.pdf}, year = {2018}, date = {2018-01-01}, abstract = {Breast cancer is the most common cancer in women and research efforts to unravel the underlying mechanisms that drive carcinogenesis are continuous. The emergence of high-throughput sequencing techniques and their constant advancement, in combination with large scale studies of genomic and transcriptomic data, allowed the identification of important genetic changes that take place in the breast cancer genome, including somatic mutations, copy number aberrations and genomic rearrangements. The overall aim of this thesis is to explore the presence of genetic changes that take place in the breast cancer transcriptome and their possible contribution to carcinogenesis. The aim of the first research study was the identification of expressed gene fusions in breast cancer and the study of their association with other genomic events. To achieve this, transcriptome sequencing and Single Nucleotide Polymorphism array data for a cohort of 55 tumors and 10 normal breast tissues were combined. Gene fusions were detected in the majority of the samples, with evident differences between breast cancer subtypes, where HER2+ samples had significantly more fusions than the other subtypes. The genome-wide analysis uncovered localization of fusion genes on specific chromosomes such as 17, 8 or 20.
Additionally, a positive correlation between the number of gene fusions and the number of amplifications was observed, including the association between fusions on chromosome 17 and the amplifications in HER2+ samples, which can be attributed to the highly rearranged genomes of these subtypes. Finally, the absence of highly recurrent fusions across this cohort adds to the notion that gene fusions in breast cancer are most likely private events, with the majority being “passenger” events. In the second research study, the aim was to identify a connection between viral infections and breast cancer by devising five different computational methods for the analysis of both transcriptome and exome data in a cohort of 58 breast tumors. Despite being able to detect viral sequences in our testing dataset, no significantly high numbers of viral sequences were detected in our samples. Specifically, viral sequences (~2-30 reads) were extracted belonging to the viruses EBV, HHV6 and Merkel cell polyomavirus. Such low levels of viral expression argue against a viral etiology for breast cancer, but one should not exclude possible cases of integrated but silent viruses. In the third research project, we analyzed in silico the transcriptional profiles of human endogenous retroviruses in breast cancer. Despite being scattered across the genome in large numbers, a number of ERVs are actively transcribed, accounting for a small percentage of the total mapped reads. Alongside protein coding genes and lncRNAs, they show distinct expression profiles across the different breast cancer subtypes, with luminal and basal-like samples clearly separating from each other. Additionally, distinct profiles between ER+ and ER- samples were observed.
Tumor specific ERV loci show an association with the immune status of the tumors, indicating that ERVs are reactivated in tumors and could play a role in the activation of the immune response cascade. The results presented in this thesis reveal only a small fragment of the diversity and heterogeneity of the breast cancer transcriptome. The strength of the sequencing techniques allows the in-depth detection of different genomic events. Gene fusions should be considered as part of the breast cancer transcriptome, but their low recurrence across samples suggests a role as passenger events. In light of existing results, viral infections do not play a significant role in breast cancer. On the other hand, human endogenous retroviruses, despite originating from exogenous viruses, seem to exhibit transcriptional profiles similar to those of normal genes, indicating that they are part of the genome’s transcriptional machinery.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Dierckxsens, Nicolas Targeted organelle genome assembly and heteroplasmy detection PhD Thesis 2018, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/277507, title = {Targeted organelle genome assembly and heteroplasmy detection}, author = {Nicolas Dierckxsens}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/277507/5/ContratDiDierckxsens.pdf}, year = {2018}, date = {2018-01-01}, abstract = {Thanks to the development of next-generation sequencing (NGS) technology, whole genome data can be readily obtained from a variety of samples. Since the massive increase in available sequencing data, the development of efficient assembly algorithms has become the new bottleneck. Almost every newly released tool is based on the de Bruijn graph method, which focuses on assembling complete datasets with mathematical models. Although the decreasing sequencing costs made whole genome sequencing (WGS) the most straightforward and least laborious approach of gathering sequencing data, many research projects are only interested in the extranuclear genomes. Unfortunately, few of the available tools are specifically designed to efficiently retrieve these extranuclear genomes from WGS datasets. We developed a seed-and-extend algorithm that assembles organelle circular genomes from WGS data, starting from a single short seed sequence. The algorithm has been tested on several new (Gonioctena intermedia and Avicennia marina) and public (Arabidopsis thaliana and Oryza sativa) whole genome Illumina datasets and always outperformed other assemblers in assembly accuracy and contiguity. In our benchmark, NOVOPlasty assembled all genomes in less than 30 minutes with a maximum RAM requirement of 16 GB.
NOVOPlasty is the only de novo assembler that provides a fast and straightforward manner to extract the extranuclear sequences from WGS data and generates one circular high quality contig. Heteroplasmy, the existence of multiple mitochondrial haplotypes within an individual, has been researched across different fields. Mitochondrial genome polymorphisms have been linked to multiple severe disorders and are of interest to evolutionary studies and forensic science. By utilizing ultra-deep sequencing, it is now possible to uncover previously undiscovered patterns of intra-individual polymorphism. However, it remains challenging to determine its source. Currently available software can detect polymorphic sites but is not capable of determining the link between them. We therefore developed a new method to not only detect intra-individual polymorphisms within mitochondrial and chloroplast genomes, but also to look for linkage among polymorphic sites by assembling the sequence around each detected polymorphic site. Our benchmark study shows that this method can detect heteroplasmy more accurately than any method previously available and is the first tool that is able to completely or partially reconstruct the origin sequences for each intra-individual polymorphism.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Carcillo, Fabrizio Beyond Supervised Learning in Credit Card Fraud Detection: A Dive into Semi-supervised and Distributed Learning PhD Thesis 2018, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/272119, title = {Beyond Supervised Learning in Credit Card Fraud Detection: A Dive into Semi-supervised and Distributed Learning}, author = {Fabrizio Carcillo}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/272119/5/ContratDiCarcillo.pdf}, year = {2018}, date = {2018-01-01}, abstract = {The expansion of electronic commerce, as well as the increasing confidence of customers in electronic payments, makes fraud detection a critical issue. The design of a prompt and accurate Fraud Detection System is a priority for many organizations in the business of credit cards. In this thesis we present a series of studies to increase the precision and the speed of fraud detection systems. The thesis has three main contributions. The first concerns the integration of unsupervised techniques and supervised classifiers. We proposed several approaches to integrate outlier scores in the detection process and we found that the accuracy of a conventional classifier may be improved when information about the input distribution is used to augment the training set. The second contribution concerns the role of active learning in Fraud Detection. We have extensively compared several state-of-the-art techniques and found that Stochastic Semi-supervised Learning is a convenient approach to tackle the Selection Bias problem in the active learning process. The third contribution of the thesis is the design, implementation and assessment of SCARFF, an original framework for near real-time Streaming Fraud Detection. This framework integrates Big Data technology (notably tools like Kafka, Spark and Cassandra) with a machine learning approach to deal with imbalance, non-stationarity and feedback latency in a scalable manner.
Experimental results on a massive dataset of real credit card transactions have shown that our framework is scalable, efficient and accurate over a big stream of transactions.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Porretta'S, Luciano MODELS AND METHODS IN GENOME WIDE ASSOCIATION STUDIES PhD Thesis 2018, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/265314, title = {MODELS AND METHODS IN GENOME WIDE ASSOCIATION STUDIES}, author = {Luciano Porretta'S}, year = {2018}, date = {2018-01-01}, abstract = {The interdisciplinary field of systems biology has evolved rapidly over the last few years. Different disciplines have contributed to the development of both its experimental and theoretical branches. Although computational biology has been a growing activity in computer science for more than two decades, it has been only in the past few years that optimization models have been increasingly developed and analyzed by researchers whose primary background is Operations Research (OR). This dissertation aims at contributing to the field of computational biology by applying mathematical programming to certain problems in molecular biology. Specifically, we address three problems in the domain of Genome Wide Association Studies: (i) the Pure Parsimony Haplotyping Under Uncertain Data Problem, which consists of finding the minimum number of haplotypes necessary to explain a given set of genotypes containing possible reading errors; (ii) the Parsimonious Loss Of Heterozygosity Problem, which consists of partitioning suspected polymorphisms from a set of individuals into a minimum number of deletion areas; and (iii) the Multiple Individuals Polymorphic Alu Insertion Recognition Problem, which consists of finding the set of locations in the genome where ALU sequences are inserted in some individual(s). All three problems are NP-hard combinatorial optimization problems. Therefore, we analyse their combinatorial structure and we propose an exact solution approach for each of them.
The proposed models are efficient, accurate, compact, polynomial-sized and usable in all those cases for which the parsimony criterion is well suited for estimation.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Bizet, Martin Bioinformatic inference of a prognostic epigenetic signature of immunity in breast cancers PhD Thesis 2018, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/265092, title = {Bioinformatic inference of a prognostic epigenetic signature of immunity in breast cancers}, author = {Martin Bizet}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/265092/7/ContratDiBizet.pdf}, year = {2018}, date = {2018-01-01}, abstract = {Alteration of epigenetic marks is increasingly recognized as a fundamental hallmark of cancers. In this thesis, we used DNA methylation profiles to improve the classification of breast cancer patients through a machine learning approach. The long-term goal is the development of clinical tools for personalized medicine. The DNA methylation data were acquired with a DNA array dedicated to methylation, called Infinium. This technology is recent compared with, for example, gene expression arrays, and its preprocessing is not yet standardized. The first part of this thesis was therefore devoted to evaluating normalization methods by comparing the normalized data with other technologies (pyrosequencing and RRBS) for the two most recent Infinium technologies (450k and 850k). We also evaluated the coverage of biologically relevant regions (promoters and enhancers) by the two technologies. Next, we used the (properly preprocessed) Infinium data to develop a score, called the MeTIL score, which has prognostic and predictive value in breast cancers. We took advantage of the ability of DNA methylation to reflect cell composition to extract a methylation signature (that is, a set of DNA positions where methylation varies) that reflects the presence of lymphocytes in the tumor sample. After selecting sites with lymphocyte-specific methylation, we developed a machine learning approach to obtain a signature of optimal size, reduced to five sites, potentially allowing clinical use. After converting this signature into a score, we showed its specificity for lymphocytes using external data and computer simulations. Then, we showed the ability of the MeTIL score to predict response to chemotherapy, as well as its prognostic power in independent breast cancer cohorts and even in other cancers.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
2017 |
Huculeci, Radu Ion; Kieken, Fabien; Garcia-Pino, Abel; Buts, Lieven; van Nuland, Nico A J; Lenaerts, Tom Structural characterization of monomeric/dimeric state of p59fyn SH2 domain Book Chapter In: 1555 , pp. 555, Humana Press, 2017, (Language of publication: fr). @inbook{info:hdl:2013/243589, title = {Structural characterization of monomeric/dimeric state of p59fyn SH2 domain}, author = {Radu Ion Huculeci and Fabien Kieken and Abel Garcia-Pino and Lieven Buts and Nico A J van Nuland and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243589}, year = {2017}, date = {2017-01-01}, volume = {1555}, pages = {555}, publisher = {Humana Press}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {inbook} } |
Kieken, Fabien; Loth, Karine; van Nuland, Nico A J; Tompa, Peter; Lenaerts, Tom Chemical shift assignments of the partially deuterated Fyn SH2-SH3 domain. Journal Article In: Biomolecular NMR assignments, 2017, (DOI: 10.1007/s12104-017-9792-1). @article{info:hdl:2013/262899, title = {Chemical shift assignments of the partially deuterated Fyn SH2-SH3 domain.}, author = {Fabien Kieken and Karine Loth and Nico A J van Nuland and Peter Tompa and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/262899}, year = {2017}, date = {2017-01-01}, journal = {Biomolecular NMR assignments}, abstract = {Src Homology 2 and 3 (SH2 and SH3) are two key protein interaction modules involved in regulating the activity of many proteins such as tyrosine kinases and phosphatases by respective recognition of phosphotyrosine and proline-rich regions. In the Src family kinases, the inactive state of the protein is the direct result of the interaction of the SH2 and the SH3 domain with intra-molecular regions, leading to a closed structure incompetent with substrate modification. Here, we report the 1H, 15N and 13C backbone- and side-chain chemical shift assignments of the partially deuterated Fyn SH3-SH2 domain and structural differences between tandem and single domains. The BMRB accession number is 27165.}, note = {DOI: 10.1007/s12104-017-9792-1}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Gazzo, Andrea; Raimondi, Daniele; Daneels, Dorien; Moreau, Yves; Smits, Guillaume; Dooren, Sonia Van; Lenaerts, Tom Understanding mutational effects in digenic diseases Journal Article In: Nucleic acids research, 45 (15), 2017, (DOI: 10.1093/nar/gkx557). @article{info:hdl:2013/261811, title = {Understanding mutational effects in digenic diseases}, author = {Andrea Gazzo and Daniele Raimondi and Dorien Daneels and Yves Moreau and Guillaume Smits and Sonia Van Dooren and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/261811}, year = {2017}, date = {2017-01-01}, journal = {Nucleic acids research}, volume = {45}, number = {15}, abstract = {To further our understanding of the complexity and genetic heterogeneity of rare diseases, it has become essential to shed light on how combinations of variants in different genes are responsible for a disease phenotype. With the appearance of a resource on digenic diseases, it has become possible to evaluate how digenic combinations differ in terms of the phenotypes they produce. All instances in this resource were assigned to two classes of digenic effects, annotated as true digenic and composite classes. Whereas in the true digenic class variants in both genes are required for developing the disease, in the composite class, a variant in one gene is sufficient to produce the phenotype, but an additional variant in a second gene impacts the disease phenotype or alters the age of onset. We show that a combination of variant, gene and higher-level features can differentiate between these two classes with high accuracy. Moreover, we show via the analysis of three digenic disorders that a digenic effect decision profile, extracted from the predictive model, motivates why an instance was assigned to either of the two classes. 
Together, our results show that digenic disease data generates novel insights, providing a glimpse into the oligogenic realm.}, note = {DOI: 10.1093/nar/gkx557}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Raimondi, Daniele; Tanyalçin, Ibrahim; Ferte, Julien; Gazzo, Andrea; Orlando, Gabriele; Lenaerts, Tom; Rooman, Marianne; Vranken, Wim F DEOGEN2: prediction and interactive visualization of single amino acid variant deleteriousness in human proteins. Journal Article In: Nucleic acids research, 45 (W1), pp. W201-W206, 2017, (DOI: 10.1093/nar/gkx390). @article{info:hdl:2013/254250, title = {DEOGEN2: prediction and interactive visualization of single amino acid variant deleteriousness in human proteins.}, author = {Daniele Raimondi and Ibrahim Tanyal{\c{c}}in and Julien Ferte and Andrea Gazzo and Gabriele Orlando and Tom Lenaerts and Marianne Rooman and Wim F Vranken}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/254250}, year = {2017}, date = {2017-01-01}, journal = {Nucleic acids research}, volume = {45}, number = {W1}, pages = {W201-W206}, abstract = {High-throughput sequencing methods are generating enormous amounts of genomic data, giving unprecedented insights into human genetic variation and its relation to disease. An individual human genome contains millions of Single Nucleotide Variants: to discriminate the deleterious from the benign ones, a variety of methods have been developed that predict whether a protein-coding variant likely affects the carrier individual's health. We present such a method, DEOGEN2, which incorporates heterogeneous information about the molecular effects of the variants, the domains involved, the relevance of the gene and the interactions in which it participates. This extensive contextual information is non-linearly mapped into one single deleteriousness score for each variant. 
Since for the non-expert user it is sometimes still difficult to assess what this score means, how it relates to the encoded protein, and where it originates from, we developed an interactive online framework (http://deogen2.mutaframe.com/) to better present the DEOGEN2 deleteriousness predictions of all possible variants in all human proteins. The prediction is visualized so both expert and non-expert users can gain insights into the meaning, protein context and origins of each prediction.}, note = {DOI: 10.1093/nar/gkx390}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Byrski, Aleksander; Świderska, Ewelina; Łasisz, Jakub; Kisiel-Dorohinicki, Marek; Lenaerts, Tom; Samson, Dana; Indurkhya, Bipin; Nowe, Ann Socio-cognitively inspired ant colony optimization Journal Article In: Journal of Computational Science, 21 , pp. 397-406, 2017, (DOI: 10.1016/j.jocs.2016.10.010). @article{info:hdl:2013/256822, title = {Socio-cognitively inspired ant colony optimization}, author = {Aleksander Byrski and Ewelina Świderska and Jakub Łasisz and Marek Kisiel-Dorohinicki and Tom Lenaerts and Dana Samson and Bipin Indurkhya and Ann Nowe}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/256822/1/Elsevier_240449.pdf}, year = {2017}, date = {2017-01-01}, journal = {Journal of Computational Science}, volume = {21}, pages = {397-406}, abstract = {Recently we proposed an application of ant colony optimization (ACO) to simulate socio-cognitive features of a population, incorporating perspective-taking ability to generate differently acting ant colonies. Although our main goal was simulation, we took advantage of the fact that the quality of the constructed system was evaluated based on selected traveling salesman problem instances, and the resulting computing system became a metaheuristic, which turned out to be a promising method for solving discrete problems. In this paper, we extend the initial sets of populations driven by different perspective-taking inspirations, seeking both optimal configuration for solving a number of TSP benchmarks, at the same time constituting a tool for analyzing socio-cognitive features of the individuals involved. 
The proposed algorithms are compared against classic ACO, and are found to prevail in most of the benchmark functions tested.}, note = {DOI: 10.1016/j.jocs.2016.10.010}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Reggiani, Claudio; Coppens, Sandra; Sekhara, Tayeb; Dimov, Ivan; Pichon, Bruno; Lufin, Nicolas; Addor, Marie Claude; Belligni, Elga Fabia; Digilio, Maria Cristina; Faletra, Flavio; Ferrero, Giovanni Battista; Gerard, Marion; Isidor, Bertrand; Joss, Shelagh; Niel-Bütschi, Florence; Perrone, Maria Dolores; Petit, Florence; Renieri, Alessandra; Romana, Serge; Topa, Alexandra; Vermeesch, Joris Robert; Lenaerts, Tom; Casimir, Georges; Abramowicz, Marc; Bontempi, Gianluca; Vilain, Catheline; Deconinck, Nicolas; Smits, Guillaume Novel promoters and coding first exons in DLG2 linked to developmental disorders and intellectual disability. Journal Article In: Genome medicine, 9 (1), pp. 67, 2017, (DOI: 10.1186/s13073-017-0452-y). @article{info:hdl:2013/258564, title = {Novel promoters and coding first exons in DLG2 linked to developmental disorders and intellectual disability.}, author = {Claudio Reggiani and Sandra Coppens and Tayeb Sekhara and Ivan Dimov and Bruno Pichon and Nicolas Lufin and Marie Claude Addor and Elga Fabia Belligni and Maria Cristina Digilio and Flavio Faletra and Giovanni Battista Ferrero and Marion Gerard and Bertrand Isidor and Shelagh Joss and Florence Niel-Bütschi and Maria Dolores Perrone and Florence Petit and Alessandra Renieri and Serge Romana and Alexandra Topa and Joris Robert Vermeesch and Tom Lenaerts and Georges Casimir and Marc Abramowicz and Gianluca Bontempi and Catheline Vilain and Nicolas Deconinck and Guillaume Smits}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/258564/1/PMC5518101.pdf}, year = {2017}, date = {2017-01-01}, journal = {Genome medicine}, volume = {9}, number = {1}, pages = {67}, abstract = {Tissue-specific integrative omics has the potential to reveal new genic elements important for developmental disorders.}, note = {DOI: 10.1186/s13073-017-0452-y}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Cava, Claudia; Colaprico, Antonio; Bertoli, Gloria; Graudenzi, Alex; Silva, Tiago C; Olsen, Catharina; Noushmehr, Houtan; Bontempi, Gianluca; Mauri, Giancarlo; Castiglioni, Isabella SpidermiR: an R/Bioconductor package for integrative analysis with miRNA data Journal Article In: International journal of molecular sciences, 2017, (DOI: 10.3390/ijms18020274). @article{info:hdl:2013/245105, title = {SpidermiR: an R/Bioconductor package for integrative analysis with miRNA data}, author = {Claudia Cava and Antonio Colaprico and Gloria Bertoli and Alex Graudenzi and Tiago C Silva and Catharina Olsen and Houtan Noushmehr and Gianluca Bontempi and Giancarlo Mauri and Isabella Castiglioni}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/245105}, year = {2017}, date = {2017-01-01}, journal = {International journal of molecular sciences}, abstract = {Gene Regulatory Networks (GRNs) control many biological systems, but how such network coordination is shaped is still unknown. GRNs can be subdivided into basic connections that describe how the network members interact e.g., co-expression, physical interaction, co-localization, genetic influence, pathways, and shared protein domains. The important regulatory mechanisms of these networks involve miRNAs. We developed an R/Bioconductor package, namely SpidermiR, which offers an easy access to both GRNs and miRNAs to the end user, and integrates this information with differentially expressed genes obtained from The Cancer Genome Atlas. Specifically, SpidermiR allows the users to: (i) query and download GRNs and miRNAs from validated and predicted repositories; (ii) integrate miRNAs with GRNs in order to obtain miRNA–gene–gene and miRNA–protein–protein interactions, and to analyze miRNA GRNs in order to identify miRNA–gene communities; and (iii) graphically visualize the results of the analyses. These analyses can be performed through a single interface and without the need for any downloads. 
The full data sets are then rapidly integrated and processed locally.}, note = {DOI: 10.3390/ijms18020274}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Orlando, Gabriele; Raimondi, Daniele; Khanna, T; Lenaerts, Tom; Vranken, Wim F SVM-dependent pairwise HMM: an application to Protein pairwise alignments. Journal Article In: Bioinformatics, 2017, (DOI: 10.1093/bioinformatics/btx391). @article{info:hdl:2013/254251, title = {SVM-dependent pairwise HMM: an application to Protein pairwise alignments.}, author = {Gabriele Orlando and Daniele Raimondi and T Khanna and Tom Lenaerts and Wim F Vranken}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/254251}, year = {2017}, date = {2017-01-01}, journal = {Bioinformatics}, abstract = {Methods able to provide reliable protein alignments are crucial for many bioinformatics applications. In the last years many different algorithms have been developed and various kinds of information, from sequence conservation to secondary structure, have been used to improve the alignment performances. This is especially relevant for proteins with highly divergent sequences. However, recent works suggest that different features may have different importance in diverse protein classes and it would be an advantage to have more customizable approaches, capable to deal with different alignment definitions.}, note = {DOI: 10.1093/bioinformatics/btx391}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Carcillo, Fabrizio; Pozzolo, Andrea Dal; Le Borgne, Yann-Aël; Caelen, Olivier; Mazzer, Yannis; Bontempi, Gianluca SCARFF: a Scalable Framework for Streaming Credit Card Fraud Detection with Spark Journal Article In: Information fusion, 2017, (DOI: 10.1016/j.inffus.2017.09.005). @article{info:hdl:2013/258226, title = {SCARFF: a Scalable Framework for Streaming Credit Card Fraud Detection with Spark}, author = {Fabrizio Carcillo and Andrea Dal Pozzolo and Yann-A{\"e}l Le Borgne and Olivier Caelen and Yannis Mazzer and Gianluca Bontempi}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/258226/4/Elsevier_241853.pdf}, year = {2017}, date = {2017-01-01}, journal = {Information fusion}, abstract = {The expansion of the electronic commerce, together with an increasing confidence of customers in electronic payments, makes of fraud detection a critical factor. Detecting frauds in (nearly) real time setting demands the design and the implementation of scalable learning techniques able to ingest and analyse massive amounts of streaming data. Recent advances in analytics and the availability of open source solutions for Big Data storage and processing open new perspectives to the fraud detection field. In this paper we present a SCAlable Real-time Fraud Finder (SCARFF) which integrates Big Data tools (Kafka, Spark and Cassandra) with a machine learning approach which deals with imbalance, nonstationarity and feedback latency. Experimental results on a massive dataset of real credit card transactions show that this framework is scalable, efficient and accurate over a big stream of transactions.}, note = {DOI: 10.1016/j.inffus.2017.09.005}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Martinez-Vaquero, Luis L A; Han, The Anh T A H; Pereira, Luís Moniz; Lenaerts, Tom When agreement-accepting free-riders are a necessary evil for the evolution of cooperation. Journal Article In: Scientific reports, 7 (1), pp. 2478, 2017, (DOI: 10.1038/s41598-017-02625-z). @article{info:hdl:2013/256610, title = {When agreement-accepting free-riders are a necessary evil for the evolution of cooperation.}, author = {Luis L A Martinez-Vaquero and The Anh T A H Han and Luís Moniz Pereira and Tom Lenaerts}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/256610/1/PMC5449399.pdf}, year = {2017}, date = {2017-01-01}, journal = {Scientific reports}, volume = {7}, number = {1}, pages = {2478}, abstract = {Agreements and commitments have provided a novel mechanism to promote cooperation in social dilemmas in both one-shot and repeated games. Individuals requesting others to commit to cooperate (proposers) incur a cost, while their co-players are not necessarily required to pay any, allowing them to free-ride on the proposal investment cost (acceptors). Although there is a clear complementarity in these behaviours, no dynamic evidence is currently available that proves that they coexist in different forms of commitment creation. Using a stochastic evolutionary model allowing for mixed population states, we identify non-trivial roles of acceptors as well as the importance of intention recognition in commitments. In the one-shot prisoner's dilemma, alliances between proposers and acceptors are necessary to isolate defectors when proposers do not know the acceptance intentions of the others. However, when the intentions are clear beforehand, the proposers can emerge by themselves. In repeated games with noise, the incapacity of proposers and acceptors to set up alliances makes the emergence of the first harder whenever the latter are present. 
As a result, acceptors will exploit proposers and take over the population when an apology-forgiveness mechanism with too low apology cost is introduced, and hence reduce the overall cooperation level.}, note = {DOI: 10.1038/s41598-017-02625-z}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Pereira, Luís Moniz; Lenaerts, Tom; Martinez-Vaquero, Luis L A; Han, The Anh T A H Social manifestation of guilt leads to stable cooperation in multi-agent systems Journal Article In: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, 3 , pp. 1421-1430, 2017, (Language of publication: en). @article{info:hdl:2013/271395, title = {Social manifestation of guilt leads to stable cooperation in multi-agent systems}, author = {Luís Moniz Pereira and Tom Lenaerts and Luis L A Martinez-Vaquero and The Anh T A H Han}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/271395}, year = {2017}, date = {2017-01-01}, journal = {Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS}, volume = {3}, pages = {1421-1430}, abstract = {Inspired by psychological and evolutionary studies, we present here theoretical models wherein agents have the potential to express guilt with the ambition to study the role of this emotion in the promotion of pro-social behaviour. To achieve this goal, analytical and numerical methods from evolutionary game theory are employed to identify the conditions for which enhanced cooperation emerges within the context of the iterated prisoners dilemma. Guilt is modelled explicitly as two features, i.e. a counter that keeps track of the number of transgressions and a threshold that dictates when alleviation (through for instance apology and self-punishment) is required for an emotional agent. Such an alleviation introduces an effect on the payoff of the agent experiencing guilt. We show that when the system consists of agents that resolve their guilt without considering the co-player's attitude towards guilt alleviation then cooperation does not emerge. In that case those guilt prone agents are easily dominated by agents expressing no guilt or having no incentive to alleviate the guilt they experience. 
When, on the other hand, the guilt prone focal agent requires that guilt only needs to be alleviated when guilt alleviation is also manifested by a defecting co-player, then cooperation may thrive. This observation remains consistent for a generalised model as is discussed in this article. In summary, our analysis provides important insights into the design of multi-agent and cognitive agent systems where the inclusion of guilt modelling can improve agents' cooperative behaviour and overall benefit.}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Brown, David Norman 2017, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/260251, title = {Application of phylogenetic inference methods to quantify intra-tumour heterogeneity and evolution of breast cancers}, author = {David Norman Brown}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/260251/6/ContratDavidBrown.pdf}, year = {2017}, date = {2017-01-01}, abstract = {Cancer related mortality is almost always due to metastatic dissemination of the primary disease. While research into the biological mechanisms that drive the metastatic cascade continues to unravel its molecular underpinnings, progress in our understanding of biological phenomena such as tumour heterogeneity and its relevance to the origins of distant recurrence or the emergence of resistance to therapy has been limited. In parallel to major breakthroughs in the development of high throughput molecular techniques, researchers have begun to utilise next generation sequencing to explore the relationship between primary and matched metastatic tumours in diverse types of neoplasia. Despite small cohort sizes and often, a limited number of matched metastases for each patient, pioneering studies have uncovered hitherto unknown biological processes such as the occurrence of organ specific metastatic lineages, polyclonal seeding and homing of metastatic cells to the primary tumour bed. While yet other studies continue to highlight the potential of genomic analyses, at the time this thesis was started, an in-depth knowledge of disease progression and metastatic dissemination was currently lacking in breast cancers. Herein, we employed phylogenetic inference methods to investigate intra-tumour heterogeneity and evolution of breast cancers. A combination of whole exome sequencing, custom ultra-deep resequencing and copy number profiling were applied to primary tumours and their associated metastases from ten autopsied breast cancer patients. Two modes of metastatic progression were observed. 
In the majority of cases, all distant metastases clustered on a branch separate from their primary lesion. Clonal frequency analysis of somatic mutations showed that the metastases had a monoclonal origin and descended from a common metastatic precursor. Alternatively, the primary tumour was clustered alongside metastases with early branches leading to distant organs. This dichotomy coincided with the clinical history of the patients whereby multiple seeding events from the primary tumour alongside cascading metastasis-to-metastasis disseminations occurred in treatment na\"ive de novo metastatic patients, whereas descent from a common metastatic precursor was observed in patients who underwent primary surgery followed by systemic treatment. The data also showed that a distant metastasis can be horizontally cross-seeded and finally revealed a correlation between the extent of somatic point mutations private to the distant lesions and patient overall survival. In an unrelated dataset of relapsed breast cancer patients with matched primary and distant lesions profiled using whole genome sequencing, the landscape of somatic alterations confirmed the time dependency of copy number aberrations implying that cancer phylogenies can be dated using a molecular clock. The work presented here harnesses the strength of high throughput genomic techniques and state of the art phylogenetic tools to tell the evolutionary history of breast cancers. Our results show that the linear and parallel models of metastatic dissemination which have been held as near doctrines for many years are overstated point of views of cancer progression. Beyond the biological insights, these results suggest that surgical excision of the primary tumour in de novo metastatic breast cancers might reduce dissemination in selected cases hence providing a potential biological rationale for this practice. 
Similarly, there is no strong evidence of benefit in overall survival from surgical resection of oligo-metastases in breast cancer. From our analyses, metastatic lesions constitute an additional source of seeding and heterogeneity in advanced breast cancer. The data presented here is too small to derive practice-changing evidence, but supports the concept that resecting isolated metastases may be of clinical benefit in oligo-metastatic breast cancer patients. In both cases, results from larger prospective studies are warranted.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Jeschke, Jana; Bizet, Martin; Desmedt, Christine; Calonne, Emilie; Dedeurwaerder, Sarah; Garaud, Soizic; Koch, Alexander K; Larsimont, Denis; Salgado, Roberto; Van den Eynden, Gert; Willard-Gallo, Karen; Bontempi, Gianluca; Defrance, Matthieu; Sotiriou, Christos; Fuks, François DNA methylation-based immune response signature improves patient diagnosis in multiple cancers. Journal Article In: The Journal of Clinical Investigation, 127 (8), pp. 3090-3102, 2017, (DOI: 10.1172/JCI91095). @article{info:hdl:2013/265312, title = {DNA methylation-based immune response signature improves patient diagnosis in multiple cancers.}, author = {Jana Jeschke and Martin Bizet and Christine Desmedt and Emilie Calonne and Sarah Dedeurwaerder and Soizic Garaud and Alexander K Koch and Denis Larsimont and Roberto Salgado and Gert Van den Eynden and Karen Willard-Gallo and Gianluca Bontempi and Matthieu Defrance and Christos Sotiriou and Fran{\c{c}}ois Fuks}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/265312/1/PMC5531413.pdf}, year = {2017}, date = {2017-01-01}, journal = {The Journal of Clinical Investigation}, volume = {127}, number = {8}, pages = {3090-3102}, abstract = {The tumor immune response is increasingly associated with better clinical outcomes in breast and other cancers. However, the evaluation of tumor-infiltrating lymphocytes (TILs) relies on histopathological measurements with limited accuracy and reproducibility. Here, we profiled DNA methylation markers to identify a methylation of TIL (MeTIL) signature that recapitulates TIL evaluations and their prognostic value for long-term outcomes in breast cancer (BC).}, note = {DOI: 10.1172/JCI91095}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Reggiani, Claudio; Coppens, Sandra; Sekhara, Tayeb; Dimov, Ivan; Pichon, Bruno; Lufin, Nicolas; Addor, Marie Claude; Belligni, Elga Fabia; Digilio, Maria Cristina; Faletra, Flavio; Ferrero, Giovanni Battista; Gerard, Marion; Isidor, Bertrand; Joss, Shelagh; Niel-Bütschi, Florence; Perrone, Maria Dolores; Petit, Florence; Renieri, Alessandra; Romana, Serge; Topa, Alexandra; Vermeesch, Joris Robert; Lenaerts, Tom; Casimir, Georges; Abramowicz, Marc; Bontempi, Gianluca; Vilain, Catheline; Deconinck, Nicolas; Smits, Guillaume Novel promoters and coding first exons in DLG2 linked to developmental disorders and intellectual disability. Journal Article In: Genome medicine, 9 (1), pp. 67, 2017, (DOI: 10.1186/s13073-017-0452-y). @article{info:hdl:2013/258564b, title = {Novel promoters and coding first exons in DLG2 linked to developmental disorders and intellectual disability.}, author = {Claudio Reggiani and Sandra Coppens and Tayeb Sekhara and Ivan Dimov and Bruno Pichon and Nicolas Lufin and Marie Claude Addor and Elga Fabia Belligni and Maria Cristina Digilio and Flavio Faletra and Giovanni Battista Ferrero and Marion Gerard and Bertrand Isidor and Shelagh Joss and Florence Niel-Bütschi and Maria Dolores Perrone and Florence Petit and Alessandra Renieri and Serge Romana and Alexandra Topa and Joris Robert Vermeesch and Tom Lenaerts and Georges Casimir and Marc Abramowicz and Gianluca Bontempi and Catheline Vilain and Nicolas Deconinck and Guillaume Smits}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/258564/1/PMC5518101.pdf}, year = {2017}, date = {2017-01-01}, journal = {Genome medicine}, volume = {9}, number = {1}, pages = {67}, abstract = {Tissue-specific integrative omics has the potential to reveal new genic elements important for developmental disorders.}, note = {DOI: 10.1186/s13073-017-0452-y}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Pham, Ngoc Cam; Haibe-Kains, Benjamin; Bellot, Pau; Bontempi, Gianluca; Meyer, Patrick E Study of Meta-analysis strategies for network inference using information-theoretic approaches Journal Article In: BioData Mining, 10 (1), 2017, (DOI: 10.1186/s13040-017-0136-6). @article{info:hdl:2013/259607, title = {Study of Meta-analysis strategies for network inference using information-theoretic approaches}, author = {Ngoc Cam Pham and Benjamin Haibe-Kains and Pau Bellot and Gianluca Bontempi and Patrick E Meyer}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/259607}, year = {2017}, date = {2017-01-01}, journal = {BioData Mining}, volume = {10}, number = {1}, abstract = {Background: Reverse engineering of gene regulatory networks (GRNs) from gene expression data is a classical challenge in systems biology. Thanks to high-throughput technologies, a massive amount of gene-expression data has been accumulated in the public repositories. Modelling GRNs from multiple experiments (also called integrative analysis) has, therefore, naturally become a standard procedure in modern computational biology. Indeed, such analysis is usually more robust than the traditional approaches that analyse individual datasets, which suffer from experimental biases and a low number of samples. To date, there are mainly two strategies for the problem of interest: the first one (“data merging”) merges all datasets together and then infers a GRN whereas the other (“networks ensemble”) infers GRNs from every dataset separately and then aggregates them using some ensemble rules (such as ranksum or weightsum). Unfortunately, a thorough comparison of these two approaches is lacking. Results: In this work, we present another meta-analysis approach for inferring GRNs from multiple studies. Our proposed meta-analysis approach, adapted to methods based on pairwise measures such as correlation or mutual information, consists of two steps: aggregating matrices of the pairwise measures from every dataset followed by extracting the network from the meta-matrix. Afterwards, we evaluate the performance of the two commonly used approaches mentioned above and our presented approach with a systematic set of experiments based on in silico benchmarks. Conclusions: We proposed a first systematic evaluation of different strategies for reverse engineering GRNs from multiple datasets. Experimental results strongly suggest that assembling matrices of pairwise dependencies is a better strategy for network inference than the two commonly used ones.}, note = {DOI: 10.1186/s13040-017-0136-6}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Pham, Ngoc Cam; Haibe-Kains, Benjamin; Bellot, Pau; Bontempi, Gianluca; Meyer, Patrick E Study of Meta-analysis Strategies for Network Inference Using Information-Theoretic Approaches Journal Article In: Proceedings - International Workshop on Database and Expert Systems Applications, pp. 76-83, 2017, (DOI: 10.1109/DEXA.2016.030). @article{info:hdl:2013/247710, title = {Study of Meta-analysis Strategies for Network Inference Using Information-Theoretic Approaches}, author = {Ngoc Cam Pham and Benjamin Haibe-Kains and Pau Bellot and Gianluca Bontempi and Patrick E Meyer}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/247710}, year = {2017}, date = {2017-01-01}, journal = {Proceedings - International Workshop on Database and Expert Systems Applications}, pages = {76-83}, abstract = {Reverse engineering of gene regulatory networks (GRNs) from gene expression data is a classical challenge in systems biology. Thanks to high-throughput technologies, a massive amount of gene-expression data has been accumulated in the public repositories. Modelling GRNs from multiple experiments (also called integrative analysis) has, therefore, naturally become a standard procedure in modern computational biology. Indeed, such analysis is usually more robust than the traditional approaches focused on individual datasets, which typically suffer from some experimental bias and a small number of samples. To date, there are mainly two strategies for the problem of interest: the first one ('data merging') merges all datasets together and then infers a GRN whereas the other ('networks ensemble') infers GRNs from every dataset separately and then aggregates them using some ensemble rules (such as ranksum or weightsum). Unfortunately, a thorough comparison of these two approaches is lacking. In this paper, we evaluate the performances of various meta-analysis approaches mentioned above with a systematic set of experiments based on in silico benchmarks. Furthermore, we present a new meta-analysis approach for inferring GRNs from multiple studies. Our proposed approach, adapted to methods based on pairwise measures such as correlation or mutual information, consists of two steps: aggregating matrices of the pairwise measures from every dataset followed by extracting the network from the meta-matrix.}, note = {DOI: 10.1109/DEXA.2016.030}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Pozzolo, Andrea Dal; Boracchi, Giacomo; Caelen, Olivier; Alippi, Cesare; Bontempi, Gianluca Credit Card Fraud Detection: A Realistic Modeling and a Novel Learning Strategy Journal Article In: IEEE Transactions on Neural Networks and Learning Systems, 99 , 2017, (DOI: 10.1109/TNNLS.2017.2736643). @article{info:hdl:2013/258224, title = {Credit Card Fraud Detection: A Realistic Modeling and a Novel Learning Strategy}, author = {Andrea Dal Pozzolo and Giacomo Boracchi and Olivier Caelen and Cesare Alippi and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/258224}, year = {2017}, date = {2017-01-01}, journal = {IEEE Transactions on Neural Networks and Learning Systems}, volume = {99}, abstract = {Detecting frauds in credit card transactions is perhaps one of the best testbeds for computational intelligence algorithms. In fact, this problem involves a number of relevant challenges, namely: concept drift (customers' habits evolve and fraudsters change their strategies over time), class imbalance (genuine transactions far outnumber frauds), and verification latency (only a small set of transactions are timely checked by investigators). However, the vast majority of learning algorithms that have been proposed for fraud detection rely on assumptions that hardly hold in a real-world fraud-detection system (FDS). This lack of realism concerns two main aspects: 1) the way and timing with which supervised information is provided and 2) the measures used to assess fraud-detection performance. This paper has three major contributions. First, we propose, with the help of our industrial partner, a formalization of the fraud-detection problem that realistically describes the operating conditions of FDSs that analyze massive streams of credit card transactions every day. We also illustrate the most appropriate performance measures to be used for fraud-detection purposes. Second, we design and assess a novel learning strategy that effectively addresses class imbalance, concept drift, and verification latency. Third, in our experiments, we demonstrate the impact of class imbalance and concept drift in a real-world data stream containing more than 75 million transactions, authorized over a time window of three years.}, note = {DOI: 10.1109/TNNLS.2017.2736643}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Xu, Taosheng; Le, Thuc Duy; Liu, Lin; Su, Ning; Wang, Rujing; Sun, Bingyu; Colaprico, Antonio; Bontempi, Gianluca; Li, Jiuyong CancerSubtypes: An R/Bioconductor package for molecular cancer subtype identification, validation and visualization Journal Article In: Bioinformatics, 33 (19), pp. 3131-3133, 2017, (DOI: 10.1093/bioinformatics/btx378). @article{info:hdl:2013/260704, title = {CancerSubtypes: An R/Bioconductor package for molecular cancer subtype identification, validation and visualization}, author = {Taosheng Xu and Thuc Duy Le and Lin Liu and Ning Su and Rujing Wang and Bingyu Sun and Antonio Colaprico and Gianluca Bontempi and Jiuyong Li}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/260704}, year = {2017}, date = {2017-01-01}, journal = {Bioinformatics}, volume = {33}, number = {19}, pages = {3131-3133}, abstract = {Summary: Identifying molecular cancer subtypes from multi-omics data is an important step in personalized medicine. We introduce CancerSubtypes, an R package for identifying cancer subtypes using multi-omics data, including gene expression, miRNA expression and DNA methylation data. CancerSubtypes integrates four main computational methods which are highly cited for cancer subtype identification and provides a standardized framework for data pre-processing, feature selection, and result follow-up analyses, including results computing, biology validation and visualization. The input and output of each step in the framework are packaged in the same data format, making it convenient to compare different methods. The package is useful for inferring cancer subtypes from an input genomic dataset, comparing the predictions from different well-known methods and testing new subtype discovery methods, as shown with different application scenarios in the Supplementary Material.}, note = {DOI: 10.1093/bioinformatics/btx378}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Stefani, Jacopo De; Caelen, Olivier; Hattab, Dalila; Bontempi, Gianluca Machine learning for multi-step ahead forecasting of volatility proxies Journal Article In: CEUR Workshop Proceedings, 1941 , pp. 17-28, 2017, (Language of publication: en). @article{info:hdl:2013/261790, title = {Machine learning for multi-step ahead forecasting of volatility proxies}, author = {Jacopo De Stefani and Olivier Caelen and Dalila Hattab and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/261790}, year = {2017}, date = {2017-01-01}, journal = {CEUR Workshop Proceedings}, volume = {1941}, pages = {17-28}, abstract = {In finance, volatility is defined as a measure of variation of a trading price series over time. As volatility is a latent variable, several measures, named proxies, have been proposed in the literature to represent such a quantity. The purpose of our work is twofold. On one hand, we aim to perform a statistical assessment of the relationships among the most used proxies in the volatility literature. On the other hand, while the majority of the reviewed studies in the literature focus on a univariate time series model (NAR), using a single proxy, we propose here a NARX model, combining two proxies to predict one of them, showing that it is possible to improve the prediction of the future value of some proxies by using the information provided by the others. Our results, employing artificial neural networks (ANN), k-Nearest Neighbours (kNN) and support vector regression (SVR), show that the supplementary information carried by the additional proxy could be used to reduce the forecasting error of the aforementioned methods. We conclude by explaining how we wish to further investigate such relationships.}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Raimondi, Daniele 2017, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/251313b, title = {The effect of genome variation on human proteins: understanding variants and improving their deleteriousness prediction through extensive contextualisation}, author = {Daniele Raimondi}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/251313/5/ContratDiRaimondi.pdf}, year = {2017}, date = {2017-01-01}, abstract = {Rapid technological advances are providing unprecedented insights in the biological sciences, with massive amounts of data generated on genomic and protein sequences. These data continue to grow exponentially, and they are extremely valuable for computational tools where the effect of genomic variants on human health is predicted. State of the art tools in this field give varying results and only tend to agree in the case of single variants that are strongly correlated to disease. The aim of this work is to increase the reliability of these methods, as well as our understanding of the underlying biological mechanisms that lead to disease. We first developed machine learning (ML) based structural bioinformatics predictors that are able to predict molecular features of proteins from the sequence alone. We then used these tools for in silico analysis of the molecular effects of known variants on the affected proteins, and integrated these data with other heterogeneous sources of information, such as the essentiality of a gene, that put the variants into their broader biological context. With this information we created DEOGEN, a novel predictor in this field, which is able to deal with the two most common forms of genomic variation, namely Single Nucleotide Variants (SNVs) and short Insertions and DELetions (INDELs). DEOGEN performs at least on par with other state of the art methods in this field on different datasets. The method was then extended with additional contextual data and is now available as DEOGEN2 via a web server, which visualizes the predicted results for all variants in most human proteins through an interactive interface targeted to both bioinformaticians and clinicians.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Han, The Anh T A H; Pereira, Luís Moniz; Martinez-Vaquero, Luis L A; Lenaerts, Tom Centralized vs. Personalized Commitments and their influence on Cooperation in Group Interactions Inproceedings In: Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI-17), 2017, (Conference: 31st AAAI Conference on Artificial Intelligence (AAAI-17)(4-9 February 2017: San Francisco, USA)). @inproceedings{info:hdl:2013/243939, title = {Centralized vs. Personalized Commitments and their influence on Cooperation in Group Interactions}, author = {The Anh T A H Han and Luís Moniz Pereira and Luis L A Martinez-Vaquero and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243939}, year = {2017}, date = {2017-01-01}, booktitle = {Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI-17)}, note = {Conference: 31st AAAI Conference on Artificial Intelligence (AAAI-17)(4-9 February 2017: San Francisco, USA)}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Fernandez-Domingos, Elias; Burguillo-Rial, Juan C; Lenaerts, Tom Reactive Versus Anticipative Decision Making in a Novel Gift-Giving Game Inproceedings In: Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI-17), 2017, (Conference: 31st AAAI Conference on Artificial Intelligence (AAAI-17)(4-9 February 2017: San Francisco, USA)). @inproceedings{info:hdl:2013/243947, title = {Reactive Versus Anticipative Decision Making in a Novel Gift-Giving Game}, author = {Elias Fernandez-Domingos and Juan C Burguillo-Rial and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243947}, year = {2017}, date = {2017-01-01}, booktitle = {Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI-17)}, note = {Conference: 31st AAAI Conference on Artificial Intelligence (AAAI-17)(4-9 February 2017: San Francisco, USA)}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Fernandez-Domingos, Elias; Burguillo-Rial, Juan C; Nowe, Ann; Lenaerts, Tom Coordinating Human and Agent Behavior in Collective-Risk Scenarios Inproceedings In: Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI-17), 2017, (Conference: 31st AAAI Conference on Artificial Intelligence (AAAI-17)(4-9 February 2017: San Francisco, USA)). @inproceedings{info:hdl:2013/243948, title = {Coordinating Human and Agent Behavior in Collective-Risk Scenarios}, author = {Elias Fernandez-Domingos and Juan C Burguillo-Rial and Ann Nowe and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243948}, year = {2017}, date = {2017-01-01}, booktitle = {Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI-17)}, note = {Conference: 31st AAAI Conference on Artificial Intelligence (AAAI-17)(4-9 February 2017: San Francisco, USA)}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Lenaerts, Tom; Han, The Anh T A H; Pereira, Luis Moniz; Martinez-Vaquero, Luis A When apology is sincere, cooperation evolves, even when mistakes occur frequently Inproceedings In: Proceedings of the AISB Annual Convention, Symposium on Computational Modelling of Emotion: Theory and Applications, pp. 193-195, 2017, (Language of publication: en). @inproceedings{info:hdl:2013/262904, title = {When apology is sincere, cooperation evolves, even when mistakes occur frequently}, author = {Tom Lenaerts and The Anh T A H Han and Luis Moniz Pereira and Luis A Martinez-Vaquero}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/262904}, year = {2017}, date = {2017-01-01}, booktitle = {Proceedings of the AISB Annual Convention, Symposium on Computational Modelling of Emotion: Theory and Applications}, pages = {193-195}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Han, The Anh T A H; Pereira, Luís Moniz; Martinez-Vaquero, Luis A; Lenaerts, Tom Evolution of commitment and level of participation in public goods games. Inproceedings In: Proceedings of the 16th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pp. 1431-1432, 2017, (Language of publication: en). @inproceedings{info:hdl:2013/262902, title = {Evolution of commitment and level of participation in public goods games.}, author = {The Anh T A H Han and Luís Moniz Pereira and Luis A Martinez-Vaquero and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/262902}, year = {2017}, date = {2017-01-01}, booktitle = {Proceedings of the 16th International Conference on Autonomous Agents and Multiagent Systems (AAMAS)}, pages = {1431-1432}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Pereira, Luis Moniz; Lenaerts, Tom; Martinez-Vaquero, Luis A; others, Social manifestation of guilt leads to stable cooperation in multi-agent systems Inproceedings In: Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems(AAMAS), pp. 1422-1430, 2017, (Language of publication: en). @inproceedings{info:hdl:2013/262903, title = {Social manifestation of guilt leads to stable cooperation in multi-agent systems}, author = {Luis Moniz Pereira and Tom Lenaerts and Luis A Martinez-Vaquero and others}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/262903}, year = {2017}, date = {2017-01-01}, booktitle = {Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems(AAMAS)}, pages = {1422-1430}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Pereira, Luis Moniz; Lenaerts, Tom; Han, The Anh T A H; Martinez-Vaquero, Luis A Evolutionary Game Theory Modelling of Guilt Inproceedings In: Proceedings of the AISB Annual Convention, Symposium on Computational Modelling of Emotion: Theory and Applications, pp. 189-192, 2017, (Language of publication: en). @inproceedings{info:hdl:2013/262905, title = {Evolutionary Game Theory Modelling of Guilt}, author = {Luis Moniz Pereira and Tom Lenaerts and The Anh T A H Han and Luis A Martinez-Vaquero}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/262905}, year = {2017}, date = {2017-01-01}, booktitle = {Proceedings of the AISB Annual Convention, Symposium on Computational Modelling of Emotion: Theory and Applications}, pages = {189-192}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Mon Père, Nathaniel; Lenaerts, Tom; Pacheco, Jorge M J M; Dingli, David Evolutionary Dynamics of Paroxysmal Nocturnal Hemoglobinuria Miscellaneous 2017, (Conference: Benelux Bioinformatics Conference 2017). @misc{info:hdl:2013/267368, title = {Evolutionary Dynamics of Paroxysmal Nocturnal Hemoglobinuria}, author = {Nathaniel Mon P{\`e}re and Tom Lenaerts and Jorge M J M Pacheco and David Dingli}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/267368/3/NMonPere_PNHneutraldriftPoster_print.pdf}, year = {2017}, date = {2017-01-01}, note = {Conference: Benelux Bioinformatics Conference 2017}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Grujić, Jelena; Lenaerts, Tom Looking for the strategies in the repeated prisoner's dilemma when the cooperation is established Miscellaneous 2017, (Conference: 3rd International Conference of Computational Social Science(07/2017: Cologne, Germany)). @misc{info:hdl:2013/263490, title = {Looking for the strategies in the repeated prisoner's dilemma when the cooperation is established}, author = {Jelena Grujić and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/263490}, year = {2017}, date = {2017-01-01}, note = {Conference: 3rd International Conference of Computational Social Science(07/2017: Cologne, Germany)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Gazzo, Andrea; Raimondi, Daniele; Daneels, Dorien; Moreau, Yves; Smits, Guillaume; Dooren, Sonia Van; Lenaerts, Tom Understanding mutational effects in digenic diseases Miscellaneous 2017, (Conference: (21-25 July 2017: Prague, Czech Republic)). @misc{info:hdl:2013/263493, title = {Understanding mutational effects in digenic diseases}, author = {Andrea Gazzo and Daniele Raimondi and Dorien Daneels and Yves Moreau and Guillaume Smits and Sonia Van Dooren and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/263493}, year = {2017}, date = {2017-01-01}, note = {Conference: (21-25 July 2017: Prague, Czech Republic)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Raimondi, Daniele; Tanyalçin, Ibrahim; Ferte, Julien; Gazzo, Andrea; Orlando, Gabriele; Lenaerts, Tom; Rooman, Marianne; Vranken, Wim F DEOGEN2: prediction and interactive visualisation of Single Amino Acid Variant deleteriousness in human proteins Miscellaneous 2017, (Conference: (21-25 July 2017: Prague, Czech Republic)). @misc{info:hdl:2013/263494, title = {DEOGEN2: prediction and interactive visualisation of Single Amino Acid Variant deleteriousness in human proteins}, author = {Daniele Raimondi and Ibrahim Tanyal{\c{c}}in and Julien Ferte and Andrea Gazzo and Gabriele Orlando and Tom Lenaerts and Marianne Rooman and Wim F Vranken}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/263494}, year = {2017}, date = {2017-01-01}, note = {Conference: (21-25 July 2017: Prague, Czech Republic)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Grujić, Jelena; Lenaerts, Tom Network influence on promotion of cooperation - Is there imitation? Miscellaneous 2017, (Conference: 8th Conference on Complex Networks.(03/2017: Dubrovnik, Croatia)). @misc{info:hdl:2013/263489, title = {Network influence on promotion of cooperation - Is there imitation?}, author = {Jelena Grujić and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/263489}, year = {2017}, date = {2017-01-01}, note = {Conference: 8th Conference on Complex Networks.(03/2017: Dubrovnik, Croatia)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Smits, Guillaume; Gazzo, Andrea; Daneels, Dorien; Raimondi, Daniele; Papadimitriou, Sofia; Moreau, Yves; Dooren, Sonia Van; Lenaerts, Tom Understanding combinatorial effects of variants using machine learning and DIDA, the DIgenic diseases Database Miscellaneous 2017, (Conference: Genomics on Rare Disease(Wellcome Genome Campus, Hinxton, Cambridge, UK)). @misc{info:hdl:2013/263492, title = {Understanding combinatorial effects of variants using machine learning and DIDA, the DIgenic diseases Database}, author = {Guillaume Smits and Andrea Gazzo and Dorien Daneels and Daniele Raimondi and Sofia Papadimitriou and Yves Moreau and Sonia Van Dooren and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/263492}, year = {2017}, date = {2017-01-01}, note = {Conference: Genomics on Rare Disease(Wellcome Genome Campus, Hinxton, Cambridge, UK)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Fernandez-Domingos, Elias; Burguillo-Rial, Juan C; Lenaerts, Tom Reactive Versus Anticipative Decision Making in a Novel Gift-Giving Game Miscellaneous 2017, (Conference: 29th Benelux conference on Artificial Intelligence(28-29 nov 2017: Groningen)). @misc{info:hdl:2013/263491, title = {Reactive Versus Anticipative Decision Making in a Novel Gift-Giving Game}, author = {Elias Fernandez-Domingos and Juan C Burguillo-Rial and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/263491}, year = {2017}, date = {2017-01-01}, note = {Conference: 29th Benelux conference on Artificial Intelligence(28-29 nov 2017: Groningen)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Raimondi, Daniele The effect of genome variation on human proteins: understanding variants and improving their deleteriousness prediction through extensive contextualisation 2017, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/251313, title = {The effect of genome variation on human proteins: understanding variants and improving their deleteriousness prediction through extensive contextualisation}, author = {Daniele Raimondi}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/251313/5/ContratDiRaimondi.pdf}, year = {2017}, date = {2017-01-01}, abstract = {Rapid technological advances are providing unprecedented insights in the biological sciences, with massive amounts of data generated on genomic and protein sequences. These data continue to grow exponentially, and they are extremely valuable for computational tools where the effect of genomic variants on human health is predicted. State-of-the-art tools in this field give varying results and only tend to agree in the case of single variants that are strongly correlated to disease. The aim of this work is to increase the reliability of these methods, as well as our understanding of the underlying biological mechanisms that lead to disease. We first developed machine learning (ML) based structural bioinformatics predictors that are able to predict molecular features of proteins from the sequence alone. We then used these tools for in silico analysis of the molecular effects of known variants on the affected proteins, and integrated these data with other heterogeneous sources of information, such as the essentiality of a gene, that put the variants into their broader biological context. With this information we created DEOGEN, a novel predictor in this field, which is able to deal with the two most common forms of genomic variation, namely Single Nucleotide Variants (SNVs) and short Insertions and DELetions (INDELs). DEOGEN performs at least on par with other state-of-the-art methods in this field on different datasets. The method was then extended with additional contextual data and is now available as DEOGEN2 via a web server, which visualizes the predicted results for all variants in most human proteins through an interactive interface targeted to both bioinformaticians and clinicians.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
2016 |
Huculeci, Radu Ion; Cilia, Elisa; Lyczek, Agatha; Buts, Lieven; Houben, Klaartje; Seeliger, Markus MA; van Nuland, Nico A J; Lenaerts, Tom Dynamically Coupled Residues within the SH2 Domain of FYN Are Key to Unlocking Its Activity. Journal Article In: Structure, 24 (11), pp. 1947-1959, 2016, (DOI: 10.1016/j.str.2016.08.016). @article{info:hdl:2013/239815, title = {Dynamically Coupled Residues within the SH2 Domain of FYN Are Key to Unlocking Its Activity.}, author = {Radu Ion Huculeci and Elisa Cilia and Agatha Lyczek and Lieven Buts and Klaartje Houben and Markus MA Seeliger and Nico A J van Nuland and Tom Lenaerts}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/239815/1/Elsevier_223442.pdf}, year = {2016}, date = {2016-01-01}, journal = {Structure}, volume = {24}, number = {11}, pages = {1947-1959}, abstract = {Src kinase activity is controlled by various mechanisms involving a coordinated movement of kinase and regulatory domains. Notwithstanding the extensive knowledge related to the backbone dynamics, little is known about the more subtle side-chain dynamics within the regulatory domains and their role in the activation process. Here, we show through experimental methyl dynamic results and predicted changes in side-chain conformational couplings that the SH2 structure of Fyn contains a dynamic network capable of propagating binding information. We reveal that binding the phosphorylated tail of Fyn perturbs a residue cluster near the linker connecting the SH2 and SH3 domains of Fyn, which is known to be relevant in the regulation of the activity of Fyn. Biochemical perturbation experiments validate that those residues are essential for inhibition of Fyn, leading to a gain of function upon mutation. 
These findings reveal how side-chain dynamics may facilitate the allosteric regulation of the different members of the Src kinase family.}, note = {DOI: 10.1016/j.str.2016.08.016}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Han, The Anh T A H; Lenaerts, Tom A synergy of costly punishment and commitment in cooperation dilemmas Journal Article In: Adaptive behavior, 24 (4), pp. 237-248, 2016, (DOI: 10.1177/1059712316653451). @article{info:hdl:2013/236713, title = {A synergy of costly punishment and commitment in cooperation dilemmas}, author = {The Anh T A H Han and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/236713}, year = {2016}, date = {2016-01-01}, journal = {Adaptive behavior}, volume = {24}, number = {4}, pages = {237-248}, abstract = {To ensure cooperation in the Prisoner’s Dilemma, individuals may require prior commitments from others, subject to compensations when agreements to cooperate are violated. Alternatively, individuals may prefer to behave reactively, without arranging prior commitments, by simply punishing those who misbehave. These two mechanisms have been shown to promote the emergence of cooperation, yet are complementary in the way they aim to promote cooperation. Although both mechanisms have their specific limitations, either one of them can overcome the problems of the other. On one hand, costly punishment requires an excessive effect-to-cost ratio to be successful, and this ratio can be significantly reduced by arranging a prior commitment with a more limited compensation. On the other hand, commitment-proposing strategies can be suppressed by free-riding strategies that commit only when someone else is paying the cost to arrange the deal, whom in turn can be dealt with more effectively by reactive punishers. Using methods from Evolutionary Game Theory, we present here an analytical model showing that there is a wide range of settings for which the combined strategy outperforms either strategy by itself, leading to significantly higher levels of cooperation. 
Interestingly, the improvement is most significant when the cost of arranging commitments is sufficiently high and the penalty reaches a certain threshold, thereby overcoming the weaknesses of both mechanisms.}, note = {DOI: 10.1177/1059712316653451}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Tomás, Gil Da Rocha Gene Expression Markers of Proliferation and Differentiation in Cancer & The Extent of Prognostic Signals in the Cancer Transcriptome 2016, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/235915, title = {Gene Expression Markers of Proliferation and Differentiation in Cancer \& The Extent of Prognostic Signals in the Cancer Transcriptome}, author = {Gil Da Rocha Tomás}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/235915/5/ContratGilDaRochaTomas.pdf}, year = {2016}, date = {2016-01-01}, abstract = {Cancer is a group of genetic diseases operationally defined by uncontrolled cell proliferation, involving a failure of the organism's homeostasis. Cancer research aims to provide precise diagnostic tools and tailored treatments for each of these diseases. Microarray technology allows the expression of all transcription products of the human genome to be quantified, and is therefore a tool for better understanding the polygenic nature of cancer. It makes it possible both to discover new classes of cancer and to predict disease outcome from prior expression profiles. In addition, using gene expression signatures as representative markers of certain molecular physiological processes allows microarray data to be used to test biological hypotheses. This dissertation has two objectives: (a) to establish the extent to which gene expression markers of cell differentiation and proliferation can contribute to the classification of cancers; and (b) to assess the extent of prognostic signals in cancer transcriptomes. We developed an objective method for extracting organ-specific differentiation signatures from gene expression data. We then demonstrated that a tissue-specific differentiation gene signature can accurately distinguish between histological subtypes that are difficult to classify in a thyroid model. This demonstrates the potential clinical and diagnostic value of differentiation signatures in oncology. We also show, following a systematic analysis of 114 cohorts of cancer expression profiles covering 19 different cancer types, that a non-negligible fraction of cancer transcriptomes is able to predict the outcome of the respective diseases. This observation is probably linked to an extensive correlation structure among cancer expression profiles, partly explained by technical and biological variables. This evidence calls into question the widespread use of statistical associations between gene expression markers and disease outcomes across patients to infer the involvement of particular biological mechanisms in cancer progression.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Raimondi, Daniele; Gazzo, Andrea; Rooman, Marianne; Lenaerts, Tom; Vranken, Wim In: Bioinformatics, 32 (12), pp. 1797-1804, 2016, (DOI: 10.1093/bioinformatics/btw094). @article{info:hdl:2013/236692, title = {Multilevel biological characterization of exomic variants at the protein level significantly improves the identification of their deleterious effects}, author = {Daniele Raimondi and Andrea Gazzo and Marianne Rooman and Tom Lenaerts and Wim Vranken}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/236692}, year = {2016}, date = {2016-01-01}, journal = {Bioinformatics}, volume = {32}, number = {12}, pages = {1797-1804}, abstract = {Motivation: There are now many predictors capable of identifying the likely phenotypic effects of single nucleotide variants (SNVs) or short in-frame Insertions or Deletions (INDELs) on the increasing amount of genome sequence data. Most of these predictors focus on SNVs and use a combination of features related to sequence conservation, biophysical, and/or structural properties to link the observed variant to either neutral or disease phenotype. Despite notable successes, the mapping between genetic variants and their phenotypic effects is riddled with levels of complexity that are not yet fully understood and that are often not taken into account in the predictions, despite their promise of significantly improving the prediction of deleterious mutants. Results: We present DEOGEN, a novel variant effect predictor that can handle both missense SNVs and in-frame INDELs. By integrating information from different biological scales and mimicking the complex mixture of effects that lead from the variant to the phenotype, we obtain significant improvements in the variant-effect prediction results. Next to the typical variant-oriented features based on the evolutionary conservation of the mutated positions, we added a collection of protein-oriented features that are based on functional aspects of the gene affected. 
We cross-validated DEOGEN on 36 825 polymorphisms, 20 821 deleterious SNVs, and 1038 INDELs from SwissProt. The multilevel contextualization of each (variant, protein) pair in DEOGEN provides a 10% improvement of MCC with respect to current state-of-the-art tools.}, note = {DOI: 10.1093/bioinformatics/btw094}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Ruano, Ana Zafra; Cilia, Elisa; Couceiro, José JR; Sanz, Javier Ruiz; Schymkowitz, Joost J; Rousseau, Frédéric; Luque, Irene; Lenaerts, Tom From Binding-Induced Dynamic Effects in SH3 Structures to Evolutionary Conserved Sectors. Journal Article In: PLoS computational biology, 12 (5), pp. e1004938, 2016, (DOI: 10.1371/journal.pcbi.1004938). @article{info:hdl:2013/232621, title = {From Binding-Induced Dynamic Effects in SH3 Structures to Evolutionary Conserved Sectors.}, author = {Ana Zafra Ruano and Elisa Cilia and José JR Couceiro and Javier Ruiz Sanz and Joost J Schymkowitz and Frédéric Rousseau and Irene Luque and Tom Lenaerts}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/232621/1/PMC4877006.pdf}, year = {2016}, date = {2016-01-01}, journal = {PLoS computational biology}, volume = {12}, number = {5}, pages = {e1004938}, abstract = {Src Homology 3 domains are ubiquitous small interaction modules known to act as docking sites and regulatory elements in a wide range of proteins. Prior experimental NMR work on the SH3 domain of Src showed that ligand binding induces long-range dynamic changes consistent with an induced fit mechanism. The identification of the residues that participate in this mechanism produces a chart that allows for the exploration of the regulatory role of such domains in the activity of the encompassing protein. Here we show that a computational approach focusing on the changes in side chain dynamics through ligand binding identifies equivalent long-range effects in the Src SH3 domain. Mutation of a subset of the predicted residues elicits long-range effects on the binding energetics, emphasizing the relevance of these positions in the definition of intramolecular cooperative networks of signal transduction in this domain. We find further support for this mechanism through the analysis of seven other publically available SH3 domain structures of which the sequences represent diverse SH3 classes. 
By comparing the eight predictions, we find that, in addition to a dynamic pathway that is relatively conserved throughout all SH3 domains, there are dynamic aspects specific to each domain and homologous subgroups. Our work shows for the first time from a structural perspective, which transduction mechanisms are common between a subset of closely related and distal SH3 domains, while at the same time highlighting the differences in signal transduction that make each family member unique. These results resolve the missing link between structural predictions of dynamic changes and the domain sectors recently identified for SH3 domains through sequence analysis.}, note = {DOI: 10.1371/journal.pcbi.1004938}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Świderska, Ewelina; Łasisz, Jakub; Byrski, Aleksander; Lenaerts, Tom; Samson, Dana; Indurkhya, Bipin; Nowe, Ann; Kisiel-Dorohinicki, Marek Measuring diversity of socio-cognitively inspired ACO search Journal Article In: Lecture notes in computer science, 9597 , pp. 393-408, 2016, (DOI: 10.1007/978-3-319-31204-0_26). @article{info:hdl:2013/231238, title = {Measuring diversity of socio-cognitively inspired ACO search}, author = {Ewelina Świderska and Jakub Łasisz and Aleksander Byrski and Tom Lenaerts and Dana Samson and Bipin Indurkhya and Ann Nowe and Marek Kisiel-Dorohinicki}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/231238}, year = {2016}, date = {2016-01-01}, journal = {Lecture notes in computer science}, volume = {9597}, pages = {393-408}, abstract = {In our recent research, we implemented an enhancement of Ant Colony Optimization incorporating the socio-cognitive dimension of perspective taking. Our initial results suggested that increasing the diversity of the ant population (by introducing different pheromones, different species and dedicated inter-species relations) yielded better results. In this paper, we explore the diversity issue by introducing novel diversity measurement strategies for ACO. Based on these strategies we compare both classic ACO and its socio-cognitive variation.}, note = {DOI: 10.1007/978-3-319-31204-0_26}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Bugajski, Iwan; Listkiewicz, Piotr; Byrski, Aleksander; Kisiel-Dorohinicki, Marek; Korczynski, Wojciech; Lenaerts, Tom; Samson, Dana; Indurkhya, Bipin; Nowe, Ann Enhancing particle swarm optimization with socio-cognitive inspirations Journal Article In: Procedia Computer Science, 80 , pp. 804-813, 2016, (DOI: 10.1016/j.procs.2016.05.370). @article{info:hdl:2013/240085, title = {Enhancing particle swarm optimization with socio-cognitive inspirations}, author = {Iwan Bugajski and Piotr Listkiewicz and Aleksander Byrski and Marek Kisiel-Dorohinicki and Wojciech Korczynski and Tom Lenaerts and Dana Samson and Bipin Indurkhya and Ann Nowe}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/240085/1/Elsevier_223712.pdf}, year = {2016}, date = {2016-01-01}, journal = {Procedia Computer Science}, volume = {80}, pages = {804-813}, abstract = {We incorporate socio-cognitively inspired metaheuristics, which we have used successfully in the ACO algorithms in our past research, into the classical particle swarm optimization algorithms. The swarm is divided into species, and the particles are not only inspired by the global and local optima but also share their knowledge of the optima with neighboring agents belonging to other species. Our experimental results, gathered for common benchmark functions in 100 dimensions, show that the metaheuristics are effective and perform better than the classic PSO. We experimented with various proportions of different species in the swarm population to find the best mix of population.}, note = {DOI: 10.1016/j.procs.2016.05.370}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Martinez-Vaquero, Luis L A; Grujić, Jelena; Lenaerts, Tom Equivalence of cooperation indexes: Comment on Universal scaling for the dilemma strength in evolutionary games by Z. Wang et al. Journal Article In: Physics of life reviews, 16 , pp. 196-197, 2016, (Language of publication: en). @article{info:hdl:2013/243586, title = {Equivalence of cooperation indexes: Comment on Universal scaling for the dilemma strength in evolutionary games by Z. Wang et al.}, author = {Luis L A Martinez-Vaquero and Jelena Grujić and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243586}, year = {2016}, date = {2016-01-01}, journal = {Physics of life reviews}, volume = {16}, pages = {196-197}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Han, The Anh T A H; Pereira, Luís Moniz; Lenaerts, Tom Evolution of commitment and level of participation in public goods games Journal Article In: Autonomous agents and multi-agent systems, pp. 24, 2016, (DOI: 10.1007/s10458-016-9338-4). @article{info:hdl:2013/243591, title = {Evolution of commitment and level of participation in public goods games}, author = {The Anh T A H Han and Luís Moniz Pereira and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243591}, year = {2016}, date = {2016-01-01}, journal = {Autonomous agents and multi-agent systems}, pages = {24}, abstract = {Before engaging in a group venture agents may require commitments from other members in the group, and based on the level of acceptance (participation) they can then decide whether it is worthwhile joining the group effort. Here, we show in the context of public goods games and using stochastic evolutionary game theory modelling, which implies imitation and mutation dynamics, that arranging prior commitments while imposing a minimal participation when interacting in groups induces agents to behave cooperatively. Our analytical and numerical results show that if the cost of arranging the commitment is sufficiently small compared to the cost of cooperation, commitment arranging behavior is frequent, leading to a high level of cooperation in the population. Moreover, an optimal participation level emerges depending both on the dilemma at stake and on the cost of arranging the commitment. Namely, the harsher the common good dilemma is, and the costlier it becomes to arrange the commitment, the more participants should explicitly commit to the agreement to ensure the success of the joint venture. 
Furthermore, considering that commitment deals may last for more than one encounter, we show that commitment proposers can be lenient in case of short-term agreements, yet should be strict in case of long-term interactions.}, note = {DOI: 10.1007/s10458-016-9338-4}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Bugajski, Iwan; Byrski, Aleksander; Kisiel-Dorohinicki, Marek; Lenaerts, Tom; Samson, Dana; Indurkhya, Bipin Adaptation of Population Structure in Socio-cognitive Particle Swarm Optimization Journal Article In: Procedia Computer Science, 101 , pp. 177-186, 2016, (DOI: 10.1016/j.procs.2016.11.022). @article{info:hdl:2013/247231, title = {Adaptation of Population Structure in Socio-cognitive Particle Swarm Optimization}, author = {Iwan Bugajski and Aleksander Byrski and Marek Kisiel-Dorohinicki and Tom Lenaerts and Dana Samson and Bipin Indurkhya}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/247231/1/Elsevier_230858.pdf}, year = {2016}, date = {2016-01-01}, journal = {Procedia Computer Science}, volume = {101}, pages = {177-186}, abstract = {In the paper a modification of a Socio-cognitive Particle Swarm Optimization algorithm, recently proposed by the authors, is presented. This modification consists in devising a mechanism for dynamic adaptation of the population structure of the swarm. Besides the design and rationale for the approach, referring to the state-of-the-art PSO original algorithm and several of its modifications, experimental results tackling popular benchmark functions are presented and discussed in detail.}, note = {DOI: 10.1016/j.procs.2016.11.022}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Cava, Claudia; Colaprico, Antonio; Bertoli, Gloria; Bontempi, Gianluca; Mauri, Giancarlo; Castiglioni, Isabella How interacting pathways are regulated by miRNAs in breast cancer subtypes Journal Article In: BMC bioinformatics, 17 , 2016, (DOI: 10.1186/s12859-016-1196-1). @article{info:hdl:2013/247204, title = {How interacting pathways are regulated by miRNAs in breast cancer subtypes}, author = {Claudia Cava and Antonio Colaprico and Gloria Bertoli and Gianluca Bontempi and Giancarlo Mauri and Isabella Castiglioni}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/247204}, year = {2016}, date = {2016-01-01}, journal = {BMC bioinformatics}, volume = {17}, abstract = {Background: An important challenge in cancer biology is to understand the complex aspects of the disease. It is increasingly evident that genes are not isolated from each other and the comprehension of how different genes are related to each other could explain biological mechanisms causing diseases. Biological pathways are important tools to reveal gene interaction and reduce the large number of genes to be studied by partitioning it into smaller paths. Furthermore, recent scientific evidence has proven that a combination of pathways, instead than a single element of the pathway or a single pathway, could be responsible for pathological changes in a cell. Results: In this paper we develop a new method that can reveal miRNAs able to regulate, in a coordinated way, networks of gene pathways. We applied the method to subtypes of breast cancer. The basic idea is the identification of pathways significantly enriched with differentially expressed genes among the different breast cancer subtypes and normal tissue. Looking at the pairs of pathways that were found to be functionally related, we created a network of dependent pathways and we focused on identifying miRNAs that could act as miRNA drivers in a coordinated regulation process. 
Conclusions: Our approach enables the identification of miRNAs that could have an important role in the development of breast cancer.}, note = {DOI: 10.1186/s12859-016-1196-1}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Grembergen, Olivier Van; Bizet, Martin; de Bony, Eric James; Calonne, Emilie; Putmans, Pascale; Brohée, Sylvain; Olsen, Catharina; Guo, Mingzhou; Bontempi, Gianluca; Sotiriou, Christos; Defrance, Matthieu; Fuks, François Portraying breast cancers with long noncoding RNAs Journal Article In: Science advances, 2 (9), 2016, (DOI: 10.1126/sciadv.1600220). @article{info:hdl:2013/236112, title = {Portraying breast cancers with long noncoding RNAs}, author = {Olivier Van Grembergen and Martin Bizet and Eric James de Bony and Emilie Calonne and Pascale Putmans and Sylvain Brohée and Catharina Olsen and Mingzhou Guo and Gianluca Bontempi and Christos Sotiriou and Matthieu Defrance and François Fuks}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/236112/5/PMC5010371.pdf}, year = {2016}, date = {2016-01-01}, journal = {Science advances}, volume = {2}, number = {9}, abstract = {Evidence is emerging that long noncoding RNAs (lncRNAs) may play a role in cancer development, but this role is not yet clear. We performed a genome-wide transcriptional survey to explore the lncRNA landscape across 995 breast tissue samples. We identified 215 lncRNAs whose genes are aberrantly expressed in breast tumors, as compared to normal samples. Unsupervised hierarchical clustering of breast tumors on the basis of their lncRNAs revealed four breast cancer subgroups that correlate tightly with PAM50-defined mRNA-based subtypes. Using multivariate analysis, we identified no less than 210 lncRNAs prognostic of clinical outcome. By analyzing the coexpression of lncRNA genes and protein-coding genes, we inferred potential functions of the 215 dysregulated lncRNAs. We then associated subtype-specific lncRNAs with key molecular processes involved in cancer. 
A correlation was observed, on the one hand, between luminal A–specific lncRNAs and the activation of phosphatidylinositol 3-kinase, fibroblast growth factor, and transforming growth factor–β pathways and, on the other hand, between basal-like–specific lncRNAs and the activation of epidermal growth factor receptor (EGFR)–dependent pathways and of the epithelial-to-mesenchymal transition. Finally, we showed that a specific lncRNA, which we called CYTOR, plays a role in breast cancer. We confirmed its predicted functions, showing that it regulates genes involved in the EGFR/mammalian target of rapamycin pathway and is required for cell proliferation, cell migration, and cytoskeleton organization. Overall, our work provides the most comprehensive analyses for lncRNA in breast cancers. Our findings suggest a wide range of biological functions associated with lncRNAs in breast cancer and provide a foundation for functional investigations that could lead to new therapeutic approaches.}, note = {DOI: 10.1126/sciadv.1600220}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Bontempi, Gianluca A blocking strategy for ranking features according to probabilistic relevance Journal Article In: Lecture notes in computer science, 10122 LNCS , pp. 59-69, 2016, (DOI: 10.1007/978-3-319-51469-7_5). @article{info:hdl:2013/247751, title = {A blocking strategy for ranking features according to probabilistic relevance}, author = {Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/247751}, year = {2016}, date = {2016-01-01}, journal = {Lecture notes in computer science}, volume = {10122 LNCS}, pages = {59-69}, abstract = {The paper presents an algorithm to rank features in “small number of samples, large dimensionality” problems according to probabilistic feature relevance, a novel definition of feature relevance. Probabilistic feature relevance, intended as expected weak relevance, is introduced in order to address the problem of estimating conventional feature relevance in data settings where the number of samples is much smaller than the number of features. The resulting ranking algorithm relies on a blocking approach for estimation and consists in creating a large number of identical configurations to measure the conditional information of each feature in a paired manner. Its implementation can be made embarrassingly parallel in the case of very large n. A number of experiments on simulated and real data confirms the interest of the approach.}, note = {DOI: 10.1007/978-3-319-51469-7_5}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Silva, Tiago T C; Colaprico, Antonio; Olsen, Catharina; D'Angelo, Fulvio; Bontempi, Gianluca; Ceccarelli, Michele; Noushmehr, Houtan TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages [version 1; referees: 1 approved, 1 approved with reservations] Journal Article In: F1000Research, 5 , 2016, (DOI: 10.12688/F1000RESEARCH.8923.1). @article{info:hdl:2013/247761, title = {TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages [version 1; referees: 1 approved, 1 approved with reservations]}, author = {Tiago T C Silva and Antonio Colaprico and Catharina Olsen and Fulvio D'Angelo and Gianluca Bontempi and Michele Ceccarelli and Houtan Noushmehr}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/247761}, year = {2016}, date = {2016-01-01}, journal = {F1000Research}, volume = {5}, abstract = {Biotechnological advances in sequencing have led to an explosion of publicly available data via large international consortia such as The Cancer Genome Atlas (TCGA), The Encyclopedia of DNA Elements (ENCODE), and The NIH Roadmap Epigenomics Mapping Consortium (Roadmap). These projects have provided unprecedented opportunities to interrogate the epigenome of cultured cancer cell lines as well as normal and tumor tissues with high genomic resolution. The Bioconductor project offers more than 1,000 open-source software and statistical packages to analyze high-throughput genomic data. However, most packages are designed for specific data types (e.g. expression, epigenetics, genomics) and there is no comprehensive tool that provides a complete integrative analysis harnessing the resources and data provided by all three public projects. A need to create an integration of these different analyses was recently proposed. In this workflow, we provide a series of biologically focused integrative downstream analyses of different molecular data. 
We describe how to download, process and prepare TCGA data. By harnessing several key Bioconductor packages, we describe how to extract biologically meaningful genomic and epigenomic data, and by using Roadmap and ENCODE data, we provide a workplan to identify candidate biologically relevant functional epigenomic elements associated with cancer. To illustrate our workflow, we analyzed two types of brain tumors: low-grade glioma (LGG) versus high-grade glioma (glioblastoma multiforme or GBM). This workflow introduces the following Bioconductor packages: AnnotationHub, ChIPSeeker, ComplexHeatmap, pathview, ELMER, GAIA, MINET, RTCGAtoolbox, TCGAbiolinks.}, note = {DOI: 10.12688/F1000RESEARCH.8923.1}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Le Borgne, Yann-Aël; Homolova, Adriana; Bontempi, Gianluca OpenTED browser: Insights into European Public Spendings Journal Article In: CEUR Workshop Proceedings, 1831 , 2016, (Language of publication: en). @article{info:hdl:2013/253331, title = {OpenTED browser: Insights into European Public Spendings}, author = {Yann-Aël Le Borgne and Adriana Homolova and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/253331}, year = {2016}, date = {2016-01-01}, journal = {CEUR Workshop Proceedings}, volume = {1831}, abstract = {We present the OpenTED browser, a Web application for interactively browsing public spending data related to public procurements in the European Union. The application relies on Open Data recently published by the European Commission and the Publications Office of the European Union, from which we imported a curated dataset of 4.2 million contract award notices spanning the period 2006-2015. The application is designed to easily filter notices and visualise relationships between public contracting authorities and private contractors. The simple design makes it possible, for example, to quickly find out who the biggest suppliers of local governments are, and the nature of the contracted goods and services. We believe the tool, which we make Open Source, is a valuable source of information for journalists, NGOs, analysts and citizens for getting information on public procurement data, from large scale trends to local municipal developments.}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Dendievel, Rémi Sequential stopping under different environments of weak information PhD Thesis 2016, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/239624, title = {Sequential stopping under different environments of weak information}, author = {Rémi Dendievel}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/239624/5/contratDendievel.pdf}, year = {2016}, date = {2016-01-01}, abstract = {This thesis centres on the optimal use of the information contained in a flexible probabilistic model. In the first chapter, we cover well-known results on martingales, such as the L1 martingale convergence theorem and the optional stopping theorem. We discuss open problems similar to the “last arrival problem” (Bruss and Yor, 2012), which are real challenges from a theoretical point of view and for which we can only conjecture the optimal strategy. In the following chapters, we solve extensions of optimal stopping problems proposed by R. R. Weber (U. Cambridge), based on the “odds theorem” (Bruss, 2000). In short, the task is to perform a single action (a single stop) when two sequences of independent observations are observed simultaneously. We give the solution to these problems for a chosen (fixed) number of processes. The next chapter reviews most of the recent developments (since 2000) around the “odds theorem” (Bruss, 2000). The material presented was published in 2013 and has been updated in this thesis to include the latest results since that date. We then devote a chapter to an explicit solution for a special case of Robbins' optimal stopping problem. This chapter is based on an article published by the author in collaboration with Professor Swan (Université de Liège). It offers a good illustration of the difficulties encountered when the model contains too much information about the variables. The optimal solution of this problem in the general case is not known. By contrast, and counter-intuitively, in the “last arrival problem” mentioned above, less information does make it possible, as we show, to find the optimal solution. The thesis contains a final chapter on a problem of a more combinatorial nature, which can be linked to graph theory to some extent. We study the process of creating a particular random graph and the properties of the cycles it creates. The problem is sequential and gives rise to interesting stopping problems. This study has consequences in graph theory, in combinatorial analysis and, for applications, in combinatorial chemistry. One of our results is analogous to the result of Janson (1987) on the first cycle created during the creation of random graphs.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Zisis, Ioannis The Effect of Group Formation on Behaviour: An Experimental and Evolutionary Analysis PhD Thesis 2016, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/231974b, title = {The Effect of Group Formation on Behaviour: An Experimental and Evolutionary Analysis}, author = {Ioannis Zisis}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/231974/5/contratZisis.pdf}, year = {2016}, date = {2016-01-01}, abstract = {The division of resources among a group of people may cause conflicts: individuals with varying roles and responsibilities will claim different shares of the surplus to be divided. In this dissertation, we analyze how the decision to form a group influences the bargaining behaviour of the members of that group. People act collectively because certain tasks may require the participation of a specific number of individuals before they can be completed. We examine whether certain mechanisms can efficiently promote group formation for the sake of surplus production, and what effect these mechanisms have on the behaviour of the group members. To this end, we constructed a novel surplus production and distribution interaction, which we call the Anticipation Game (AG). The AG can be played between only two players (pairwise interaction) or among more than two players (group interaction). We analyze both the pairwise AG and the group version of the AG, first by obtaining our own empirical data and then by performing a stochastic evolutionary analysis. We aim to answer the following questions: i) how does a reputation-based partner approval mechanism influence the surplus distribution in both the pairwise and the group AG; ii) do limitations in obtaining the reputation of a potential partner alter the results of the pairwise AG; iii) is there any effect on the behaviour of players when they can repeatedly cooperate with the same partners in group interactions; and iv) how may natural selection have shaped the behaviour of players in group formation interactions (both pairwise and group AG evolutionary analysis).}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Lenaerts, Tom Conditions for the evolution of apology and forgiveness in populations of autonomous agents Inproceedings In: The 2016 AAAI Spring Symposium Series: Technical Reports, 2016, (Conference: AAAI Spring Symposium on “Ethical and Moral Considerations in Non-Human Agents”.(Stanford, USA)). @inproceedings{info:hdl:2013/243723, title = {Conditions for the evolution of apology and forgiveness in populations of autonomous agents}, author = {Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243723}, year = {2016}, date = {2016-01-01}, booktitle = {The 2016 AAAI Spring Symposium Series: Technical Reports}, note = {Conference: AAAI Spring Symposium on “Ethical and Moral Considerations in Non-Human Agents”.(Stanford, USA)}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Pereira, Luís Moniz; Han, The Anh T A H; Martinez-Vaquero, Luis L A; Lenaerts, Tom Guilt for non-humans Inproceedings In: The 2016 AAAI Spring Symposium Series: Technical reports, 2016, (Conference: The 2016 AAAI Spring Symposium on “Ethical and Moral Considerations in Non-Human Agents”(21-23 March 2016: Stanford, USA)). @inproceedings{info:hdl:2013/243750, title = {Guilt for non-humans}, author = {Luís Moniz Pereira and The Anh T A H Han and Luis L A Martinez-Vaquero and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243750}, year = {2016}, date = {2016-01-01}, booktitle = {The 2016 AAAI Spring Symposium Series: Technical reports}, note = {Conference: The 2016 AAAI Spring Symposium on “Ethical and Moral Considerations in Non-Human Agents”(21-23 March 2016: Stanford, USA)}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Han, The Anh T A H; Pereira, Luís Moniz; Lenaerts, Tom Emergence of Cooperation in Group Interactions: Avoidance versus Restriction Inproceedings In: The 2016 AAAI Spring Symposium Series: Technical Reports, 2016, (Conference: AAAI Spring Symposium on “Ethical and Moral Considerations in Non-Human Agents”(21-23 March 2016: Stanford, USA)). @inproceedings{info:hdl:2013/243792, title = {Emergence of Cooperation in Group Interactions: Avoidance versus Restriction}, author = {The Anh T A H Han and Luís Moniz Pereira and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243792}, year = {2016}, date = {2016-01-01}, booktitle = {The 2016 AAAI Spring Symposium Series: Technical Reports}, note = {Conference: AAAI Spring Symposium on “Ethical and Moral Considerations in Non-Human Agents”(21-23 March 2016: Stanford, USA)}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Guida, Sibilla Di; Han, The Anh T A H; Kirchsteiger, Georg; Lenaerts, Tom; Zisis, Ioannis Endogenous Repeated Cooperation and Surplus Distribution - An Experimental Analysis Technical Report 2016, (Language of publication: en). @techreport{info:hdl:2013/228058, title = {Endogenous Repeated Cooperation and Surplus Distribution - An Experimental Analysis}, author = {Sibilla Di Guida and The Anh T A H Han and Georg Kirchsteiger and Tom Lenaerts and Ioannis Zisis}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/228058/3/2016-08-DIGUIDA_HAN_KIRCHSTEIGER_LENAERTS_ZISIS-endog.pdf}, year = {2016}, date = {2016-01-01}, abstract = {This paper investigates experimentally how endogenous group formation combined with the possibility of repeated interaction impacts cooperation levels and surplus distribution. We developed a Surplus Production Distribution Game in which the cooperation of four agents is needed to produce a surplus. In case of cooperation, two of the four subjects, the distributors, decided how much of the surplus each of them wanted to give to the two other agents, the receivers. This game was played repeatedly with different matching procedures. In the Re-match Treatment (RT) the subjects were randomly re-matched every round, while in the Endogenous-match Treatment (ET) a group was maintained as long as its members cooperated. There was also a Base Treatment (BT) in which cooperation was exogenously enforced. We found that the distributors' contributions were higher in the ET and the RT than in the BT. Unsurprisingly, the receivers' ability to refuse cooperation led to more equal surplus distributions. But contrary to commonly held beliefs, the possibility of repeated interaction did not lead to higher cooperation levels and more equal allocations of the surplus. Instead, endogenous group formation combined with the possibility of repeated interaction led to self-selection of the subjects in the ET. The endogenous group duration varied drastically between different groups in the ET, with long-lived groups exhibiting contributions and cooperation levels higher than in the RT, while short-lived groups showed contributions and cooperation levels lower than in the RT. Furthermore, for given contribution levels, receivers were more likely to refuse cooperation when their average relationship length was short. This shows that long-lived groups consisted of generous distributors and not so demanding receivers, while ungenerous distributors and demanding receivers formed short-lived groups. Hence, the possibility of repeated interaction does not necessarily increase cooperation and efficiency levels when combined with endogenous group formation. Rather, such a situation might lead to self-selection of agents.}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {techreport} } |
Lenaerts, Tom; Gazzo, Andrea; Daneels, Dorien; Cilia, Elisa; Bonduelle, Maryse; Abramowicz, Marc; Dooren, Sonia Van; Smits, Guillaume Exploring the variant combinations in the digenic diseases database DIDA Miscellaneous 2016, (Conference: European Conference on Computational Biology(3-7 September 2016: The Hague, the Netherlands)). @misc{info:hdl:2013/243711, title = {Exploring the variant combinations in the digenic diseases database DIDA}, author = {Tom Lenaerts and Andrea Gazzo and Dorien Daneels and Elisa Cilia and Maryse Bonduelle and Marc Abramowicz and Sonia Van Dooren and Guillaume Smits}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243711}, year = {2016}, date = {2016-01-01}, note = {Conference: European Conference on Computational Biology(3-7 September 2016: The Hague, the Netherlands)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Martinez-Vaquero, Luis L A; Han, The Anh T A H; Pereira, Luís Moniz; Lenaerts, Tom Apology and forgiveness evolve to resolve failures in cooperative agreements Miscellaneous 2016, (Conference: Benelux Conference on Artificial Intelligence(10-11 November 2016: Amsterdam, the Netherlands)). @misc{info:hdl:2013/243707, title = {Apology and forgiveness evolve to resolve failures in cooperative agreements}, author = {Luis L A Martinez-Vaquero and The Anh T A H Han and Luís Moniz Pereira and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243707}, year = {2016}, date = {2016-01-01}, note = {Conference: Benelux Conference on Artificial Intelligence(10-11 November 2016: Amsterdam, the Netherlands)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Raimondi, Daniele; Gazzo, Andrea; Rooman, Marianne; Lenaerts, Tom; Vranken, Wim Multilevel biological characterization of exomic variants at the protein level significantly improves the identification of their deleterious effects Miscellaneous 2016, (Conference: European Conference on Computational Biology(3-7 September 2016: The Hague, the Netherlands)). @misc{info:hdl:2013/243712, title = {Multilevel biological characterization of exomic variants at the protein level significantly improves the identification of their deleterious effects}, author = {Daniele Raimondi and Andrea Gazzo and Marianne Rooman and Tom Lenaerts and Wim Vranken}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243712}, year = {2016}, date = {2016-01-01}, note = {Conference: European Conference on Computational Biology(3-7 September 2016: The Hague, the Netherlands)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Gazzo, Andrea; Raimondi, Daniele; Daneels, Dorien; Smits, Guillaume; Dooren, Sonia Van; Lenaerts, Tom Predicting oligogenic effects using digenic disease data Miscellaneous 2016, (Conference: European Conference on Computational Biology(3-7 September 2016: The Hague, the Netherlands)). @misc{info:hdl:2013/243713, title = {Predicting oligogenic effects using digenic disease data}, author = {Andrea Gazzo and Daniele Raimondi and Dorien Daneels and Guillaume Smits and Sonia Van Dooren and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243713}, year = {2016}, date = {2016-01-01}, note = {Conference: European Conference on Computational Biology(3-7 September 2016: The Hague, the Netherlands)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Orlando, Gabriele; Raimondi, Daniele; Lenaerts, Tom; Vranken, Wim F Rigapollo, a HMM-SVM based approach to sequence alignment Miscellaneous 2016, (Conference: European Conference on Computational Biology(3-7 September 2016: The Hague, the Netherlands)). @misc{info:hdl:2013/243714, title = {Rigapollo, a HMM-SVM based approach to sequence alignment}, author = {Gabriele Orlando and Daniele Raimondi and Tom Lenaerts and Wim F Vranken}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243714}, year = {2016}, date = {2016-01-01}, note = {Conference: European Conference on Computational Biology(3-7 September 2016: The Hague, the Netherlands)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Martinez-Vaquero, Luis L A; Han, The Anh T A H; Pereira, Luís Moniz; Lenaerts, Tom Forgiveness evolves to ensure cooperation in long-term agreements Miscellaneous 2016, (Conference: Conference on Complex Systems(19-22 September 2016: Amsterdam, the Netherlands)). @misc{info:hdl:2013/243710, title = {Forgiveness evolves to ensure cooperation in long-term agreements}, author = {Luis L A Martinez-Vaquero and The Anh T A H Han and Luís Moniz Pereira and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243710}, year = {2016}, date = {2016-01-01}, note = {Conference: Conference on Complex Systems(19-22 September 2016: Amsterdam, the Netherlands)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Papadimitriou, Sofia; Gazzo, Andrea; Smits, Guillaume; Nowe, Ann; Lenaerts, Tom Predicting digenic variant effects with DIDA Miscellaneous 2016, (Conference: European Conference on Computational Biology(3-7 September 2016: The Hague, the Netherlands)). @misc{info:hdl:2013/243720, title = {Predicting digenic variant effects with DIDA}, author = {Sofia Papadimitriou and Andrea Gazzo and Guillaume Smits and Ann Nowe and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243720}, year = {2016}, date = {2016-01-01}, note = {Conference: European Conference on Computational Biology(3-7 September 2016: The Hague, the Netherlands)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Wit, Nathalie De; Lenaerts, Tom Cluster analysis of neurodevelopmental diseases with Spark. Informatique Masters Thesis 2016, (Language of publication: fr). @mastersthesis{info:hdl:2013/243961, title = {Cluster analysis of neurodevelopmental diseases with Spark. Informatique}, author = {Nathalie De Wit and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243961}, year = {2016}, date = {2016-01-01}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {mastersthesis} } |
Steckelmacher, Denis; Lenaerts, Tom Reinforcement learning in complex environments: Evaluating algorithms on image classification. Informatique Masters Thesis 2016, (Language of publication: fr). @mastersthesis{info:hdl:2013/243960, title = {Reinforcement learning in complex environments: Evaluating algorithms on image classification. Informatique}, author = {Denis Steckelmacher and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243960}, year = {2016}, date = {2016-01-01}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {mastersthesis} } |
Velde, Thibaut Van De; Lenaerts, Tom Concevoir un modèle de jeu à ressources communes au sein d'un cloud. Informatique Masters Thesis 2016, (Language of publication: fr). @mastersthesis{info:hdl:2013/243959, title = {Concevoir un modèle de jeu à ressources communes au sein d'un cloud. Informatique}, author = {Thibaut Van De Velde and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243959}, year = {2016}, date = {2016-01-01}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {mastersthesis} } |
Tarabichi, Maxime Integrative analyses of genome-wide transcriptomic and genomic thyroid cancer profiles PhD Thesis 2016, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/225138, title = {Integrative analyses of genome-wide transcriptomic and genomic thyroid cancer profiles}, author = {Maxime Tarabichi}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/225138/5/ContratMaximeTarabichi.pdf}, year = {2016}, date = {2016-01-01}, abstract = {This bioinformatics thesis was carried out between 2010 and 2015 in the group of Prof. Vincent Detours at the Institut de Recherche Interdisciplinaire en Biologie Humaine et Moléculaire. We analyzed genomic and transcriptomic data from papillary thyroid carcinomas (PTCs) and their adjacent non-cancerous tissues. The first part studied the transcriptomic differences between post-Chernobyl PTCs and sporadic PTCs, as well as their adjacent non-cancerous tissues. In our cohort, the sporadic cases were on average one year younger, a significant difference. After adjusting the transcriptional data for age, nearly 400 genes were more highly expressed in the adjacent tissues of radiation-exposed patients. However, we could not detect any over-representation of gene sets involved in known biological functions. Sporadic cases could be distinguished from post-Chernobyl cases on the basis of the transcriptomes of their adjacent tissues, with an accuracy of ~70%. This gene overexpression in adjacent non-cancerous tissues could be linked to increased radiosensitivity in the group of patients exposed to Chernobyl radiation. In the second study, we integrated data from the patients of the first part, including the DNA copy numbers of the PTCs, the genotypes of more than 400,000 SNPs in blood, and the transcriptional data of the PTCs and their adjacent non-cancerous tissues. Reproducing the results of a previous study, we found the 7q11.23 region duplicated exclusively in one third of the radiation-exposed patients. In an independent study, another group showed that duplication of this region was more frequent in a population of radiosensitive cell lines than in the normal human population. However, when analyzing the transcriptomes of patients carrying this duplication, we did not detect any difference in the expression of the genes encoded in this genomic region. Moreover, no SNP genotype was significantly associated with radiation exposure. In conclusion, the results confirm that one third of post-Chernobyl PTCs carry traces of radiosensitizing damage in their DNA. In a third study, we examined the transcriptional differences between PTCs and their associated lymph node metastases (LNMs), as well as between PTCs that develop LNMs (N+) and PTCs that do not (N0). Previous studies comparing LNMs with their associated tumors in other organs showed an overexpression in LNMs of genes related to immune cells. This signal comes from the contaminating tissue surrounding the LNMs. To remove this contaminating signal, other studies laser-microdissected the tumor portions of the LNMs. However, microdissection also removes the tumor-associated stroma, which is precisely what is involved in tumor progression. Using an original method, we corrected our LNM expression data for their content of non-cancerous lymph node contaminant. After this correction, the expression of stroma-related genes was higher in the LNMs than in their PTCs. The expression differences between N0 and N+ were not reproducible across 4 independent PTC datasets. This demonstrates the absence of a transcriptional signal linked to nodal status in these data. However, using public data comprising hundreds of tumors, it is possible to predict the nodal status (N0 or N+) of PTCs, as well as of breast and colon cancers, from their transcriptomes. Previous studies reported near-perfect prediction rates (>90%) of nodal status from transcriptomic data. In these studies we identified the same technical gene-selection bias, which may explain these artificially high rates. In our study, this bias was absent and the accuracy of our predictions was limited (<70%), calling into question the clinical value of such predictions. The presence of a signal that predicts nodal status, together with the irreproducibility of this signal across independent datasets, can be explained by the association between nodal status and features of tumor aggressiveness, which could themselves have a reproducible influence on the transcriptomes. In our last study, we analyzed the differences between PTCs related to the presence of BRAFV600E, a mutation shared by 60% of PTCs. Using a public dataset, we showed that PTCs carrying the mutation were more dedifferentiated and more infiltrated by stroma, probably lymphocytes and fibroblasts, and that these PTCs showed more fibrosis and probably proliferated more. All of this suggests that BRAF-mutated PTCs form a more aggressive group of PTCs. Features of aggressiveness could be detected at the invasive front, that is, the periphery of the tumor that defines its contact with the stroma, notably the presence of clusters of cells isolated from the rest of the tumor. In PTCs, these isolated cell islands are observed on 2D histological slides and could be explained either by cell detachment, a sign of aggressiveness linked to the metastatic process, or by a complex conformation compatible with a tumor that is connected in 3D. We analyzed the 3D conformation of the invasive front of one mutated PTC. We reconstructed its 3D volume using an original method. The groups of cancer cells that appeared isolated on the 2D histopathology images were in fact connected in 3D. The hypothesis of cell detachment following epithelial-mesenchymal transition is therefore not required to explain the presence of these cell islands in 2D. The 3D shape of the invasive front implied a contact surface between tumor and stroma far larger than that implied by the ellipsoid shape usually described. Fibroblasts contributed as much to the formation of the tumor mass as cancer cells did, since both groups of cells proliferated at the same rate. In the future, sequencing the genetic material of individual cells will make it easier to interpret genomic and transcriptomic signals, which until now came from whole tissue, i.e. a mixture of tumor, stromal and contaminant cell populations. A radiation signature could be extracted from the mutational profiles of individual cells exposed to radiation and to H2O2 in vitro and compared with the signature of post-Chernobyl PTCs. Individual tumor and stromal cells from LNMs could be compared with individual tumor and stromal cells from PTCs. Likewise, individual cells carrying the BRAFV600E mutation could be compared with non-mutated cells.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } This bioinformatics thesis was carried out between 2010 and 2015 in the group of Prof. Vincent Detours at the Institut de Recherche Interdisciplinaire en Biologie Humaine et Moléculaire. We analyzed genomic and transcriptomic data from papillary thyroid carcinomas (PTCs) and their adjacent non-cancerous tissues. The first part studied the transcriptomic differences between post-Chernobyl PTCs and sporadic PTCs, as well as their adjacent non-cancerous tissues. In our cohort, the sporadic cases were on average one year younger, a significant difference. After adjusting the transcriptional data for age, nearly 400 genes were more highly expressed in the adjacent tissues of radiation-exposed patients. However, we could not detect any over-representation of gene sets involved in known biological functions. Sporadic cases could be distinguished from post-Chernobyl cases on the basis of the transcriptomes of their adjacent tissues, with an accuracy of ~70%. This gene overexpression in adjacent non-cancerous tissues could be linked to increased radiosensitivity in the group of patients exposed to Chernobyl radiation. In the second study, we integrated data from the patients of the first part, including the DNA copy numbers of the PTCs, the genotypes of more than 400,000 SNPs in blood, and the transcriptional data of the PTCs and their adjacent non-cancerous tissues. Reproducing the results of a previous study, we found the 7q11.23 region duplicated exclusively in one third of the radiation-exposed patients. 
Dans une étude indépendante, un autre groupe a montré que la duplication de cette région était plus fréquente dans une population de lignées cellulaires radiosensibles que dans la population humaine normale. Cependant, en analysant les transcriptomes des patients présentant cette duplication, nous n'avons pas détecté de différence d’expression des g`enes codés dans cette région génomique. En outre, aucun génotype de SNP n'était significativement lié `a l'exposition aux radiations. En conclusion, les résultats confirment qu'un tiers des CPTs post-Tchernobyl ont des traces d'un dég^at radio-sensibilsant dans leur ADN. Dans une troisi`eme étude, nous avons étudié les différences transcriptionnelles entre CPTs et leurs métastases ganglionnaires (MGs) associées, ainsi qu'entre des CPTs développant des MGs (N+) et des CPTs ne développant pas de MGs (N0). Des études précédentes comparant les MGs et leurs tumeurs associées impliquant d’autres organes ont montré une surexpression de g`enes dans les MGs, liés aux cellules immunitaires. Ce signal provient du tissu contaminant environnant les MGs. Pour se défaire de ce signal contaminant, d’autres études ont microdisséqué au laser les parties tumorales des MGs. Cependant, la microdissection retire aussi le stroma associé `a la tumeur, alors que celui-ci est justement impliqué dans la progression tumorale. Gr^ace `a une méthode originale, nous avons corrigé nos données d’expression des MGs pour leur contenu en contaminant ganglionnaire non-cancéreux. Apr`es cette correction, l’expression de g`enes liés au stroma était plus élevée dans les MGs que dans leurs CPTs. Les différences d’expression entre N0 et N+ n’étaient pas reproductibles entre 4 jeux de données indépendants de CPTs. Ceci démontre l’absence d’un signal transcriptionnelle lié au statut nodal dans ces données. 
Cependant, en utilisant des données publiques comprenant des centaines de tumeurs, il est possible de prédire le statut nodal (N0 ou N+) des CPTs ainsi que des cancers du sein et du colon `a partir de leurs transcriptomes. Des études précédentes montraient des taux de prédiction presque parfaits (>90%) du statut nodal `a partir des données transcriptomiques. Nous avons décelés dans ces études le m^eme biais technique de sélection des g`enes, qui peut expliquer ces taux artificiellement élevés. Dans notre étude, ce biais n’était pas présent et la précision de nos prédictions était limitée (<70%), questionnant l’intér^et clinique de telles prédictions. La présence d’un signal permettant de prédire le statut nodal et l’irreproductibilité de ce signal dans des jeux de données indépendants peuvent s'expliquer par l’association entre le statut nodal et des caractéristiques d'agressivité des tumeurs, qui pourraient, elles, avoir une influence reproductible sur les transcriptomes. Dans notre derni`ere étude, nous avons analysé les différences entre CPTs, liées `a la présence de BRAFV600E, une mutation commune `a 60% des CPTs. En utilisant un jeu de données public, nous avons montré que les CPTs présentant la mutation étaient plus dédifférenciés, et plus infiltrés en stroma, probablement en lymphocytes et fibroblastes; et que ces CPTs présentaient plus de fibrose et proliféraient sans doute plus. Tout ceci sugg`ere que les CPTs mutés pour BRAF constituent un groupe de CPTs plus agressif. Des caractéristiques d’agressivité pourraient ^etre détectées au front invasif, c’est-`a-dire la périphérie de la tumeur définissant son contact avec le stroma, notamment la présence de regroupement de cellules isolées du reste de la tumeur. 
Dans les CPTs, ces ^ilots cellulaires isolés sont observés sur des lames histologiques 2D et pourraient ^etre expliqués soit par un détachement cellulaire, signe d’agressivité lié au processus métastatique, soit une conformation complexe compatible avec une tumeur connexe en 3D. Dans un CPT, nous avons analysé la conformation 3D du front invasif d'un CPT muté. Nous avons reconstruit son volume 3D gr^ace `a une méthode originale. Les groupes de cellules cancéreuses qui semblaient isolées sur les images 2D d’histopathologie, étaient en fait connectés en 3D. L’hypoth`ese de la présence de détachement cellulaire suite `a la transition épithélio-mésenchymateuse n’est donc pas requise pour expliquer la présence de ces ^ilots cellulaires en 2D. La forme 3D du front invasif impliquait une surface de contact entre tumeur et stroma bien plus importante qu'impliquée par la forme ellipso"ide habituellement décrite. Les fibroblastes participaient autant `a la création de la masse tumorale que les cellules cancéreuses, puisque ces deux groupes de cellules proliféraient `a la m^eme vitesse. A l'avenir, le séquenccage du matériel génétique de cellules individuelles facilitera notre interprétation des signaux génomiques et transcriptomiques, qui jusqu’alors provenaient de tissu complet, i.e. un mélange de populations de cellules tumorales, stromales et de contaminant. Une signature de radiation pourrait ^etre extraite des profils mutationnels de cellules individuelles exposées aux radiations et `a l’H2O2 in vitro et comparée `a la signature des CTPs post-Tchernobyl. Les cellules tumorales et stromales individuelles des MGs pourraient ^etre comparées aux cellules tumorales et stromales invividuelles des CPTs. De m^eme les cellules individuelles mutées pour BRAFV600E pourraient ^etre comparées aux cellules non mutées. |
2015 |
Lopes, Miguel Inference of gene networks from time series expression data and application to type 1 Diabetes PhD Thesis 2015, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/216729, title = {Inference of gene networks from time series expression data and application to type 1 Diabetes}, author = {Miguel Lopes}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/216729/6/contratGasparLopes.pdf}, year = {2015}, date = {2015-01-01}, abstract = {The inference of gene regulatory networks (GRN) is of great importance to medical research, as it unravels the causal mechanisms responsible for phenotypes and identifies potential therapeutic targets. In type 1 diabetes, insulin-producing pancreatic beta cells are the target of an autoimmune attack leading to apoptosis (cell suicide). Although key genes and regulations have been identified, a precise characterization of the process leading to beta-cell apoptosis has not yet been achieved. The inference of relevant molecular pathways in type 1 diabetes is therefore a crucial research topic. GRN inference from gene expression data (obtained from microarray and RNA-seq technology) is a causal inference problem that may be tackled with well-established statistical and machine learning concepts. In particular, the use of time series facilitates the identification of the causal direction in cause-effect gene pairs. However, inference from gene expression data is very challenging because of the large number of genes (over twenty thousand in humans) and the typically low number of samples in gene expression datasets. In this context, it is important to correctly assess the accuracy of network inference methods. This thesis contributes on three distinct aspects. The first is inference assessment using precision-recall curves, in particular the area under the curve (AUPRC). The typical approach to assessing AUPRC significance is Monte Carlo simulation, and a parametric alternative is proposed. It consists in deriving the mean and variance of the null AUPRC and then using these parameters to fit a beta distribution approximating the true distribution. The second contribution is an investigation of network inference from time series. Several state-of-the-art strategies are experimentally assessed and novel heuristics are proposed. One is a fast approximation of first-order Granger causality scores, suited to GRN inference in the large-variable case. Another identifies co-regulated genes (i.e. genes regulated by the same genes). Both are experimentally validated using microarray and simulated time series. The third contribution concerns type 1 diabetes and is a study of beta-cell gene expression after exposure to cytokines, emulating the mechanisms leading to apoptosis. Eight datasets of beta-cell gene expression were used to identify differentially expressed genes before and after 24 h, which were functionally characterized using bioinformatics tools. The two most differentially expressed genes previously unknown in the type 1 diabetes literature (RIPK2 and ELF3) were found to modulate cytokine-induced apoptosis. A regulatory network was then inferred using a dynamic adaptation of a state-of-the-art network inference method. Three out of four predicted regulations (involving RIPK2 and ELF3) were experimentally confirmed, providing a proof of concept for the adopted approach.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Hajingabo, Leon 2015, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/209126, title = {Analyzing molecular network perturbations in human cancer: application to mutated genes and gene fusions involved in acute lymphoblastic leukemia}, author = {Leon Hajingabo}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/209126/3/d6d6225e-dfa8-46a3-985a-2ef19c71eff1.txt}, year = {2015}, date = {2015-01-01}, abstract = {The sequencing of the human genome and the emergence of new high-throughput genomic technologies have opened up new models of investigation for the systematic analysis of human diseases. We can now try to understand diseases such as cancer from a more global perspective, by identifying the genes responsible for cancers and by studying how their protein products function within a network of molecular interactions. In this context, we collected the genes specifically linked to acute lymphoblastic leukemia (ALL) and identified new interaction partners that connect key ALL-associated genes, such as NOTCH1, FBW7, KRAS and PTPN11, in an interaction network. We also attempted to predict the functional impact of genomic variations such as the gene fusions involved in ALL. Using as models three different chromosomal translocations frequently identified in B-cell ALL, namely ETV6-RUNX1 (TEL-AML1), BCR-ABL1 and E2A-PBX1 (TCF3-PBX1), we adapted an oncogene prediction approach to predict molecular perturbations in ALL. We showed that the transcriptomic circuits dependent on Myc and JunD are specifically deregulated following the TEL-AML1 and TCF3-PBX1 gene fusions, respectively. We also identified the NXF1-dependent mRNA transport mechanism as a direct target of the TCF3-PBX1 fusion protein. By combining interactome data with gene expression analyses, this approach provides new insight into the molecular understanding of acute lymphoblastic leukemia.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Pozzolo, Andrea Dal; Caelen, Olivier; Bontempi, Gianluca When is undersampling effective in unbalanced classification tasks? Book Chapter In: Springer, 2015, (Language of publication: en). @inbook{info:hdl:2013/221669, title = {When is undersampling effective in unbalanced classification tasks?}, author = {Andrea Dal Pozzolo and Olivier Caelen and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/221669}, year = {2015}, date = {2015-01-01}, publisher = {Springer}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {inbook} } |
Pozzolo, Andrea Dal Adaptive Machine Learning for Credit Card Fraud Detection PhD Thesis 2015, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/221654b, title = {Adaptive Machine Learning for Credit Card Fraud Detection}, author = {Andrea Dal Pozzolo}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/221654/5/contratDalPozzolo.pdf}, year = {2015}, date = {2015-01-01}, abstract = {Fraudulent credit card transactions cause billions of dollars in losses every year. The design of efficient fraud detection algorithms is key to reducing these losses, and more and more algorithms rely on advanced machine learning techniques to assist fraud investigators. The design of fraud detection algorithms is, however, particularly challenging because of the non-stationary distribution of the data, the highly unbalanced class distributions and the small number of transactions labeled by fraud investigators. At the same time, public data are scarce owing to confidentiality issues, leaving many questions about the best strategy unanswered. In this thesis we aim to provide some answers by focusing on crucial issues such as: i) why and how undersampling is useful in the presence of class imbalance (i.e. frauds are a small percentage of the transactions), ii) how to deal with unbalanced and evolving data streams (non-stationarity due to fraud evolution and changes in spending behavior), iii) how to assess performance in a way that is relevant for detection and iv) how to use the feedback provided by investigators on the fraud alerts generated. Finally, we design and assess a prototype of a Fraud Detection System that meets real-world working conditions and integrates investigators’ feedback to generate accurate alerts.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Lerman, Liran A machine learning approach for automatic and generic side-channel attacks PhD Thesis 2015, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/209070, title = {A machine learning approach for automatic and generic side-channel attacks}, author = {Liran Lerman}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/209070/2/be487c5b-7b94-414c-bf2e-96847aa98284.txt}, year = {2015}, date = {2015-01-01}, abstract = {The ubiquity of interconnected devices has led to massive interest in computer security, which is provided by cryptography among other fields. For decades, cryptographers estimated the security level of a cryptographic algorithm independently of its implementation in a device. However, since the publication of implementation attacks in 1996, physical attacks, which exploit the physical properties of cryptographic devices, have become an active research field. In this dissertation we focus on profiled attacks. Traditionally, profiled attacks apply parametric methods in which a priori information on the physical properties is assumed. Machine learning produces automatic and generic models that require no a priori information on the phenomenon under study. This dissertation sheds new light on the capabilities of machine learning methods. We first demonstrate that parametric profiled attacks outperform machine learning methods when there is neither estimation error nor assumption error. By contrast, machine learning attacks are advantageous in realistic scenarios where the amount of data available during the learning step is small. We then propose a new formal evaluation metric that makes it possible (1) to compare parametric and non-parametric attacks and (2) to interpret the results of each method. The new measure identifies the causes of a high or low attack success rate and consequently suggests ways to improve the evaluation of an implementation. Finally, we present experimental results on unprotected and protected devices. The first study shows that machine learning achieves a higher success rate than a parametric method when only a few data are available. The second experiment demonstrates that a protected device can be attacked with a machine learning approach. The machine learning strategy requires the same amount of data during the learning phase as when it attacks an unprotected product. We also show that parametric methods overestimate or underestimate the security level provided by the device, whereas the machine learning approach improves this estimate. In summary, our thesis is that machine learning attacks are advantageous over classical techniques when both the amount of a priori information on the target device and the amount of data available during the learning phase are small.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Hajingabo, Leon 2015, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/209126b, title = {Analyzing molecular network perturbations in human cancer: application to mutated genes and gene fusions involved in acute lymphoblastic leukemia}, author = {Leon Hajingabo}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/209126/3/d6d6225e-dfa8-46a3-985a-2ef19c71eff1.txt}, year = {2015}, date = {2015-01-01}, abstract = {Le séquençage du génome humain et l'émergence de nouvelles technologies de génomique à haut débit, ont initié de nouveaux modèles d'investigation pour l'analyse systématique des maladies humaines. Actuellement, nous pouvons tenter de comprendre les maladies tel que le cancer avec une perspective plus globale, en identifiant des gènes responsables des cancers et en étudiant la manière dont leurs produits protéiques fonctionnent dans un réseau d’interactions moléculaires. Dans ce contexte, nous avons collecté les gènes spécifiquement liés à la Leucémie Lymphoblastique Aiguë (LLA), et identifié de nouveaux partenaires d'interaction qui relient ces gènes clés associés à la LLA tels que NOTCH1, FBW7, KRAS et PTPN11, dans un réseau d’interactions. Nous avons également tenté de prédire l’impact fonctionnel des variations génomiques tel que des fusions de gènes impliquées dans LLA. En utilisant comme modèles trois différentes translocations chromosomiques ETV6-RUNX1 (TEL-AML1), BCR-ABL1, et E2A-PBX1 (TCF3-PBX1) fréquemment identifiées dans des cellules B LLA, nous avons adapté une approche de prédiction d’oncogènes afin de prédire des perturbations moléculaires dans la LLA. Nous avons montré que les circuits transcriptomiques dépendant de Myc et JunD sont spécifiquement dérégulés suite aux fusions de gènes TEL-AML1 et TCF3-PBX1, respectivement. Nous avons également identifié le mécanisme de transport des ARNm dépendant du facteur NXF1 comme une cible directe de la protéine de fusion TCF3-PBX1. 
Grâce à cette approche combinant les données interactomiques et les analyses d'expression génique, nous avons fourni un nouvel aperçu à la compréhension moléculaire de la Leucémie Lymphoblastique Aiguë.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } The sequencing of the human genome and the emergence of new high-throughput genomics technologies have introduced new models of investigation for the systematic analysis of human diseases. We can now attempt to understand diseases such as cancer from a more global perspective, by identifying the genes responsible for cancers and studying how their protein products function within a network of molecular interactions. In this context, we collected the genes specifically linked to acute lymphoblastic leukemia (ALL) and identified new interaction partners that connect these key ALL-associated genes, such as NOTCH1, FBW7, KRAS and PTPN11, in an interaction network. We also attempted to predict the functional impact of genomic variations such as gene fusions involved in ALL. Using as models three different chromosomal translocations, ETV6-RUNX1 (TEL-AML1), BCR-ABL1 and E2A-PBX1 (TCF3-PBX1), frequently identified in B-cell ALL, we adapted an oncogene prediction approach to predict molecular perturbations in ALL. We showed that the transcriptomic circuits dependent on Myc and JunD are specifically deregulated following the TEL-AML1 and TCF3-PBX1 gene fusions, respectively. We also identified the NXF1-dependent mRNA transport mechanism as a direct target of the TCF3-PBX1 fusion protein. 
By combining interactomic data with gene expression analyses, this approach provided new insight into the molecular understanding of acute lymphoblastic leukemia. |
Huculeci, Radu; Garcia-Pino, Abel; Buts, Lieven; Lenaerts, Tom; van Nuland, Nico A J Structural insights into the intertwined dimer of fyn SH2. Journal Article In: Protein science, 24 (12), pp. 1964-1978, 2015, (DOI: 10.1002/pro.2806). @article{info:hdl:2013/225650, title = {Structural insights into the intertwined dimer of fyn SH2.}, author = {Radu Huculeci and Abel Garcia-Pino and Lieven Buts and Tom Lenaerts and Nico A J van Nuland}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/225650/3/225650.pdf}, year = {2015}, date = {2015-01-01}, journal = {Protein science}, volume = {24}, number = {12}, pages = {1964-1978}, abstract = {Src homology 2 domains are interaction modules dedicated to the recognition of phosphotyrosine sites incorporated in numerous proteins found in intracellular signaling pathways. Here we provide for the first time structural insight into the dimerization of Fyn SH2 both in solution and in crystalline conditions, providing novel crystal structures of both the dimer and peptide-bound structures of Fyn SH2. Using nuclear magnetic resonance chemical shift analysis, we show how the peptide is able to eradicate the dimerization, leading to monomeric SH2 in its bound state. Furthermore, we show that Fyn SH2's dimer form differs from other SH2 dimers reported earlier. Interestingly, the Fyn dimer can be used to construct a completed dimer model of Fyn without any steric clashes. Together these results extend our understanding of SH2 dimerization, giving structural details, on one hand, and suggesting a possible physiological relevance of such behavior, on the other hand.}, note = {DOI: 10.1002/pro.2806}, keywords = {}, pubstate = {published}, tppubtype = {article} } Src homology 2 domains are interaction modules dedicated to the recognition of phosphotyrosine sites incorporated in numerous proteins found in intracellular signaling pathways. 
Here we provide for the first time structural insight into the dimerization of Fyn SH2 both in solution and in crystalline conditions, providing novel crystal structures of both the dimer and peptide-bound structures of Fyn SH2. Using nuclear magnetic resonance chemical shift analysis, we show how the peptide is able to eradicate the dimerization, leading to monomeric SH2 in its bound state. Furthermore, we show that Fyn SH2's dimer form differs from other SH2 dimers reported earlier. Interestingly, the Fyn dimer can be used to construct a completed dimer model of Fyn without any steric clashes. Together these results extend our understanding of SH2 dimerization, giving structural details, on one hand, and suggesting a possible physiological relevance of such behavior, on the other hand. |
Zisis, Ioannis; Guida, Sibilla Di; Han, The Anh T A H; Kirchsteiger, Georg; Lenaerts, Tom Generosity motivated by acceptance - evolutionary analysis of an anticipation game Journal Article In: Scientific Reports, 5 , 2015, (Language of publication: en). @article{info:hdl:2013/228071, title = {Generosity motivated by acceptance - evolutionary analysis of an anticipation game}, author = {Ioannis Zisis and Sibilla Di Guida and The Anh T A H Han and Georg Kirchsteiger and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/228071}, year = {2015}, date = {2015-01-01}, journal = {Scientific Reports}, volume = {5}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Gazzo, Andrea; Daneels, Dorien; Cilia, Elisa; Bonduelle, Maryse; Abramowicz, Marc; Dooren, Sonia Van; Smits, Guillaume; Lenaerts, Tom DIDA: A curated and annotated digenic diseases database. Journal Article In: Nucleic acids research, 2015, (DOI: 10.1093/nar/gkv1068). @article{info:hdl:2013/220549, title = {DIDA: A curated and annotated digenic diseases database.}, author = {Andrea Gazzo and Dorien Daneels and Elisa Cilia and Maryse Bonduelle and Marc Abramowicz and Sonia Van Dooren and Guillaume Smits and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/220549}, year = {2015}, date = {2015-01-01}, journal = {Nucleic acids research}, abstract = {DIDA (DIgenic diseases DAtabase) is a novel database that provides for the first time detailed information on genes and associated genetic variants involved in digenic diseases, the simplest form of oligogenic inheritance. The database is accessible via http://dida.ibsquare.be and currently includes 213 digenic combinations involved in 44 different digenic diseases. These combinations are composed of 364 distinct variants, which are distributed over 136 distinct genes. The web interface provides browsing and search functionalities, as well as documentation and help pages, general database statistics and references to the original publications from which the data have been collected. The possibility to submit novel digenic data to DIDA is also provided. Creating this new repository was essential as current databases do not allow one to retrieve detailed records regarding digenic combinations. Genes, variants, diseases and digenic combinations in DIDA are annotated with manually curated information and information mined from other online resources. 
Next to providing a unique resource for the development of new analysis methods, DIDA gives clinical and molecular geneticists a tool to find the most comprehensive information on the digenic nature of their diseases of interest.}, note = {DOI: 10.1093/nar/gkv1068}, keywords = {}, pubstate = {published}, tppubtype = {article} } DIDA (DIgenic diseases DAtabase) is a novel database that provides for the first time detailed information on genes and associated genetic variants involved in digenic diseases, the simplest form of oligogenic inheritance. The database is accessible via http://dida.ibsquare.be and currently includes 213 digenic combinations involved in 44 different digenic diseases. These combinations are composed of 364 distinct variants, which are distributed over 136 distinct genes. The web interface provides browsing and search functionalities, as well as documentation and help pages, general database statistics and references to the original publications from which the data have been collected. The possibility to submit novel digenic data to DIDA is also provided. Creating this new repository was essential as current databases do not allow one to retrieve detailed records regarding digenic combinations. Genes, variants, diseases and digenic combinations in DIDA are annotated with manually curated information and information mined from other online resources. Next to providing a unique resource for the development of new analysis methods, DIDA gives clinical and molecular geneticists a tool to find the most comprehensive information on the digenic nature of their diseases of interest. |
Han, The Anh T A H; Pereira, Luís Moniz; Santos, Francisco C; Lenaerts, Tom Emergence of cooperation via intention recognition, commitment and apology-A research summary Journal Article In: AI communications, 28 (4), pp. 709-715, 2015, (DOI: 10.3233/AIC-150672). @article{info:hdl:2013/220724, title = {Emergence of cooperation via intention recognition, commitment and apology-A research summary}, author = {The Anh T A H Han and Luís Moniz Pereira and Francisco C Santos and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/220724}, year = {2015}, date = {2015-01-01}, journal = {AI communications}, volume = {28}, number = {4}, pages = {709-715}, abstract = {The mechanisms of emergence and evolution of cooperation in populations of abstract individuals, with diverse behavioral strategies in co-presence, have been undergoing mathematical study via evolutionary game theory, inspired in part on evolutionary psychology. Their systematic study resorts to simulation techniques, thus enabling the study of aforesaid mechanisms under a variety of conditions, parameters and alternative virtual games. The theoretical and experimental results have continually been surprising, rewarding and promising. In our recent work, we initiated the introduction, in such groups of individuals, of cognitive abilities inspired on techniques and theories of Artificial Intelligence, namely those pertaining to Intention Recognition, Commitment and Apology (separately and jointly), encompassing errors in decision-making and communication noise. As a result, both the emergence and stability of cooperation become reinforced comparatively to the absence of such cognitive abilities. This holds separately for Intention Recognition, for Commitment and for Apology, and even more so when they are jointly engaged. 
Our presentation aims to sensitize the reader to these evolutionary game theory based issues, results and prospects, which are accruing in importance for the modeling of minds with machines, with impact on our understanding of the evolution of mutual tolerance and cooperation. Recognition of someone's intentions, which may include imagining the recognition others have of our own intentions, and may comprise not just some error tolerance, but also a penalty for unfulfilled commitment though allowing for apology, can lead to evolutionary stable win/win equilibriums within groups of individuals, and perhaps amongst groups. The recognition and the manifestation of intentions, plus the assumption of commitment-even whilst paying a cost for putting it in place-and the acceptance of apology, are all facilitators in that respect, each of them singly and, above all, in collusion.}, note = {DOI: 10.3233/AIC-150672}, keywords = {}, pubstate = {published}, tppubtype = {article} } The mechanisms of emergence and evolution of cooperation in populations of abstract individuals, with diverse behavioral strategies in co-presence, have been undergoing mathematical study via evolutionary game theory, inspired in part on evolutionary psychology. Their systematic study resorts to simulation techniques, thus enabling the study of aforesaid mechanisms under a variety of conditions, parameters and alternative virtual games. The theoretical and experimental results have continually been surprising, rewarding and promising. In our recent work, we initiated the introduction, in such groups of individuals, of cognitive abilities inspired on techniques and theories of Artificial Intelligence, namely those pertaining to Intention Recognition, Commitment and Apology (separately and jointly), encompassing errors in decision-making and communication noise. As a result, both the emergence and stability of cooperation become reinforced comparatively to the absence of such cognitive abilities. 
This holds separately for Intention Recognition, for Commitment and for Apology, and even more so when they are jointly engaged. Our presentation aims to sensitize the reader to these evolutionary game theory based issues, results and prospects, which are accruing in importance for the modeling of minds with machines, with impact on our understanding of the evolution of mutual tolerance and cooperation. Recognition of someone's intentions, which may include imagining the recognition others have of our own intentions, and may comprise not just some error tolerance, but also a penalty for unfulfilled commitment though allowing for apology, can lead to evolutionary stable win/win equilibriums within groups of individuals, and perhaps amongst groups. The recognition and the manifestation of intentions, plus the assumption of commitment-even whilst paying a cost for putting it in place-and the acceptance of apology, are all facilitators in that respect, each of them singly and, above all, in collusion. |
Martinez-Vaquero, Luis L A; Han, The Anh T A H; Pereira, Luís Moniz; Lenaerts, Tom Apology and forgiveness evolve to resolve failures in cooperative agreements Journal Article In: Scientific reports, 5 , 2015, (DOI: 10.1038/srep10639). @article{info:hdl:2013/205370, title = {Apology and forgiveness evolve to resolve failures in cooperative agreements}, author = {Luis L A Martinez-Vaquero and The Anh T A H Han and Luís Moniz Pereira and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/205370}, year = {2015}, date = {2015-01-01}, journal = {Scientific reports}, volume = {5}, abstract = {Making agreements on how to behave has been shown to be an evolutionarily viable strategy in one-shot social dilemmas. However, in many situations agreements aim to establish long-term mutually beneficial interactions. Our analytical and numerical results reveal for the first time under which conditions revenge, apology and forgiveness can evolve and deal with mistakes within ongoing agreements in the context of the Iterated Prisoners Dilemma. We show that, when the agreement fails, participants prefer to take revenge by defecting in the subsisting encounters. Incorporating costly apology and forgiveness reveals that, even when mistakes are frequent, there exists a sincerity threshold for which mistakes will not lead to the destruction of the agreement, inducing even higher levels of cooperation. In short, even when to err is human, revenge, apology and forgiveness are evolutionarily viable strategies which play an important role in inducing cooperation in repeated dilemmas.}, note = {DOI: 10.1038/srep10639}, keywords = {}, pubstate = {published}, tppubtype = {article} } Making agreements on how to behave has been shown to be an evolutionarily viable strategy in one-shot social dilemmas. However, in many situations agreements aim to establish long-term mutually beneficial interactions. 
Our analytical and numerical results reveal for the first time under which conditions revenge, apology and forgiveness can evolve and deal with mistakes within ongoing agreements in the context of the Iterated Prisoners Dilemma. We show that, when the agreement fails, participants prefer to take revenge by defecting in the subsisting encounters. Incorporating costly apology and forgiveness reveals that, even when mistakes are frequent, there exists a sincerity threshold for which mistakes will not lead to the destruction of the agreement, inducing even higher levels of cooperation. In short, even when to err is human, revenge, apology and forgiveness are evolutionarily viable strategies which play an important role in inducing cooperation in repeated dilemmas. |
Han, The Anh T A H; Santos, Francisco C; Lenaerts, Tom; Pereira, Luís Moniz Synergy between intention recognition and commitments in cooperation dilemmas Journal Article In: Scientific Reports, 5 , 2015, (DOI: 10.1038/srep09312). @article{info:hdl:2013/197743, title = {Synergy between intention recognition and commitments in cooperation dilemmas}, author = {The Anh T A H Han and Francisco C Santos and Tom Lenaerts and Luís Moniz Pereira}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/197743}, year = {2015}, date = {2015-01-01}, journal = {Scientific Reports}, volume = {5}, abstract = {Commitments have been shown to promote cooperation if, on the one hand, they can be sufficiently enforced, and on the other hand, the cost of arranging them is justified with respect to the benefits of cooperation. When either of these constraints is not met it leads to the prevalence of commitment free-riders, such as those who commit only when someone else pays to arrange the commitments. Here, we show how intention recognition may circumvent such weakness of costly commitments. We describe an evolutionary model, in the context of the one-shot Prisoner's Dilemma, showing that if players first predict the intentions of their co-player and propose a commitment only when they are not confident enough about their prediction, the chances of reaching mutual cooperation are largely enhanced. We find that an advantageous synergy between intention recognition and costly commitments depends strongly on the confidence and accuracy of intention recognition. In general, we observe an intermediate level of confidence threshold leading to the highest evolutionary advantage, showing that neither unconditional use of commitment nor intention recognition can perform optimally. 
Rather, our results show that arranging commitments is not always desirable, but that they may be also unavoidable depending on the strength of the dilemma.}, note = {DOI: 10.1038/srep09312}, keywords = {}, pubstate = {published}, tppubtype = {article} } Commitments have been shown to promote cooperation if, on the one hand, they can be sufficiently enforced, and on the other hand, the cost of arranging them is justified with respect to the benefits of cooperation. When either of these constraints is not met it leads to the prevalence of commitment free-riders, such as those who commit only when someone else pays to arrange the commitments. Here, we show how intention recognition may circumvent such weakness of costly commitments. We describe an evolutionary model, in the context of the one-shot Prisoner's Dilemma, showing that if players first predict the intentions of their co-player and propose a commitment only when they are not confident enough about their prediction, the chances of reaching mutual cooperation are largely enhanced. We find that an advantageous synergy between intention recognition and costly commitments depends strongly on the confidence and accuracy of intention recognition. In general, we observe an intermediate level of confidence threshold leading to the highest evolutionary advantage, showing that neither unconditional use of commitment nor intention recognition can perform optimally. Rather, our results show that arranging commitments is not always desirable, but that they may be also unavoidable depending on the strength of the dilemma. |
Han, The Anh T A H; Pereira, Luís Moniz; Lenaerts, Tom Avoiding or restricting defectors in public goods games? Journal Article In: Journal of the Royal Society interface, 12 (103), pp. 20141203, 2015, (DOI: 10.1098/rsif.2014.1203). @article{info:hdl:2013/205981, title = {Avoiding or restricting defectors in public goods games?}, author = {The Anh T A H Han and Luís Moniz Pereira and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/205981}, year = {2015}, date = {2015-01-01}, journal = {Journal of the Royal Society interface}, volume = {12}, number = {103}, pages = {20141203}, abstract = {When creating a public good, strategies or mechanisms are required to handle defectors. We first show mathematically and numerically that prior agreements with posterior compensations provide a strategic solution that leads to substantial levels of cooperation in the context of public goods games, results that are corroborated by available experimental data. Notwithstanding this success, one cannot, as with other approaches, fully exclude the presence of defectors, raising the question of how they can be dealt with to avoid the demise of the common good. We show that both avoiding creation of the common good, whenever full agreement is not reached, and limiting the benefit that disagreeing defectors can acquire, using costly restriction mechanisms, are relevant choices. Nonetheless, restriction mechanisms are found the more favourable, especially in larger group interactions. Given decreasing restriction costs, introducing restraining measures to cope with public goods free-riding issues is the ultimate advantageous solution for all participants, rather than avoiding its creation.}, note = {DOI: 10.1098/rsif.2014.1203}, keywords = {}, pubstate = {published}, tppubtype = {article} } When creating a public good, strategies or mechanisms are required to handle defectors. 
We first show mathematically and numerically that prior agreements with posterior compensations provide a strategic solution that leads to substantial levels of cooperation in the context of public goods games, results that are corroborated by available experimental data. Notwithstanding this success, one cannot, as with other approaches, fully exclude the presence of defectors, raising the question of how they can be dealt with to avoid the demise of the common good. We show that both avoiding creation of the common good, whenever full agreement is not reached, and limiting the benefit that disagreeing defectors can acquire, using costly restriction mechanisms, are relevant choices. Nonetheless, restriction mechanisms are found the more favourable, especially in larger group interactions. Given decreasing restriction costs, introducing restraining measures to cope with public goods free-riding issues is the ultimate advantageous solution for all participants, rather than avoiding its creation. |
Sekara, Mateusz; Kowalski, Michael; Byrski, Aleksander; Indurkhya, Bipin; Kisiel-Dorohinicki, Marek; Samson, Dana; Lenaerts, Tom Multi-pheromone ant Colony Optimization for Socio-cognitive Simulation Purposes Journal Article In: Procedia Computer Science, 51 , pp. 954-963, 2015, (DOI: 10.1016/j.procs.2015.05.234). @article{info:hdl:2013/206321, title = {Multi-pheromone ant Colony Optimization for Socio-cognitive Simulation Purposes}, author = {Mateusz Sekara and Michael Kowalski and Aleksander Byrski and Bipin Indurkhya and Marek Kisiel-Dorohinicki and Dana Samson and Tom Lenaerts}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/206321/1/Elsevier_189948.pdf}, year = {2015}, date = {2015-01-01}, journal = {Procedia Computer Science}, volume = {51}, pages = {954-963}, abstract = {We present an application of Ant Colony Optimisation (ACO) to simulate socio-cognitive features of a population. We incorporated perspective taking ability to generate three different proportions of ant colonies: Control Sample, High Altercentricity Sample, and Low Altercentricity Sample. We simulated their performances on the Travelling Salesman Problem and compared them with the classic ACO. Results show that all three 'cognitively enabled' ant colonies require less time than the classic ACO. Also, though the best solution is found by the classic ACO, the Control Sample finds almost as good a solution but much faster. This study is offered as an example to illustrate an easy way of defining inter-individual interactions based on stigmergic features of the environment.}, note = {DOI: 10.1016/j.procs.2015.05.234}, keywords = {}, pubstate = {published}, tppubtype = {article} } We present an application of Ant Colony Optimisation (ACO) to simulate socio-cognitive features of a population. We incorporated perspective taking ability to generate three different proportions of ant colonies: Control Sample, High Altercentricity Sample, and Low Altercentricity Sample. 
We simulated their performances on the Travelling Salesman Problem and compared them with the classic ACO. Results show that all three 'cognitively enabled' ant colonies require less time than the classic ACO. Also, though the best solution is found by the classic ACO, the Control Sample finds almost as good a solution but much faster. This study is offered as an example to illustrate an easy way of defining inter-individual interactions based on stigmergic features of the environment. |
Colaprico, Antonio; Silva, Tiago Da; Olsen, Catharina; Garofano, Luciano; Cava, Claudia; Garolini, Davide; Sabedot, Thais TS; Malta, Tathiane TM; Pagnotta, Stefano SM; Castiglioni, Isabella; Ceccarelli, M; Bontempi, Gianluca; Noushmehr, Houtan TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Journal Article In: Nucleic acids research, 2015, (DOI: 10.1093/nar/gkv1507). @article{info:hdl:2013/222877, title = {TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data.}, author = {Antonio Colaprico and Tiago Da Silva and Catharina Olsen and Luciano Garofano and Claudia Cava and Davide Garolini and Thais TS Sabedot and Tathiane TM Malta and Stefano SM Pagnotta and Isabella Castiglioni and M Ceccarelli and Gianluca Bontempi and Houtan Noushmehr}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/222877}, year = {2015}, date = {2015-01-01}, journal = {Nucleic acids research}, abstract = {The Cancer Genome Atlas (TCGA) research network has made public a large collection of clinical and molecular phenotypes of more than 10 000 tumor patients across 33 different tumor types. Using this cohort, TCGA has published over 20 marker papers detailing the genomic and epigenomic alterations associated with these tumor types. Although many important discoveries have been made by TCGA's research network, opportunities still exist to implement novel methods, thereby elucidating new biological pathways and diagnostic markers. However, mining the TCGA data presents several bioinformatics challenges, such as data retrieval and integration with clinical data and other molecular data types (e.g. RNA and DNA methylation). We developed an R/Bioconductor package called TCGAbiolinks to address these challenges and offer bioinformatics solutions by using a guided workflow to allow users to query, download and perform integrative analyses of TCGA data. 
We combined methods from computer science and statistics into the pipeline and incorporated methodologies developed in previous TCGA marker studies and in our own group. Using four different TCGA tumor types (Kidney, Brain, Breast and Colon) as examples, we provide case studies to illustrate examples of reproducibility, integrative analysis and utilization of different Bioconductor packages to advance and accelerate novel discoveries.}, note = {DOI: 10.1093/nar/gkv1507}, keywords = {}, pubstate = {published}, tppubtype = {article} } The Cancer Genome Atlas (TCGA) research network has made public a large collection of clinical and molecular phenotypes of more than 10 000 tumor patients across 33 different tumor types. Using this cohort, TCGA has published over 20 marker papers detailing the genomic and epigenomic alterations associated with these tumor types. Although many important discoveries have been made by TCGA's research network, opportunities still exist to implement novel methods, thereby elucidating new biological pathways and diagnostic markers. However, mining the TCGA data presents several bioinformatics challenges, such as data retrieval and integration with clinical data and other molecular data types (e.g. RNA and DNA methylation). We developed an R/Bioconductor package called TCGAbiolinks to address these challenges and offer bioinformatics solutions by using a guided workflow to allow users to query, download and perform integrative analyses of TCGA data. We combined methods from computer science and statistics into the pipeline and incorporated methodologies developed in previous TCGA marker studies and in our own group. Using four different TCGA tumor types (Kidney, Brain, Breast and Colon) as examples, we provide case studies to illustrate examples of reproducibility, integrative analysis and utilization of different Bioconductor packages to advance and accelerate novel discoveries. |
Han, The Anh T A H; Lenaerts, Tom The efficient interaction of costly punishment and commitment Journal Article In: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, 3 , pp. 1657-1658, 2015, (Language of publication: en). @article{info:hdl:2013/220750, title = {The efficient interaction of costly punishment and commitment}, author = {The Anh T A H Han and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/220750}, year = {2015}, date = {2015-01-01}, journal = {Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS}, volume = {3}, pages = {1657-1658}, abstract = {To ensure cooperation in the Prisoner's Dilemma, agents may require prior commitments from others, subject to compensations when defecting after agreeing to commit. Alternatively, agents may prefer to behave reactively, without arranging prior commitments, by simply punishing those who misbehave. These two mechanisms have been shown to promote the emergence of cooperation, yet are complementary in the way they aim to instigate cooperation. In this work, using Evolutionary Game Theory, we describe a computational model showing that there is a wide range of parameters where the combined strategy is better than either strategy by itself, leading to a significantly higher level of cooperation. Interestingly, the improvement is most significant when the cost of arranging commitments is sufficiently high and the penalty reaches a certain threshold, thereby overcoming the weaknesses of both strategies.}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Lerman, Liran; Bontempi, Gianluca; Markowitch, Olivier The bias–variance decomposition in profiled attacks Journal Article In: Journal of Cryptographic Engineering, 5 (4), pp. 255-267, 2015, (DOI: 10.1007/s13389-015-0106-1). @article{info:hdl:2013/220673, title = {The bias–variance decomposition in profiled attacks}, author = {Liran Lerman and Gianluca Bontempi and Olivier Markowitch}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/220673}, year = {2015}, date = {2015-01-01}, journal = {Journal of Cryptographic Engineering}, volume = {5}, number = {4}, pages = {255-267}, abstract = {The profiled attacks challenge the security of cryptographic devices in the worst case scenario. We elucidate the reasons underlying the success of different profiled attacks (that depend essentially on the context) based on the well-known bias–variance tradeoff developed in the machine learning field. Note that our approach can easily be extended to non-profiled attacks. We show (1) how to decompose (in three additive components) the error rate of an attack based on the bias–variance decomposition, and (2) how to reduce the error rate of a model based on the bias–variance diagnostic. Intuitively, we show that different models having the same error rate require different strategies (according to the bias–variance decomposition) to reduce their errors. More precisely, the success rate of a strategy depends on several criteria such as its complexity, the leakage information and the number of points per trace. As a result, a suboptimal strategy in a specific context can lead the adversary to overestimate the security level of the cryptographic device. Our results also bring warnings related to the estimation of the success rate of a profiled attack that can lead the evaluator to underestimate the security level. 
In brief, certify that a chip leaks (or not) sensitive information represents a hard if not impossible task.}, note = {DOI: 10.1007/s13389-015-0106-1}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Faust, Karoline; Mendez, Gipsi Lima; Lerat, Jean-Sébastien; Sathirapongsasuti, Jarupon Fah; Knight, Rob; Huttenhower, Curtis; Lenaerts, Tom; Raes, Jeroen JR Cross-biome comparison of microbial association networks Journal Article In: Frontiers in microbiology, 6 (OCT), 2015, (DOI: 10.3389/fmicb.2015.01200). @article{info:hdl:2013/226810, title = {Cross-biome comparison of microbial association networks}, author = {Karoline Faust and Gipsi Lima Mendez and Jean-Sébastien Lerat and Jarupon Fah Sathirapongsasuti and Rob Knight and Curtis Huttenhower and Tom Lenaerts and Jeroen JR Raes}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/226810/1/PMC4621437.pdf}, year = {2015}, date = {2015-01-01}, journal = {Frontiers in microbiology}, volume = {6}, number = {OCT}, abstract = {Clinical and environmental meta-omics studies are accumulating an ever-growing amount of microbial abundance data over a wide range of ecosystems. With a sufficiently large sample number, these microbial communities can be explored by constructing and analyzing co-occurrence networks, which detect taxon associations from abundance data and can give insights into community structure. Here, we investigate how co-occurrence networks differ across biomes and which other factors influence their properties. For this, we inferred microbial association networks from 20 different 16S rDNA sequencing data sets and observed that soil microbial networks harbor proportionally fewer positive associations and are less densely interconnected than host-associated networks. After excluding sample number, sequencing depth and beta-diversity as possible drivers, we found a negative correlation between community evenness and positive edge percentage. This correlation likely results from a skewed distribution of negative interactions, which take place preferentially between less prevalent taxa. 
Overall, our results suggest an under-appreciated role of evenness in shaping microbial association networks.}, note = {DOI: 10.3389/fmicb.2015.01200}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Lerman, Liran; Poussier, Romain; Bontempi, Gianluca; Markowitch, Olivier; Standaert, François-Xavier Template attacks vs Machine Learning Revisited Journal Article In: Lecture notes in computer science, 9064 , pp. 20-33, 2015, (Language of publication: en). @article{info:hdl:2013/223764, title = {Template attacks vs Machine Learning Revisited}, author = {Liran Lerman and Romain Poussier and Gianluca Bontempi and Olivier Markowitch and Fran{\c{c}}ois-Xavier Standaert}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/223764}, year = {2015}, date = {2015-01-01}, journal = {Lecture notes in computer science}, volume = {9064}, pages = {20-33}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Zisis, Ioannis; Guida, Sibilla Di; Han, The Anh T A H; Kirchsteiger, Georg; Lenaerts, Tom Generosity motivated by acceptance - evolutionary analysis of an anticipation game. Journal Article In: Scientific reports, 5 , pp. 18076, 2015, (DOI: 10.1038/srep18076). @article{info:hdl:2013/227987, title = {Generosity motivated by acceptance - evolutionary analysis of an anticipation game.}, author = {Ioannis Zisis and Sibilla Di Guida and The Anh T A H Han and Georg Kirchsteiger and Tom Lenaerts}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/227987/1/PMC4677303.pdf}, year = {2015}, date = {2015-01-01}, journal = {Scientific reports}, volume = {5}, pages = {18076}, abstract = {We here present both experimental and theoretical results for an Anticipation Game, a two-stage game wherein the standard Dictator Game is played after a matching phase wherein receivers use the past actions of dictators to decide whether to interact with them. The experimental results for three different treatments show that partner choice induces dictators to adjust their donations towards the expectations of the receivers, giving significantly more than expected in the standard Dictator Game. Adding noise to the dictators' reputation lowers the donations, underlining that their actions are determined by the knowledge provided to receivers. Secondly, we show that the recently proposed stochastic evolutionary model where payoff only weakly drives evolution and individuals can make mistakes requires some adaptations to explain the experimental results. We observe that the model fails in reproducing the heterogeneous strategy distributions. We show here that by explicitly modelling the dictators' probability of acceptance by receivers and introducing a parameter that reflects the dictators' capacity to anticipate future gains produces a closer fit to the aforementioned strategy distributions. 
This new parameter has the important advantage that it explains where the dictators' generosity comes from, revealing that anticipating future acceptance is the key to success.}, note = {DOI: 10.1038/srep18076}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Lerman, Liran; Bontempi, Gianluca; Markowitch, Olivier A machine learning approach against a masked AES: Reaching the limit of side-channel attacks with a learning model Journal Article In: Journal of Cryptographic Engineering, 5 (2), pp. 123-139, 2015, (DOI: 10.1007/s13389-014-0089-3). @article{info:hdl:2013/205371, title = {A machine learning approach against a masked AES: Reaching the limit of side-channel attacks with a learning model}, author = {Liran Lerman and Gianluca Bontempi and Olivier Markowitch}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/205371}, year = {2015}, date = {2015-01-01}, journal = {Journal of Cryptographic Engineering}, volume = {5}, number = {2}, pages = {123-139}, abstract = {Side-channel attacks challenge the security of cryptographic devices. A widespread countermeasure against these attacks is the masking approach. Masking combines sensitive variables with secret random values to reduce its leakage. In 2012, Nassar et al. (DATE, pp 1173–1178. IEEE, 2012) presented a new lightweight (low-cost) boolean masking countermeasure to protect the implementation of the Advanced Encryption Standard (AES) block-cipher. This masking scheme represents the target algorithm of the DPAContest V4 (http://www.dpacontest.org/home/, 2013). In this paper, we present the first machine learning attack against a specific masking countermeasure (more precisely the low-entropy boolean masking countermeasure of Nassar et al.), using the dataset of the DPAContest V4. We succeeded to extract each targeted byte of the key of the masked AES with 7.8 traces during the attacking phase with a strategy based solely on machine learning models. Finally, we compared our proposal with (1) a stochastic attack, (2) a strategy based on template attack and (3) a multivariate regression attack. 
We show that an attack based on a machine learning model reduces significantly the number of traces required during the attacking step compared to these profiling attacks when analyzing the same leakage information.}, note = {DOI: 10.1007/s13389-014-0089-3}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Olsen, Catharina; Fleming, Kathleen; Prendergast, Niall; Rubio, Renee; Emmert-Streib, Frank; Bontempi, Gianluca; Quackenbush, John; Haibe-Kains, Benjamin Using shRNA experiments to validate gene regulatory networks Journal Article In: Genomics Data, 4 , pp. 123-126, 2015, (DOI: 10.1016/j.gdata.2015.03.011). @article{info:hdl:2013/205439, title = {Using shRNA experiments to validate gene regulatory networks}, author = {Catharina Olsen and Kathleen Fleming and Niall Prendergast and Renee Rubio and Frank Emmert-Streib and Gianluca Bontempi and John Quackenbush and Benjamin Haibe-Kains}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/205439/1/Elsevier_189066.pdf}, year = {2015}, date = {2015-01-01}, journal = {Genomics Data}, volume = {4}, pages = {123-126}, abstract = {Quantitative validation of gene regulatory networks (GRNs) inferred from observational expression data is a difficult task usually involving time intensive and costly laboratory experiments. We were able to show that gene knock-down experiments can be used to quantitatively assess the quality of large-scale GRNs via a purely data-driven approach (Olsen et al. 2014). Our new validation framework also enables the statistical comparison of multiple network inference techniques, which was a long-standing challenge in the field. In this Data in Brief we detail the contents and quality controls for the gene expression data (available from NCBI Gene Expression Omnibus repository with accession number GSE53091) associated with our study published in Genomics (Olsen et al. 2014). We also provide R code to access the data and reproduce the analysis presented in this article.}, note = {DOI: 10.1016/j.gdata.2015.03.011}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Mendez, Gipsi Lima; Faust, Karoline; Henry, Lynn N; Decelle, Johan; Colin, Simon; Carcillo, Fabrizio; Chaffron, Samuel; Ignacio-Espinosa, Cesar JC; Roux, Simon; Vincent, Flora; Bittner, Lucie; Darzi, Youssef; Wang, Jun; Audic, Stéphane; Berline, Léo; Bontempi, Gianluca; Cabello, Ana AM; Coppola, Laurent; Cornejo-Castillo, Francisco FM; d'Ovidio, Francesco; Meester, Luc De; Ferrera, Isabel; Garet-Delmas, Marie-José; Guidi, Lionel; Lara, Elena; Pesant, Stéphane; Royo-Llonch, Marta; Salazar, Guillem; Sanchez, Paloma; Sebastian, Marta; Souffreau, Caroline; Dimier, Céline; Picheral, Marc; Searson, Sarah; Kandels-Lewis, Stefanie; coordinators, Tara Oceans; Gorsky, Gabriel; Not, Fabrice; Ogata, Hiroyuki; Speich, Sabrina; Stemmann, Lars; Weissenbach, Jean; Wincker, Patrick; Acinas, Silvia SG; Sunagawa, Shinichi; Bork, Peer; Sullivan, Matthew B; Karsenti, Eric; Bowler, Chris; de Vargas, Colomban; Raes, Jeroen JR Ocean plankton. Determinants of community structure in the global plankton interactome. Journal Article In: Science, 348 (6237), pp. 1262073, 2015, (DOI: 10.1126/science.1262073). @article{info:hdl:2013/200950, title = {Ocean plankton. 
Determinants of community structure in the global plankton interactome.}, author = {Gipsi Lima Mendez and Karoline Faust and Lynn N Henry and Johan Decelle and Simon Colin and Fabrizio Carcillo and Samuel Chaffron and Cesar JC Ignacio-Espinosa and Simon Roux and Flora Vincent and Lucie Bittner and Youssef Darzi and Jun Wang and Stéphane Audic and Léo Berline and Gianluca Bontempi and Ana AM Cabello and Laurent Coppola and Francisco FM Cornejo-Castillo and Francesco d'Ovidio and Luc De Meester and Isabel Ferrera and Marie-José Garet-Delmas and Lionel Guidi and Elena Lara and Stéphane Pesant and Marta Royo-Llonch and Guillem Salazar and Paloma Sanchez and Marta Sebastian and Caroline Souffreau and Céline Dimier and Marc Picheral and Sarah Searson and Stefanie Kandels-Lewis and Tara Oceans coordinators and Gabriel Gorsky and Fabrice Not and Hiroyuki Ogata and Sabrina Speich and Lars Stemmann and Jean Weissenbach and Patrick Wincker and Silvia SG Acinas and Shinichi Sunagawa and Peer Bork and Matthew B Sullivan and Eric Karsenti and Chris Bowler and Colomban de Vargas and Jeroen JR Raes}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/200950}, year = {2015}, date = {2015-01-01}, journal = {Science}, volume = {348}, number = {6237}, pages = {1262073}, abstract = {Species interaction networks are shaped by abiotic and biotic factors. Here, as part of the Tara Oceans project, we studied the photic zone interactome using environmental factors and organismal abundance profiles and found that environmental factors are incomplete predictors of community structure. We found associations across plankton functional types and phylogenetic groups to be nonrandomly distributed on the network and driven by both local and global patterns. We identified interactions among grazers, primary producers, viruses, and (mainly parasitic) symbionts and validated network-generated hypotheses using microscopy to confirm symbiotic relationships. 
We have thus provided a resource to support further research on ocean food webs and integrating biological components into ocean models.}, note = {DOI: 10.1126/science.1262073}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Colaprico, Antonio; Cava, Claudia; Bertoli, Gloria; Bontempi, Gianluca; Castiglioni, Isabella Integrative Analysis with Monte Carlo Cross-Validation Reveals miRNAs Regulating Pathways Cross-Talk in Aggressive Breast Cancer Journal Article In: BioMed Research International, 2015 , 2015, (DOI: 10.1155/2015/831314). @article{info:hdl:2013/217546, title = {Integrative Analysis with Monte Carlo Cross-Validation Reveals miRNAs Regulating Pathways Cross-Talk in Aggressive Breast Cancer}, author = {Antonio Colaprico and Claudia Cava and Gloria Bertoli and Gianluca Bontempi and Isabella Castiglioni}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/217546}, year = {2015}, date = {2015-01-01}, journal = {BioMed Research International}, volume = {2015}, abstract = {In this work an integrated approach was used to identify functional miRNAs regulating gene pathway cross-talk in breast cancer (BC). We first integrated gene expression profiles and biological pathway information to explore the underlying associations between genes differently expressed among normal and BC samples and pathways enriched from these genes. For each pair of pathways, a score was derived from the distribution of gene expression levels by quantifying their pathway cross-talk. Random forest classification allowed the identification of pairs of pathways with high cross-talk. We assessed miRNAs regulating the identified gene pathways by a mutual information analysis. A Fisher test was applied to demonstrate their significance in the regulated pathways. Our results suggest interesting networks of pathways that could be key regulatory of target genes in BC, including stem cell pluripotency, coagulation, and hypoxia pathways and miRNAs that control these networks could be potential biomarkers for diagnostic, prognostic, and therapeutic development in BC. 
This work shows that standard methods of predicting normal and tumor classes such as differentially expressed miRNAs or transcription factors could lose intrinsic features; instead our approach revealed the responsible molecules of the disease.}, note = {DOI: 10.1155/2015/831314}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Bontempi, Gianluca; Flauder, Maxime From dependency to causality: a machine learning approach Journal Article In: Journal of machine learning research, 2015, (Language of publication: en). @article{info:hdl:2013/222900, title = {From dependency to causality: a machine learning approach}, author = {Gianluca Bontempi and Maxime Flauder}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/222900}, year = {2015}, date = {2015-01-01}, journal = {Journal of machine learning research}, abstract = {The relationship between statistical dependency and causality lies at the heart of all statistical approaches to causal inference. Recent results in the ChaLearn cause-effect pair challenge have shown that causal directionality can be inferred with good accuracy also in Markov indistinguishable configurations thanks to data driven approaches. This paper proposes a supervised machine learning approach to infer the existence of a directed causal link between two variables in multivariate settings with n > 2 variables. The approach relies on the asymmetry of some conditional (in)dependence relations between the members of the Markov blankets of two variables causally connected. Our results show that supervised learning methods may be successfully used to extract causal information on the basis of asymmetric statistical descriptors also for n > 2 variate distributions.}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Lerman, Liran; Poussier, Romain; Bontempi, Gianluca; Markowitch, Olivier; Standaert, François-Xavier Template attacks vs. Machine learning revisited (and the curse of dimensionality in side-channel analysis) Journal Article In: Lecture notes in computer science, 9064 , pp. 20-33, 2015, (DOI: 10.1007/978-3-319-21476-4_2). @article{info:hdl:2013/226800, title = {Template attacks vs. Machine learning revisited (and the curse of dimensionality in side-channel analysis)}, author = {Liran Lerman and Romain Poussier and Gianluca Bontempi and Olivier Markowitch and Fran{\c{c}}ois-Xavier Standaert}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/226800}, year = {2015}, date = {2015-01-01}, journal = {Lecture notes in computer science}, volume = {9064}, pages = {20-33}, abstract = {Template attacks and machine learning are two popular approaches to profiled side-channel analysis. In this paper, we aim to contribute to the understanding of their respective strengths and weaknesses, with a particular focus on their curse of dimensionality. For this purpose, we take advantage of a well-controlled simulated experimental setting in order to put forward two important intuitions. First and from a theoretical point of view, the data complexity of template attacks is not sensitive to the dimension increase in side-channel traces given that their profiling is perfect. Second and from a practical point of view, concrete attacks are always affected by (estimation and assumption) errors during profiling. As these errors increase, machine learning gains interest compared to template attacks, especially when based on random forests.}, note = {DOI: 10.1007/978-3-319-21476-4_2}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Pozzolo, Andrea Dal; Caelen, Olivier; Bontempi, Gianluca When is undersampling effective in unbalanced classification tasks? Journal Article In: Lecture notes in computer science, 9284 , pp. 200-215, 2015, (DOI: 10.1007/978-3-319-23528-8_13). @article{info:hdl:2013/237086, title = {When is undersampling effective in unbalanced classification tasks?}, author = {Andrea Dal Pozzolo and Olivier Caelen and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/237086}, year = {2015}, date = {2015-01-01}, journal = {Lecture notes in computer science}, volume = {9284}, pages = {200-215}, abstract = {A well-known rule of thumb in unbalanced classification recommends the rebalancing (typically by resampling) of the classes before proceeding with the learning of the classifier. Though this seems to work for the majority of cases, no detailed analysis exists about the impact of undersampling on the accuracy of the final classifier. This paper aims to fill this gap by proposing an integrated analysis of the two elements which have the largest impact on the effectiveness of an undersampling strategy: the increase of the variance due to the reduction of the number of samples and the warping of the posterior distribution due to the change of priori probabilities. In particular we will propose a theoretical analysis specifying under which conditions undersampling is recommended and expected to be effective. It emerges that the impact of undersampling depends on the number of samples, the variance of the classifier, the degree of imbalance and more specifically on the value of the posterior probability. This makes difficult to predict the average effectiveness of an undersampling strategy since its benefits depend on the distribution of the testing points. 
Results from several synthetic and real-world unbalanced datasets support and validate our findings.}, note = {DOI: 10.1007/978-3-319-23528-8_13}, keywords = {}, pubstate = {published}, tppubtype = {article} } A well-known rule of thumb in unbalanced classification recommends the rebalancing (typically by resampling) of the classes before proceeding with the learning of the classifier. Though this seems to work for the majority of cases, no detailed analysis exists about the impact of undersampling on the accuracy of the final classifier. This paper aims to fill this gap by proposing an integrated analysis of the two elements which have the largest impact on the effectiveness of an undersampling strategy: the increase of the variance due to the reduction of the number of samples and the warping of the posterior distribution due to the change of priori probabilities. In particular we will propose a theoretical analysis specifying under which conditions undersampling is recommended and expected to be effective. It emerges that the impact of undersampling depends on the number of samples, the variance of the classifier, the degree of imbalance and more specifically on the value of the posterior probability. This makes difficult to predict the average effectiveness of an undersampling strategy since its benefits depend on the distribution of the testing points. Results from several synthetic and real-world unbalanced datasets support and validate our findings. |
Cuellar, M P M P; Ros, Maria; Bautista, Maria Martin J; Le Borgne, Yann-Aël; Bontempi, Gianluca An approach for the evaluation of human activities in physical therapy scenarios Journal Article In: Lecture notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, 141 , pp. 401-414, 2015, (DOI: 10.1007/978-3-319-16292-8_29). @article{info:hdl:2013/198531, title = {An approach for the evaluation of human activities in physical therapy scenarios}, author = {M P M P Cuellar and Maria Ros and Maria Martin J Bautista and Yann-A{\"e}l Le Borgne and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/198531}, year = {2015}, date = {2015-01-01}, journal = {Lecture notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering}, volume = {141}, pages = {401-414}, abstract = {Human activity recognition has been widely studied since the last decade in ambient intelligence scenarios. Remarkable progresses have been made in this domain, especially in research lines such as ambient assisted living, gesture recognition, behaviour detection and classification, etc. Most of the works in the literature focus on activity classification or recognition, prediction of future events, or anomaly detection and prevention. However, it is hard to find approaches that do not only recognize an activity, but also provide an evaluation of its performance according to an optimality criterion. This problem is of special interest in applications such as sports performance evaluation, physical therapy, etc. In this work, we address the problem of the evaluation of such human activities in monitored environments using depth sensors. 
In particular, we propose a system able to provide an automatic evaluation of the correctness in the performance of activities involving motion, and more specifically, diagnosis exercises in physical therapy.}, note = {DOI: 10.1007/978-3-319-16292-8_29}, keywords = {}, pubstate = {published}, tppubtype = {article} } Human activity recognition has been widely studied since the last decade in ambient intelligence scenarios. Remarkable progresses have been made in this domain, especially in research lines such as ambient assisted living, gesture recognition, behaviour detection and classification, etc. Most of the works in the literature focus on activity classification or recognition, prediction of future events, or anomaly detection and prevention. However, it is hard to find approaches that do not only recognize an activity, but also provide an evaluation of its performance according to an optimality criterion. This problem is of special interest in applications such as sports performance evaluation, physical therapy, etc. In this work, we address the problem of the evaluation of such human activities in monitored environments using depth sensors. In particular, we propose a system able to provide an automatic evaluation of the correctness in the performance of activities involving motion, and more specifically, diagnosis exercises in physical therapy. |
Pozzolo, Andrea Dal; Boracchi, Giacomo; Caelen, Olivier; Alippi, Cesare; Bontempi, Gianluca Credit Card Fraud Detection and Concept-Drift Adaptation with Delayed Supervised Information Inproceedings In: Neural Networks (IJCNN), 2015 International Joint Conference on, 2015, (Language of publication: en). @inproceedings{info:hdl:2013/221668, title = {Credit Card Fraud Detection and Concept-Drift Adaptation with Delayed Supervised Information}, author = {Andrea Dal Pozzolo and Giacomo Boracchi and Olivier Caelen and Cesare Alippi and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/221668}, year = {2015}, date = {2015-01-01}, booktitle = {Neural Networks (IJCNN), 2015 International Joint Conference on}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Pozzolo, Andrea Dal; Caelen, Olivier; Johnson, Reid; Bontempi, Gianluca Calibrating Probability with Undersampling for Unbalanced Classification Inproceedings In: 2015 IEEE Symposium on Computational Intelligence and Data Mining, 2015, (Language of publication: en). @inproceedings{info:hdl:2013/221670, title = {Calibrating Probability with Undersampling for Unbalanced Classification}, author = {Andrea Dal Pozzolo and Olivier Caelen and Reid Johnson and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/221670}, year = {2015}, date = {2015-01-01}, booktitle = {2015 IEEE Symposium on Computational Intelligence and Data Mining}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Zisis, Ioannis; Guida, Sibilla Di; Han, The Anh T A H; Kirchsteiger, Georg; Lenaerts, Tom Receivers’ acceptance drives the generosity of dictators in a dictator game with partner selection Miscellaneous 2015, (Conference: Student Conference on Complexity Science(9-11 September 2015: Granada, Spain)). @misc{info:hdl:2013/243680, title = {Receivers’ acceptance drives the generosity of dictators in a dictator game with partner selection}, author = {Ioannis Zisis and Sibilla Di Guida and The Anh T A H Han and Georg Kirchsteiger and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243680}, year = {2015}, date = {2015-01-01}, note = {Conference: Student Conference on Complexity Science(9-11 September 2015: Granada, Spain)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Zisis, Ioannis; Guida, Sibilla Di; Han, The Anh T A H; Kirchsteiger, Georg; Lenaerts, Tom An experimental and evolutionary analysis of group formation and gift giving Miscellaneous 2015, (Conference: Conference on Complex Systems(28 September - 2 October 2015: Tempe, USA)). @misc{info:hdl:2013/243679, title = {An experimental and evolutionary analysis of group formation and gift giving}, author = {Ioannis Zisis and Sibilla Di Guida and The Anh T A H Han and Georg Kirchsteiger and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243679}, year = {2015}, date = {2015-01-01}, note = {Conference: Conference on Complex Systems(28 September - 2 October 2015: Tempe, USA)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Zisis, Ioannis; Guida, Sibilla Di; Han, The Anh T A H; Kirchsteiger, Georg; Lenaerts, Tom Receivers’ acceptance drives the generosity of dictators in a dictator game with partner selection Miscellaneous 2015, (Conference: Physics Meets the Social Science, Granada Seminar on Computational and Statistical Physics(15-19 June 2015: La Herradura, Spain)). @misc{info:hdl:2013/243682, title = {Receivers’ acceptance drives the generosity of dictators in a dictator game with partner selection}, author = {Ioannis Zisis and Sibilla Di Guida and The Anh T A H Han and Georg Kirchsteiger and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243682}, year = {2015}, date = {2015-01-01}, note = {Conference: Physics Meets the Social Science, Granada Seminar on Computational and Statistical Physics(15-19 June 2015: La Herradura, Spain)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Martinez-Vaquero, Luis L A; Han, The Anh T A H; Pereira, Luís Moniz; Lenaerts, Tom Emergence of revenge and forgiveness in commitments Miscellaneous 2015, (Conference: Physics Meets the Social Science, Granada Seminar on Computational and Statistical Physics(15-19 June 2015: La Herradura, Spain)). @misc{info:hdl:2013/243683, title = {Emergence of revenge and forgiveness in commitments}, author = {Luis L A Martinez-Vaquero and The Anh T A H Han and Luís Moniz Pereira and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243683}, year = {2015}, date = {2015-01-01}, note = {Conference: Physics Meets the Social Science, Granada Seminar on Computational and Statistical Physics(15-19 June 2015: La Herradura, Spain)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Gazzo, Andrea; Daneels, Dorien; Smits, Guillaume; Dooren, Sonia Van; Cilia, Elisa; Lenaerts, Tom DIDA - A first digenic diseases database Miscellaneous 2015, (Conference: 15th Annual Meeting of the Belgian Society of Human Genetics(6 May 2015: Charleroi, Belgium)). @misc{info:hdl:2013/243685, title = {DIDA - A first digenic diseases database}, author = {Andrea Gazzo and Dorien Daneels and Guillaume Smits and Sonia Van Dooren and Elisa Cilia and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243685}, year = {2015}, date = {2015-01-01}, note = {Conference: 15th Annual Meeting of the Belgian Society of Human Genetics(6 May 2015: Charleroi, Belgium)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Conard, Ashley M; Cilia, Elisa; Lenaerts, Tom Determining the winning SH3 coalition: how cooperative game theory reveals the importance of domain residues in peptide binding Miscellaneous 2015, (Conference: The 23rd Annual International Conference on Intelligent Systems for Molecular Biology and the 14th European Conference on Computational Biology(10-14 July 2015: Dublin, Ireland)). @misc{info:hdl:2013/243689, title = {Determining the winning SH3 coalition: how cooperative game theory reveals the importance of domain residues in peptide binding}, author = {Ashley M Conard and Elisa Cilia and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243689}, year = {2015}, date = {2015-01-01}, note = {Conference: The 23rd Annual International Conference on Intelligent Systems for Molecular Biology and the 14th European Conference on Computational Biology(10-14 July 2015: Dublin, Ireland)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Gazzo, Andrea; Daneels, Dorien; Bonduelle, Maryse; Dooren, Sonia Van; Smits, Guillaume; Lenaerts, Tom Predicting oligogenic effects using digenic disease data Miscellaneous 2015, (Conference: 10th BeNeLux Bioinformatics Conference(7-8 December 2015: Antwerp, Belgium)). @misc{info:hdl:2013/243688, title = {Predicting oligogenic effects using digenic disease data}, author = {Andrea Gazzo and Dorien Daneels and Maryse Bonduelle and Sonia Van Dooren and Guillaume Smits and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243688}, year = {2015}, date = {2015-01-01}, note = {Conference: 10th BeNeLux Bioinformatics Conference(7-8 December 2015: Antwerp, Belgium)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Daneels, Dorien; Gazzo, Andrea; Bonduelle, Maryse; Smits, Guillaume; Cilia, Elisa; Lenaerts, Tom; Dooren, Sonia Van DIDA: a first database on digenic diseases Miscellaneous 2015, (Conference: VUB DS-LSM PhD day(31 March 2015: Jette, Belgium)). @misc{info:hdl:2013/243686, title = {DIDA: a first database on digenic diseases}, author = {Dorien Daneels and Andrea Gazzo and Maryse Bonduelle and Guillaume Smits and Elisa Cilia and Tom Lenaerts and Sonia Van Dooren}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243686}, year = {2015}, date = {2015-01-01}, note = {Conference: VUB DS-LSM PhD day(31 March 2015: Jette, Belgium)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Gazzo, Andrea; Daneels, Dorien; Smits, Guillaume; Dooren, Sonia Van; Cilia, Elisa; Lenaerts, Tom DIDA - A first digenic diseases database Miscellaneous 2015, (Conference: Cold Spring Harbor Laboratory conference on Genome Informatics(28-31 October 2015: Cold Spring Harbor, USA)). @misc{info:hdl:2013/243687, title = {DIDA - A first digenic diseases database}, author = {Andrea Gazzo and Dorien Daneels and Guillaume Smits and Sonia Van Dooren and Elisa Cilia and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243687}, year = {2015}, date = {2015-01-01}, note = {Conference: Cold Spring Harbor Laboratory conference on Genome Informatics(28-31 October 2015: Cold Spring Harbor, USA)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Huculeci, Radu; Cilia, Elisa; Buts, Lieven; Houben, Klaartje; van Nuland, Nico A J; Lenaerts, Tom SH2 sidechain dynamics may be key to unlock kinase activity Miscellaneous 2015, (Conference: Keystone symposium - The Biological Code of Cell Signaling: A Tribute to Tony Pawson(11-15 January 2015: Steamboat Springs, Colorado, USA)). @misc{info:hdl:2013/243692, title = {SH2 sidechain dynamics may be key to unlock kinase activity}, author = {Radu Huculeci and Elisa Cilia and Lieven Buts and Klaartje Houben and Nico A J van Nuland and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243692}, year = {2015}, date = {2015-01-01}, note = {Conference: Keystone symposium - The Biological Code of Cell Signaling: A Tribute to Tony Pawson(11-15 January 2015: Steamboat Springs, Colorado, USA)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Ruano, Ana Zafra; Cilia, Elisa; Couceiro, José JR; Sanz, Javier Ruiz; Schymkowitz, Joost J; Rousseau, Frédéric; Luque, Irene; Lenaerts, Tom On the evolutionary conservation of the dynamic changes predicted for a collection of SH3 domain structures Miscellaneous 2015, (Conference: Keystone symposium - The Biological Code of Cell Signaling: A Tribute to Tony Pawson(11-16 January 2015: Steamboat Springs, Colorado, USA)). @misc{info:hdl:2013/243694, title = {On the evolutionary conservation of the dynamic changes predicted for a collection of SH3 domain structures}, author = {Ana Zafra Ruano and Elisa Cilia and José JR Couceiro and Javier Ruiz Sanz and Joost J Schymkowitz and Frédéric Rousseau and Irene Luque and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243694}, year = {2015}, date = {2015-01-01}, note = {Conference: Keystone symposium - The Biological Code of Cell Signaling: A Tribute to Tony Pawson(11-16 January 2015: Steamboat Springs, Colorado, USA)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Lenaerts, Tom Bits and bytes of protein domain communication Miscellaneous 2015, (Conference: International Conference on Emergence in Chemical Systems(21-27 June 2015: Anchorage, USA)). @misc{info:hdl:2013/243701, title = {Bits and bytes of protein domain communication}, author = {Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243701}, year = {2015}, date = {2015-01-01}, note = {Conference: International Conference on Emergence in Chemical Systems(21-27 June 2015: Anchorage, USA)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Lenaerts, Tom To apologize or not to apologize; the evolutionary viability of forgiveness in interrupted commitments in the iterated prisoners dilemma Miscellaneous 2015, (Conference: French Symposium on Games(26-30 May 2015: Paris, France)). @misc{info:hdl:2013/243706, title = {To apologize or not to apologize; the evolutionary viability of forgiveness in interrupted commitments in the iterated prisoners dilemma}, author = {Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243706}, year = {2015}, date = {2015-01-01}, note = {Conference: French Symposium on Games(26-30 May 2015: Paris, France)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Lenaerts, Tom Commitment to cooperate - the evolutionary viability of individual and group commitments in social dilemmas Miscellaneous 2015, (Conference: Granada Seminar on Computational and Statistical Physics(15-19 June 2015: La Herradura, Spain)). @misc{info:hdl:2013/243704, title = {Commitment to cooperate - the evolutionary viability of individual and group commitments in social dilemmas}, author = {Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243704}, year = {2015}, date = {2015-01-01}, note = {Conference: Granada Seminar on Computational and Statistical Physics(15-19 June 2015: La Herradura, Spain)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Hoffman, Anthony; Lenaerts, Tom Live migration of virtual machines in the cloud computing. Informatique Masters Thesis 2015, (Language of publication: fr). @mastersthesis{info:hdl:2013/243958, title = {Live migration of virtual machines in the cloud computing. Informatique}, author = {Anthony Hoffman and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243958}, year = {2015}, date = {2015-01-01}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {mastersthesis} } |
Pozzolo, Andrea Dal Adaptive Machine Learning for Credit Card Fraud Detection PhD Thesis 2015, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/221654, title = {Adaptive Machine Learning for Credit Card Fraud Detection}, author = {Andrea Dal Pozzolo}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/221654/5/contratDalPozzolo.pdf}, year = {2015}, date = {2015-01-01}, abstract = {Billions of dollars of loss are caused every year by fraudulent credit card transactions. The design of efficient fraud detection algorithms is key for reducing these losses, and more and more algorithms rely on advanced machine learning techniques to assist fraud investigators. The design of fraud detection algorithms is however particularly challenging due to the non-stationary distribution of the data, the highly unbalanced classes distributions and the availability of few transactions labeled by fraud investigators. At the same time public data are scarcely available for confidentiality issues, leaving unanswered many questions about what is the best strategy. In this thesis we aim to provide some answers by focusing on crucial issues such as: i) why and how undersampling is useful in the presence of class imbalance (i.e. frauds are a small percentage of the transactions), ii) how to deal with unbalanced and evolving data streams (non-stationarity due to fraud evolution and change of spending behavior), iii) how to assess performances in a way which is relevant for detection and iv) how to use feedbacks provided by investigators on the fraud alerts generated. Finally, we design and assess a prototype of a Fraud Detection System able to meet real-world working conditions and that is able to integrate investigators’ feedback to generate accurate alerts.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } Billions of dollars of loss are caused every year by fraudulent credit card transactions. 
The design of efficient fraud detection algorithms is key for reducing these losses, and more and more algorithms rely on advanced machine learning techniques to assist fraud investigators. The design of fraud detection algorithms is however particularly challenging due to the non-stationary distribution of the data, the highly unbalanced classes distributions and the availability of few transactions labeled by fraud investigators. At the same time public data are scarcely available for confidentiality issues, leaving unanswered many questions about what is the best strategy. In this thesis we aim to provide some answers by focusing on crucial issues such as: i) why and how undersampling is useful in the presence of class imbalance (i.e. frauds are a small percentage of the transactions), ii) how to deal with unbalanced and evolving data streams (non-stationarity due to fraud evolution and change of spending behavior), iii) how to assess performances in a way which is relevant for detection and iv) how to use feedbacks provided by investigators on the fraud alerts generated. Finally, we design and assess a prototype of a Fraud Detection System able to meet real-world working conditions and that is able to integrate investigators’ feedback to generate accurate alerts. |
2014 |
Lenaerts, Tom; Giacobini, Mario; Bersini, Hugues; Bourgine, Paul; Dorigo, Marco; Doursat, René Special issue for the 20th anniversary of the European conference on artificial life (ECAL 2011) Book 2014, (Language of publication: fr). @book{info:hdl:2013/226112, title = {Special issue for the 20th anniversary of the European conference on artificial life (ECAL 2011)}, author = {Tom Lenaerts and Mario Giacobini and Hugues Bersini and Paul Bourgine and Marco Dorigo and René Doursat}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/226112}, year = {2014}, date = {2014-01-01}, series = {Artificial Life}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {book} } |
Taieb, Souhaib Ben Machine learning strategies for multi-step-ahead time series forecasting PhD Thesis 2014, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/209234, title = {Machine learning strategies for multi-step-ahead time series forecasting}, author = {Souhaib Ben Taieb}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/209234}, year = {2014}, date = {2014-01-01}, abstract = {How much electricity is going to be consumed for the next 24 hours? What will be the temperature for the next three days? What will be the number of sales of a certain product for the next few months? Answering these questions often requires forecasting several future observations from a given sequence of historical observations, called a time series. Historically, time series forecasting has been mainly studied in econometrics and statistics. In the last two decades, machine learning, a field that is concerned with the development of algorithms that can automatically learn from data, has become one of the most active areas of predictive modeling research. This success is largely due to the superior performance of machine learning prediction algorithms in many different applications as diverse as natural language processing, speech recognition and spam detection. However, there has been very little research at the intersection of time series forecasting and machine learning. The goal of this dissertation is to narrow this gap by addressing the problem of multi-step-ahead time series forecasting from the perspective of machine learning. To that end, we propose a series of forecasting strategies based on machine learning algorithms. Multi-step-ahead forecasts can be produced recursively by iterating a one-step-ahead model, or directly using a specific model for each horizon. 
As a first contribution, we conduct an in-depth study to compare recursive and direct forecasts generated with different learning algorithms for different data generating processes. More precisely, we decompose the multi-step mean squared forecast errors into the bias and variance components, and analyze their behavior over the forecast horizon for different time series lengths. The results and observations made in this study then guide us for the development of new forecasting strategies. In particular, we find that choosing between recursive and direct forecasts is not an easy task since it involves a trade-off between bias and estimation variance that depends on many interacting factors, including the learning model, the underlying data generating process, the time series length and the forecast horizon. As a second contribution, we develop multi-stage forecasting strategies that do not treat the recursive and direct strategies as competitors, but seek to combine their best properties. More precisely, the multi-stage strategies generate recursive linear forecasts, and then adjust these forecasts by modeling the multi-step forecast residuals with direct nonlinear models at each horizon, called rectification models. We propose a first multi-stage strategy, that we called the rectify strategy, which estimates the rectification models using the nearest neighbors model. However, because recursive linear forecasts often need small adjustments with real-world time series, we also consider a second multi-stage strategy, called the boost strategy, that estimates the rectification models using gradient boosting algorithms that use so-called weak learners. Generating multi-step forecasts using a different model at each horizon provides a large modeling flexibility. However, selecting these models independently can lead to irregularities in the forecasts that can contribute to increase the forecast variance. 
The problem is exacerbated with nonlinear machine learning models estimated from short time series. To address this issue, and as a third contribution, we introduce and analyze multi-horizon forecasting strategies that exploit the information contained in other horizons when learning the model for each horizon. In particular, to select the lag order and the hyperparameters of each model, multi-horizon strategies minimize forecast errors over multiple horizons rather than just the horizon of interest. We compare all the proposed strategies with both the recursive and direct strategies. We first apply a bias and variance study, then we evaluate the different strategies using real-world time series from two past forecasting competitions. For the rectify strategy, in addition to avoiding the choice between recursive and direct forecasts, the results demonstrate that it has better, or at least has close performance to, the best of the recursive and direct forecasts in different settings. For the multi-horizon strategies, the results emphasize the decrease in variance compared to single-horizon strategies, especially with linear or weakly nonlinear data generating processes. Overall, we found that the accuracy of multi-step-ahead forecasts based on machine learning algorithms can be significantly improved if an appropriate forecasting strategy is used to select the model parameters and to generate the forecasts. Lastly, as a fourth contribution, we have participated in the Load Forecasting track of the Global Energy Forecasting Competition 2012. The competition involved a hierarchical load forecasting problem where we were required to backcast and forecast hourly loads for a US utility with twenty geographical zones. Our team, TinTin, ranked fifth out of 105 participating teams, and we have been awarded an IEEE Power & Energy Society award. }, How much electricity is going to be consumed for the next 24 hours? What will be the temperature for the next three days? 
What will be the number of sales of a certain product for the next few months? Answering these questions often requires forecasting several future observations from a given sequence of historical observations, called a time series. Historically, time series forecasting has been mainly studied in econometrics and statistics. In the last two decades, machine learning, a field that is concerned with the development of algorithms that can automatically learn from data, has become one of the most active areas of predictive modeling research. This success is largely due to the superior performance of machine learning prediction algorithms in many different applications as diverse as natural language processing, speech recognition and spam detection. However, there has been very little research at the intersection of time series forecasting and machine learning. The goal of this dissertation is to narrow this gap by addressing the problem of multi-step-ahead time series forecasting from the perspective of machine learning. To that end, we propose a series of forecasting strategies based on machine learning algorithms. Multi-step-ahead forecasts can be produced recursively by iterating a one-step-ahead model, or directly using a specific model for each horizon. As a first contribution, we conduct an in-depth study to compare recursive and direct forecasts generated with different learning algorithms for different data generating processes. More precisely, we decompose the multi-step mean squared forecast errors into the bias and variance components, and analyze their behavior over the forecast horizon for different time series lengths. 
The results and observations made in this study then guide us for the development of new forecasting strategies. In particular, we find that choosing between recursive and direct forecasts is not an easy task since it involves a trade-off between bias and estimation variance that depends on many interacting factors, including the learning model, the underlying data generating process, the time series length and the forecast horizon. As a second contribution, we develop multi-stage forecasting strategies that do not treat the recursive and direct strategies as competitors, but seek to combine their best properties. More precisely, the multi-stage strategies generate recursive linear forecasts, and then adjust these forecasts by modeling the multi-step forecast residuals with direct nonlinear models at each horizon, called rectification models. We propose a first multi-stage strategy, that we called the rectify strategy, which estimates the rectification models using the nearest neighbors model. However, because recursive linear forecasts often need small adjustments with real-world time series, we also consider a second multi-stage strategy, called the boost strategy, that estimates the rectification models using gradient boosting algorithms that use so-called weak learners. Generating multi-step forecasts using a different model at each horizon provides a large modeling flexibility. However, selecting these models independently can lead to irregularities in the forecasts that can contribute to increase the forecast variance. The problem is exacerbated with nonlinear machine learning models estimated from short time series. To address this issue, and as a third contribution, we introduce and analyze multi-horizon forecasting strategies that exploit the information contained in other horizons when learning the model for each horizon. 
In particular, to select the lag order and the hyperparameters of each model, multi-horizon strategies minimize forecast errors over multiple horizons rather than just the horizon of interest. We compare all the proposed strategies with both the recursive and direct strategies. We first apply a bias and variance study, then we evaluate the different strategies using real-world time series from two past forecasting competitions. For the rectify strategy, in addition to avoiding the choice between recursive and direct forecasts, the results demonstrate that it has better, or at least has close performance to, the best of the recursive and direct forecasts in different settings. For the multi-horizon strategies, the results emphasize the decrease in variance compared to single-horizon strategies, especially with linear or weakly nonlinear data generating processes. Overall, we found that the accuracy of multi-step-ahead forecasts based on machine learning algorithms can be significantly improved if an appropriate forecasting strategy is used to select the model parameters and to generate the forecasts. Lastly, as a fourth contribution, we have participated in the Load Forecasting track of the Global Energy Forecasting Competition 2012. The competition involved a hierarchical load forecasting problem where we were required to backcast and forecast hourly loads for a US utility with twenty geographical zones. Our team, TinTin, ranked fifth out of 105 participating teams, and we have been awarded an IEEE Power & Energy Society award. |
Taieb, Souhaib Ben Machine learning strategies for multi-step-ahead time series forecasting PhD Thesis 2014, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/209234b, title = {Machine learning strategies for multi-step-ahead time series forecasting}, author = {Souhaib Ben Taieb}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/209234/4/2c5e8bfe-3eab-4c2a-acb0-843504ddfcbd.txt}, year = {2014}, date = {2014-01-01}, abstract = {How much electricity is going to be consumed for the next 24 hours? What will be the temperature for the next three days? What will be the number of sales of a certain product for the next few months? Answering these questions often requires forecasting several future observations from a given sequence of historical observations, called a time series. Historically, time series forecasting has been mainly studied in econometrics and statistics. In the last two decades, machine learning, a field that is concerned with the development of algorithms that can automatically learn from data, has become one of the most active areas of predictive modeling research. This success is largely due to the superior performance of machine learning prediction algorithms in many different applications as diverse as natural language processing, speech recognition and spam detection. However, there has been very little research at the intersection of time series forecasting and machine learning. The goal of this dissertation is to narrow this gap by addressing the problem of multi-step-ahead time series forecasting from the perspective of machine learning. To that end, we propose a series of forecasting strategies based on machine learning algorithms. Multi-step-ahead forecasts can be produced recursively by iterating a one-step-ahead model, or directly using a specific model for each horizon. 
As a first contribution, we conduct an in-depth study to compare recursive and direct forecasts generated with different learning algorithms for different data generating processes. More precisely, we decompose the multi-step mean squared forecast errors into the bias and variance components, and analyze their behavior over the forecast horizon for different time series lengths. The results and observations made in this study then guide us for the development of new forecasting strategies. In particular, we find that choosing between recursive and direct forecasts is not an easy task since it involves a trade-off between bias and estimation variance that depends on many interacting factors, including the learning model, the underlying data generating process, the time series length and the forecast horizon. As a second contribution, we develop multi-stage forecasting strategies that do not treat the recursive and direct strategies as competitors, but seek to combine their best properties. More precisely, the multi-stage strategies generate recursive linear forecasts, and then adjust these forecasts by modeling the multi-step forecast residuals with direct nonlinear models at each horizon, called rectification models. We propose a first multi-stage strategy, that we called the rectify strategy, which estimates the rectification models using the nearest neighbors model. However, because recursive linear forecasts often need small adjustments with real-world time series, we also consider a second multi-stage strategy, called the boost strategy, that estimates the rectification models using gradient boosting algorithms that use so-called weak learners. Generating multi-step forecasts using a different model at each horizon provides a large modeling flexibility. However, selecting these models independently can lead to irregularities in the forecasts that can contribute to increase the forecast variance. 
The problem is exacerbated with nonlinear machine learning models estimated from short time series. To address this issue, and as a third contribution, we introduce and analyze multi-horizon forecasting strategies that exploit the information contained in other horizons when learning the model for each horizon. In particular, to select the lag order and the hyperparameters of each model, multi-horizon strategies minimize forecast errors over multiple horizons rather than just the horizon of interest. We compare all the proposed strategies with both the recursive and direct strategies. We first apply a bias and variance study, then we evaluate the different strategies using real-world time series from two past forecasting competitions. For the rectify strategy, in addition to avoiding the choice between recursive and direct forecasts, the results demonstrate that it has better, or at least has close performance to, the best of the recursive and direct forecasts in different settings. For the multi-horizon strategies, the results emphasize the decrease in variance compared to single-horizon strategies, especially with linear or weakly nonlinear data generating processes. Overall, we found that the accuracy of multi-step-ahead forecasts based on machine learning algorithms can be significantly improved if an appropriate forecasting strategy is used to select the model parameters and to generate the forecasts. Lastly, as a fourth contribution, we have participated in the Load Forecasting track of the Global Energy Forecasting Competition 2012. The competition involved a hierarchical load forecasting problem where we were required to backcast and forecast hourly loads for a US utility with twenty geographical zones. Our team, TinTin, ranked fifth out of 105 participating teams, and we have been awarded an IEEE Power & Energy Society award. }, How much electricity is going to be consumed for the next 24 hours? What will be the temperature for the next three days? 
What will be the number of sales of a certain product for the next few months? Answering these questions often requires forecasting several future observations from a given sequence of historical observations, called a time series. Historically, time series forecasting has been mainly studied in econometrics and statistics. In the last two decades, machine learning, a field that is concerned with the development of algorithms that can automatically learn from data, has become one of the most active areas of predictive modeling research. This success is largely due to the superior performance of machine learning prediction algorithms in many different applications as diverse as natural language processing, speech recognition and spam detection. However, there has been very little research at the intersection of time series forecasting and machine learning. The goal of this dissertation is to narrow this gap by addressing the problem of multi-step-ahead time series forecasting from the perspective of machine learning. To that end, we propose a series of forecasting strategies based on machine learning algorithms. Multi-step-ahead forecasts can be produced recursively by iterating a one-step-ahead model, or directly using a specific model for each horizon. As a first contribution, we conduct an in-depth study to compare recursive and direct forecasts generated with different learning algorithms for different data generating processes. More precisely, we decompose the multi-step mean squared forecast errors into the bias and variance components, and analyze their behavior over the forecast horizon for different time series lengths.
The results and observations made in this study then guide us for the development of new forecasting strategies. In particular, we find that choosing between recursive and direct forecasts is not an easy task since it involves a trade-off between bias and estimation variance that depends on many interacting factors, including the learning model, the underlying data generating process, the time series length and the forecast horizon. As a second contribution, we develop multi-stage forecasting strategies that do not treat the recursive and direct strategies as competitors, but seek to combine their best properties. More precisely, the multi-stage strategies generate recursive linear forecasts, and then adjust these forecasts by modeling the multi-step forecast residuals with direct nonlinear models at each horizon, called rectification models. We propose a first multi-stage strategy, that we called the rectify strategy, which estimates the rectification models using the nearest neighbors model. However, because recursive linear forecasts often need small adjustments with real-world time series, we also consider a second multi-stage strategy, called the boost strategy, that estimates the rectification models using gradient boosting algorithms that use so-called weak learners. Generating multi-step forecasts using a different model at each horizon provides a large modeling flexibility. However, selecting these models independently can lead to irregularities in the forecasts that can contribute to increase the forecast variance. The problem is exacerbated with nonlinear machine learning models estimated from short time series. To address this issue, and as a third contribution, we introduce and analyze multi-horizon forecasting strategies that exploit the information contained in other horizons when learning the model for each horizon.
In particular, to select the lag order and the hyperparameters of each model, multi-horizon strategies minimize forecast errors over multiple horizons rather than just the horizon of interest. We compare all the proposed strategies with both the recursive and direct strategies. We first apply a bias and variance study, then we evaluate the different strategies using real-world time series from two past forecasting competitions. For the rectify strategy, in addition to avoiding the choice between recursive and direct forecasts, the results demonstrate that it has better, or at least has close performance to, the best of the recursive and direct forecasts in different settings. For the multi-horizon strategies, the results emphasize the decrease in variance compared to single-horizon strategies, especially with linear or weakly nonlinear data generating processes. Overall, we found that the accuracy of multi-step-ahead forecasts based on machine learning algorithms can be significantly improved if an appropriate forecasting strategy is used to select the model parameters and to generate the forecasts. Lastly, as a fourth contribution, we have participated in the Load Forecasting track of the Global Energy Forecasting Competition 2012. The competition involved a hierarchical load forecasting problem where we were required to backcast and forecast hourly loads for a US utility with twenty geographical zones. Our team, TinTin, ranked fifth out of 105 participating teams, and we have been awarded an IEEE Power & Energy Society award. |
Cilia, Elisa; Teso, Stefano; Ammendola, Sergio; Lenaerts, Tom; Passerini, Andrea Predicting virus mutations through statistical relational learning. Journal Article In: BMC bioinformatics, 15 , pp. 309, 2014, (DOI: 10.1186/1471-2105-15-309). @article{info:hdl:2013/186411, title = {Predicting virus mutations through statistical relational learning.}, author = {Elisa Cilia and Stefano Teso and Sergio Ammendola and Tom Lenaerts and Andrea Passerini}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/186411/1/PMC4261881.pdf}, year = {2014}, date = {2014-01-01}, journal = {BMC bioinformatics}, volume = {15}, pages = {309}, abstract = {Background: Viruses are typically characterized by high mutation rates, which allow them to quickly develop drug-resistant mutations. Mining relevant rules from mutation data can be extremely useful to understand the virus adaptation mechanism and to design drugs that effectively counter potentially resistant mutants. Results: We propose a simple statistical relational learning approach for mutant prediction where the input consists of mutation data with drug-resistance information, either as sets of mutations conferring resistance to a certain drug, or as sets of mutants with information on their susceptibility to the drug. The algorithm learns a set of relational rules characterizing drug-resistance and uses them to generate a set of potentially resistant mutants. Learning a weighted combination of rules allows to attach generated mutants with a resistance score as predicted by the statistical relational model and select only the highest scoring ones. Conclusions: Promising results were obtained in generating resistant mutations for both nucleoside and non-nucleoside HIV reverse transcriptase inhibitors.
The approach can be generalized quite easily to learning mutants characterized by more complex rules correlating multiple mutations.}, note = {DOI: 10.1186/1471-2105-15-309}, keywords = {}, pubstate = {published}, tppubtype = {article} } Background: Viruses are typically characterized by high mutation rates, which allow them to quickly develop drug-resistant mutations. Mining relevant rules from mutation data can be extremely useful to understand the virus adaptation mechanism and to design drugs that effectively counter potentially resistant mutants. Results: We propose a simple statistical relational learning approach for mutant prediction where the input consists of mutation data with drug-resistance information, either as sets of mutations conferring resistance to a certain drug, or as sets of mutants with information on their susceptibility to the drug. The algorithm learns a set of relational rules characterizing drug-resistance and uses them to generate a set of potentially resistant mutants. Learning a weighted combination of rules allows to attach generated mutants with a resistance score as predicted by the statistical relational model and select only the highest scoring ones. Conclusions: Promising results were obtained in generating resistant mutations for both nucleoside and non-nucleoside HIV reverse transcriptase inhibitors. The approach can be generalized quite easily to learning mutants characterized by more complex rules correlating multiple mutations. |
Rubio, Lucia; Huculeci, Radu; Buts, Lieven; Vanwetswinkel, Sophie; Lenaerts, Tom; van Nuland, Nico A J (1)H, (13)C, and (15)N backbone and side-chain chemical shift assignments of the free and bound forms of the human PTPN11 second SH2 domain. Journal Article In: Biomolecular NMR Assignments, 8 , 2014, (DOI: 10.1007/s12104-013-9504-4). @article{info:hdl:2013/155960, title = {(1)H, (13)C, and (15)N backbone and side-chain chemical shift assignments of the free and bound forms of the human PTPN11 second SH2 domain.}, author = {Lucia Rubio and Radu Huculeci and Lieven Buts and Sophie Vanwetswinkel and Tom Lenaerts and Nico A J van Nuland}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/155960}, year = {2014}, date = {2014-01-01}, journal = {Biomolecular NMR Assignments}, volume = {8}, abstract = {Src homology 2 (SH2) domains have an important role in the regulation of protein activity and intracellular signaling processes. They are geared to bind to specific phosphotyrosine (pY) motifs, with a substrate sequence specificity depending on the three amino acids immediately C-terminal to the pY. Here we report for the first time the (1)H, (15)N and (13)C backbone and side-chain chemical shift assignments for the C-terminal SH2 domain of the human protein tyrosine phosphatase PTPN11, both in its free and bound forms, where the ligand in the latter corresponds to a specific sequence of the human erythropoietin receptor.}, note = {DOI: 10.1007/s12104-013-9504-4}, keywords = {}, pubstate = {published}, tppubtype = {article} } Src homology 2 (SH2) domains have an important role in the regulation of protein activity and intracellular signaling processes. They are geared to bind to specific phosphotyrosine (pY) motifs, with a substrate sequence specificity depending on the three amino acids immediately C-terminal to the pY.
Here we report for the first time the (1)H, (15)N and (13)C backbone and side-chain chemical shift assignments for the C-terminal SH2 domain of the human protein tyrosine phosphatase PTPN11, both in its free and bound forms, where the ligand in the latter corresponds to a specific sequence of the human erythropoietin receptor. |
Cilia, Elisa; Pancsa, Rita; Tompa, Peter; Lenaerts, Tom; Vranken, Wim The DynaMine webserver: predicting protein dynamics from sequence. Journal Article In: Nucleic acids research, 42 , pp. W264-W270, 2014, (DOI: 10.1093/nar/gku270). @article{info:hdl:2013/186420, title = {The DynaMine webserver: predicting protein dynamics from sequence.}, author = {Elisa Cilia and Rita Pancsa and Peter Tompa and Tom Lenaerts and Wim Vranken}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/186420/1/PMC4086073.pdf}, year = {2014}, date = {2014-01-01}, journal = {Nucleic acids research}, volume = {42}, pages = {W264-W270}, abstract = {Protein dynamics are important for understanding protein function. Unfortunately, accurate protein dynamics information is difficult to obtain: here we present the DynaMine webserver, which provides predictions for the fast backbone movements of proteins directly from their amino-acid sequence. DynaMine rapidly produces a profile describing the statistical potential for such movements at residue-level resolution. The predicted values have meaning on an absolute scale and go beyond the traditional binary classification of residues as ordered or disordered, thus allowing for direct dynamics comparisons between protein regions. Through this webserver, we provide molecular biologists with an efficient and easy to use tool for predicting the dynamical characteristics of any protein of interest, even in the absence of experimental observations. The prediction results are visualized and can be directly downloaded. The DynaMine webserver, including instructive examples describing the meaning of the profiles, is available at http://dynamine.ibsquare.be.}, note = {DOI: 10.1093/nar/gku270}, keywords = {}, pubstate = {published}, tppubtype = {article} } Protein dynamics are important for understanding protein function. 
Unfortunately, accurate protein dynamics information is difficult to obtain: here we present the DynaMine webserver, which provides predictions for the fast backbone movements of proteins directly from their amino-acid sequence. DynaMine rapidly produces a profile describing the statistical potential for such movements at residue-level resolution. The predicted values have meaning on an absolute scale and go beyond the traditional binary classification of residues as ordered or disordered, thus allowing for direct dynamics comparisons between protein regions. Through this webserver, we provide molecular biologists with an efficient and easy to use tool for predicting the dynamical characteristics of any protein of interest, even in the absence of experimental observations. The prediction results are visualized and can be directly downloaded. The DynaMine webserver, including instructive examples describing the meaning of the profiles, is available at http://dynamine.ibsquare.be. |
Gagliolo, Matteo; Lenaerts, Tom; Jacobs, Dirk Politics Matters: Dynamics of Inter-organizational Networks among Immigrant Associations Journal Article In: Studies in Computational Intelligence, 549 , pp. 47-55, 2014, (DOI: 10.1007/978-3-319-05401-8_5). @article{info:hdl:2013/264842, title = {Politics Matters: Dynamics of Inter-organizational Networks among Immigrant Associations}, author = {Matteo Gagliolo and Tom Lenaerts and Dirk Jacobs}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/264842/3/Gagliolo2014CompleNet.pdf}, year = {2014}, date = {2014-01-01}, journal = {Studies in Computational Intelligence}, volume = {549}, pages = {47-55}, abstract = {We model the dynamics of the two-mode network among directors and boards of voluntary associations, using a stochastic actor-based model, SIENA [12], including the structural effects proposed in [6], and considering the political orientation of associations as a covariate. Using data from [14], we compare the evolution of interlocks among Turkish associations in two European capitals, and explain the noticeable difference in structure by looking at statistically significant differences among the estimated effects.}, note = {DOI: 10.1007/978-3-319-05401-8_5}, keywords = {}, pubstate = {published}, tppubtype = {article} } We model the dynamics of the two-mode network among directors and boards of voluntary associations, using a stochastic actor-based model, SIENA [12], including the structural effects proposed in [6], and considering the political orientation of associations as a covariate. Using data from [14], we compare the evolution of interlocks among Turkish associations in two European capitals, and explain the noticeable difference in structure by looking at statistically significant differences among the estimated effects. |
Lenaerts, Tom; Giacobini, Mario; Bersini, Hugues; Bourgine, Paul; Dorigo, Marco; Doursat, René Special issue for the 20th anniversary of the European conference on artificial life (ECAL 2011): Editorial Journal Article In: Artificial life, 20 (1), pp. 1-3, 2014, (DOI: 10.1162/ARTL-a-00093). @article{info:hdl:2013/204002, title = {Special issue for the 20th anniversary of the European conference on artificial life (ECAL 2011): Editorial}, author = {Tom Lenaerts and Mario Giacobini and Hugues Bersini and Paul Bourgine and Marco Dorigo and René Doursat}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/204002}, year = {2014}, date = {2014-01-01}, journal = {Artificial life}, volume = {20}, number = {1}, pages = {1-3}, note = {DOI: 10.1162/ARTL-a-00093}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Trepo, Eric; Nahon, Pierre; Bontempi, Gianluca; Valenti, Luca; Falleti, Edmondo; Nischalke, Hans Dieter; Hamza, Samia; Corradini, Stefano Ginanni; Burza, Maria Antonella; Guyot, Erwan; Donati, Benedetta; Spengler, Ulrich; Hillon, Patrick; Toniutto, Pierluigi; Henrion, Jean; Franchimont, Denis; Devière, Jacques; Mathurin, Philippe; Moreno, Christophe; Romeo, Stefano; Deltenre, Pierre Association between the PNPLA3 (rs738409 C>G) variant and hepatocellular carcinoma: Evidence from a meta-analysis of individual participant data. Journal Article In: Hepatology, 59 (6), pp. 2170-2177, 2014, (DOI: 10.1002/hep.26767). @article{info:hdl:2013/196264, title = {Association between the PNPLA3 (rs738409 C>G) variant and hepatocellular carcinoma: Evidence from a meta-analysis of individual participant data.}, author = {Eric Trepo and Pierre Nahon and Gianluca Bontempi and Luca Valenti and Edmondo Falleti and Hans Dieter Nischalke and Samia Hamza and Stefano Ginanni Corradini and Maria Antonella Burza and Erwan Guyot and Benedetta Donati and Ulrich Spengler and Patrick Hillon and Pierluigi Toniutto and Jean Henrion and Denis Franchimont and Jacques Devi{\`e}re and Philippe Mathurin and Christophe Moreno and Stefano Romeo and Pierre Deltenre}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/196264/3/196264.pdf}, year = {2014}, date = {2014-01-01}, journal = {Hepatology}, volume = {59}, number = {6}, pages = {2170-2177}, abstract = {The incidence of hepatocellular carcinoma (HCC) is increasing in Western countries. Although several clinical factors have been identified, many individuals never develop HCC, suggesting a genetic susceptibility. However, to date, only a few single-nucleotide polymorphisms have been reproducibly shown to be linked to HCC onset. A variant (rs738409 C>G, encoding for p.I148M) in the PNPLA3 gene is associated with liver damage in chronic liver diseases. Interestingly, several studies have reported that the minor rs738409[G] allele is more represented in HCC cases in chronic hepatitis C (CHC) and alcoholic liver disease (ALD).
However, a significant association with HCC related to CHC has not been consistently observed, and the strength of the association between rs738409 and HCC remains unclear. We performed a meta-analysis of individual participant data including 2,503 European patients with cirrhosis to assess the association between rs738409 and HCC, particularly in ALD and CHC. We found that rs738409 was strongly associated with overall HCC (odds ratio [OR] per G allele, additive model=1.77; 95% confidence interval [CI]: 1.42-2.19; P=2.78 × 10^{-7}). This association was more pronounced in ALD (OR=2.20; 95% CI: 1.80-2.67; P=4.71 × 10^{-15}) than in CHC patients (OR=1.55; 95% CI: 1.03-2.34; P=3.52 × 10^{-2}). After adjustment for age, sex, and body mass index, the variant remained strongly associated with HCC.}, note = {DOI: 10.1002/hep.26767}, keywords = {}, pubstate = {published}, tppubtype = {article} } The incidence of hepatocellular carcinoma (HCC) is increasing in Western countries. Although several clinical factors have been identified, many individuals never develop HCC, suggesting a genetic susceptibility. However, to date, only a few single-nucleotide polymorphisms have been reproducibly shown to be linked to HCC onset. A variant (rs738409 C>G, encoding for p.I148M) in the PNPLA3 gene is associated with liver damage in chronic liver diseases. Interestingly, several studies have reported that the minor rs738409[G] allele is more represented in HCC cases in chronic hepatitis C (CHC) and alcoholic liver disease (ALD). However, a significant association with HCC related to CHC has not been consistently observed, and the strength of the association between rs738409 and HCC remains unclear. We performed a meta-analysis of individual participant data including 2,503 European patients with cirrhosis to assess the association between rs738409 and HCC, particularly in ALD and CHC.
We found that rs738409 was strongly associated with overall HCC (odds ratio [OR] per G allele, additive model=1.77; 95% confidence interval [CI]: 1.42-2.19; P=2.78 × 10^{-7}). This association was more pronounced in ALD (OR=2.20; 95% CI: 1.80-2.67; P=4.71 × 10^{-15}) than in CHC patients (OR=1.55; 95% CI: 1.03-2.34; P=3.52 × 10^{-2}). After adjustment for age, sex, and body mass index, the variant remained strongly associated with HCC. |
Lerman, Liran; Medeiros, Stéphane Fernandes; Bontempi, Gianluca; Markowitch, Olivier A Machine Learning Approach Against a Masked AES Journal Article In: Lecture notes in computer science, 8419 , pp. 61-75, 2014, (Language of publication: en). @article{info:hdl:2013/223763, title = {A Machine Learning Approach Against a Masked AES}, author = {Liran Lerman and Stéphane Fernandes Medeiros and Gianluca Bontempi and Olivier Markowitch}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/223763}, year = {2014}, date = {2014-01-01}, journal = {Lecture notes in computer science}, volume = {8419}, pages = {61-75}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Lopes, Miguel; Kutlu, Burak; Miani, Michela; Bang-Berthelsen, Claus H; Størling, Joachim; Pociot, Flemming; Goodman, Nathan; Hood, Lee; Welsh, Nils; Bontempi, Gianluca; Eizirik, Decio L Temporal profiling of cytokine-induced genes in pancreatic beta-cells by meta-analysis and network inference. Journal Article In: Genomics, 2014, (DOI: 10.1016/j.ygeno.2013.12.007). @article{info:hdl:2013/163447, title = {Temporal profiling of cytokine-induced genes in pancreatic beta-cells by meta-analysis and network inference.}, author = {Miguel Lopes and Burak Kutlu and Michela Miani and Claus H Bang-Berthelsen and Joachim Størling and Flemming Pociot and Nathan Goodman and Lee Hood and Nils Welsh and Gianluca Bontempi and Decio L Eizirik}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/163447/1/Elsevier_147077.pdf}, year = {2014}, date = {2014-01-01}, journal = {Genomics}, abstract = {Type 1 Diabetes (T1D) is an autoimmune disease where local release of cytokines such as IL-1β and IFN-γ contributes to β-cell apoptosis. To identify relevant genes regulating this process we performed a meta-analysis of 8 datasets of β-cell gene expression after exposure to IL-1β and IFN-γ. Two of these datasets are novel and contain time-series expressions in human islet cells and rat INS-1E cells. Genes were ranked according to their differential expression within and after 24h from exposure, and characterized by function and prior knowledge in the literature. A regulatory network was then inferred from the human time expression datasets, using a time-series extension of a network inference method. The two most differentially expressed genes previously unknown in T1D literature (RIPK2 and ELF3) were found to modulate cytokine-induced apoptosis.
The inferred regulatory network is thus supported by the experimental validation, providing a proof-of-concept for the proposed statistical inference approach.}, note = {DOI: 10.1016/j.ygeno.2013.12.007}, keywords = {}, pubstate = {published}, tppubtype = {article} } Type 1 Diabetes (T1D) is an autoimmune disease where local release of cytokines such as IL-1β and IFN-γ contributes to β-cell apoptosis. To identify relevant genes regulating this process we performed a meta-analysis of 8 datasets of β-cell gene expression after exposure to IL-1β and IFN-γ. Two of these datasets are novel and contain time-series expressions in human islet cells and rat INS-1E cells. Genes were ranked according to their differential expression within and after 24h from exposure, and characterized by function and prior knowledge in the literature. A regulatory network was then inferred from the human time expression datasets, using a time-series extension of a network inference method. The two most differentially expressed genes previously unknown in T1D literature (RIPK2 and ELF3) were found to modulate cytokine-induced apoptosis. The inferred regulatory network is thus supported by the experimental validation, providing a proof-of-concept for the proposed statistical inference approach. |
Lerman, Liran; Bontempi, Gianluca; Markowitch, Olivier Power analysis attack: an approach based on machine learning Journal Article In: International Journal of Applied Cryptography, 3 (2), pp. 97-115, 2014, (DOI: 10.1504/IJACT.2014.062722). @article{info:hdl:2013/163770, title = {Power analysis attack: an approach based on machine learning}, author = {Liran Lerman and Gianluca Bontempi and Olivier Markowitch}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/163770}, year = {2014}, date = {2014-01-01}, journal = {International Journal of Applied Cryptography}, volume = {3}, number = {2}, pages = {97-115}, abstract = {In cryptography, a side-channel attack is any attack based on the analysis of measurements related to the physical implementation of a cryptosystem. Nowadays, the possibility of collecting a large amount of observations paves the way to the adoption of machine learning techniques, i.e., techniques able to extract information and patterns from large datasets. The use of statistical techniques for side-channel attacks is not new. Techniques like the template attack have shown their effectiveness in recent years. However, these techniques rely on parametric assumptions and are often limited to small dimensionality settings, which limit their range of application. This paper explores the use of machine learning techniques to relax such assumptions and to deal with high dimensional feature vectors. © 2014 Inderscience Enterprises Ltd.}, note = {DOI: 10.1504/IJACT.2014.062722}, keywords = {}, pubstate = {published}, tppubtype = {article} } In cryptography, a side-channel attack is any attack based on the analysis of measurements related to the physical implementation of a cryptosystem. Nowadays, the possibility of collecting a large amount of observations paves the way to the adoption of machine learning techniques, i.e., techniques able to extract information and patterns from large datasets. 
The use of statistical techniques for side-channel attacks is not new. Techniques like the template attack have shown their effectiveness in recent years. However, these techniques rely on parametric assumptions and are often limited to small dimensionality settings, which limit their range of application. This paper explores the use of machine learning techniques to relax such assumptions and to deal with high dimensional feature vectors. © 2014 Inderscience Enterprises Ltd. |
Olsen, Catharina; Fleming, Kathleen; Prendergast, Niall; Rubio, Renee; Emmert-Streib, Frank; Bontempi, Gianluca; Haibe-Kains, Benjamin; Quackenbush, John Inference and validation of predictive gene networks from biomedical literature and gene expression data Journal Article In: Genomics, 103 , pp. 329-336, 2014, (DOI: 10.1016/j.ygeno.2014.03.004). @article{info:hdl:2013/172095, title = {Inference and validation of predictive gene networks from biomedical literature and gene expression data}, author = {Catharina Olsen and Kathleen Fleming and Niall Prendergast and Renee Rubio and Frank Emmert-Streib and Gianluca Bontempi and Benjamin Haibe-Kains and John Quackenbush}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/172095/1/Elsevier_155725.pdf}, year = {2014}, date = {2014-01-01}, journal = {Genomics}, volume = {103}, pages = {329-336}, abstract = {Although many methods have been developed for inference of biological networks, the validation of the resulting models has largely remained an unsolved problem. Here we present a framework for quantitative assessment of inferred gene interaction networks using knock-down data from cell line experiments. Using this framework we are able to show that network inference based on integration of prior knowledge derived from the biomedical literature with genomic data significantly improves the quality of inferred networks relative to other approaches. Our results also suggest that cell line experiments can be used to quantitatively assess the quality of networks inferred from tumor samples. © 2014.}, note = {DOI: 10.1016/j.ygeno.2014.03.004}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Olsen, Catharina; Bontempi, Gianluca; Emmert-Streib, Frank; Quackenbush, John; Haibe-Kains, Benjamin Relevance of different prior knowledge sources for inferring gene interaction networks Journal Article In: Frontiers in Genetics, 5 (177), 2014, (DOI: 10.3389/fgene.2014.00177). @article{info:hdl:2013/172097, title = {Relevance of different prior knowledge sources for inferring gene interaction networks}, author = {Catharina Olsen and Gianluca Bontempi and Frank Emmert-Streib and John Quackenbush and Benjamin Haibe-Kains}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/172097/3/PMC4067568.pdf}, year = {2014}, date = {2014-01-01}, journal = {Frontiers in Genetics}, volume = {5}, number = {177}, abstract = {When inferring networks from high-throughput genomic data, one of the main challenges is the subsequent validation of these networks. In the best case scenario, the true network is partially known from previous research results published in structured databases or research articles. Traditionally, inferred networks are validated against these known interactions. Whenever the recovery rate is gauged to be high enough, subsequent high scoring but unknown inferred interactions are deemed good candidates for further experimental validation. Therefore such a validation framework strongly depends on the quantity and quality of published interactions and presents serious pitfalls: (1) availability of these known interactions for the studied problem might be sparse; (2) quantitatively comparing different inference algorithms is not trivial; and (3) the use of these known interactions for validation prevents their integration in the inference procedure. The latter is particularly relevant as it has recently been shown that integration of priors during network inference significantly improves the quality of inferred networks. To overcome these problems when validating inferred networks, we recently proposed a data-driven validation framework based on single gene knock-down experiments. Using this framework, we were able to demonstrate the benefits of integrating prior knowledge and expression data. In this paper we used this framework to assess the quality of different sources of prior knowledge on their own and in combination with different genomic data sets in colorectal cancer. We observed that most prior sources lead to significant F-scores. Furthermore, their integration with genomic data leads to a significant increase in F-scores, especially for priors extracted from full text PubMed articles, known co-expression modules and genetic interactions. Lastly, we observed that the results are consistent for three different data sets: experimental knock-down data and two human tumor data sets. © 2014 Olsen, Bontempi, Emmert-Streib, Quackenbush and Haibe-Kains.}, note = {DOI: 10.3389/fgene.2014.00177}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Lopes, Miguel; Bontempi, Gianluca On the null distribution of the precision and recall curve Journal Article In: Lecture notes in computer science, 8725 LNAI (PART 2), pp. 322-337, 2014, (DOI: 10.1007/978-3-662-44851-9_21). @article{info:hdl:2013/187934, title = {On the null distribution of the precision and recall curve}, author = {Miguel Lopes and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/187934}, year = {2014}, date = {2014-01-01}, journal = {Lecture notes in computer science}, volume = {8725 LNAI}, number = {PART 2}, pages = {322-337}, abstract = {Precision recall curves (pr-curves) and the associated area under (AUPRC) are commonly used to assess the accuracy of information retrieval (IR) algorithms. An informative baseline is random selection. The associated probability distribution makes it possible to assess pr-curve significancy (as a p-value relative to the null of random). To our knowledge, no analytical expression of the null distribution of empirical pr-curves is available, and the only measure of significancy used in the literature relies on non-parametric Monte Carlo simulations. In this paper, we derive analytically the expected null pr-curve and AUPRC, for different interpolation strategies. The AUPRC variance is also derived, and we use it to propose a continuous approximation to the null AUPRC distribution, based on the beta distribution. Properties of the empirical pr-curve and common interpolation strategies are also discussed. © 2014 Springer-Verlag.}, note = {DOI: 10.1007/978-3-662-44851-9_21}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Pozzolo, Andrea Dal; Le Borgne, Yann-Aël; Bontempi, Gianluca; Caelen, Olivier; Waterschoot, Serge Learned lessons in credit card fraud detection from a practitioner perspective Journal Article In: Expert systems with applications, 41 (10), pp. 4915-4928, 2014, (DOI: 10.1016/j.eswa.2014.02.026). @article{info:hdl:2013/183504, title = {Learned lessons in credit card fraud detection from a practitioner perspective}, author = {Andrea Dal Pozzolo and Yann-A{\"e}l Le Borgne and Gianluca Bontempi and Olivier Caelen and Serge Waterschoot}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/183504/1/Elsevier_167131.pdf}, year = {2014}, date = {2014-01-01}, journal = {Expert systems with applications}, volume = {41}, number = {10}, pages = {4915-4928}, abstract = {Billions of dollars of loss are caused every year due to fraudulent credit card transactions. The design of efficient fraud detection algorithms is key for reducing these losses, and more algorithms rely on advanced machine learning techniques to assist fraud investigators. The design of fraud detection algorithms is however particularly challenging due to non-stationary distribution of the data, highly imbalanced classes distributions and continuous streams of transactions. At the same time public data are scarcely available for confidentiality issues, leaving unanswered many questions about which is the best strategy to deal with them. In this paper we provide some answers from the practitioner's perspective by focusing on three crucial issues: unbalancedness, non-stationarity and assessment. The analysis is made possible by a real credit card dataset provided by our industrial partner. © 2014 Elsevier Ltd. All rights reserved.}, note = {DOI: 10.1016/j.eswa.2014.02.026}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Pozzolo, Andrea Dal; Johnson, Reid; Caelen, Olivier; Waterschoot, Serge S; Chawla, Nitesh V; Bontempi, Gianluca Using HDDT to avoid instances propagation in unbalanced and evolving data streams Inproceedings In: Neural Networks (IJCNN), 2014 International Joint Conference on, pp. 588-594, 2014, (Language of publication: en). @inproceedings{info:hdl:2013/221667, title = {Using HDDT to avoid instances propagation in unbalanced and evolving data streams}, author = {Andrea Dal Pozzolo and Reid Johnson and Olivier Caelen and Serge S Waterschoot and Nitesh V Chawla and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/221667}, year = {2014}, date = {2014-01-01}, booktitle = {Neural Networks (IJCNN), 2014 International Joint Conference on}, pages = {588-594}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Cilia, Elisa; Pancsa, Rita; Tompa, Peter; Lenaerts, Tom; Vranken, Wim DynaMine: Sequence-based Protein Backbone Dynamics and Disorder Prediction Miscellaneous 2014, (Conference: 11th Bioinformatics ITalian Society annual meeting(26-28 February 2014: Rome, Italy)). @misc{info:hdl:2013/243667, title = {DynaMine: Sequence-based Protein Backbone Dynamics and Disorder Prediction}, author = {Elisa Cilia and Rita Pancsa and Peter Tompa and Tom Lenaerts and Wim Vranken}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243667}, year = {2014}, date = {2014-01-01}, note = {Conference: 11th Bioinformatics ITalian Society annual meeting(26-28 February 2014: Rome, Italy)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Gazzo, Andrea; Daneels, Dorien; Smits, Guillaume; Dooren, Sonia Van; Cilia, Elisa; Lenaerts, Tom DIDA: A first database on digenic diseases Miscellaneous 2014, (Conference: Benelux Bioinformatics Conference(8-9 December 2014: Luxembourg)). @misc{info:hdl:2013/243674, title = {DIDA: A first database on digenic diseases}, author = {Andrea Gazzo and Dorien Daneels and Guillaume Smits and Sonia Van Dooren and Elisa Cilia and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243674}, year = {2014}, date = {2014-01-01}, note = {Conference: Benelux Bioinformatics Conference(8-9 December 2014: Luxembourg)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Han, The Anh T A H; Pereira, Luís Moniz; Santos, Francisco C; Lenaerts, Tom Evolution of Pairwise Commitment and Cooperation Miscellaneous 2014, (Conference: European Conference on Complex Systems(22-26 September 2014: Lucca, Italy)). @misc{info:hdl:2013/243669, title = {Evolution of Pairwise Commitment and Cooperation}, author = {The Anh T A H Han and Luís Moniz Pereira and Francisco C Santos and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243669}, year = {2014}, date = {2014-01-01}, note = {Conference: European Conference on Complex Systems(22-26 September 2014: Lucca, Italy)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Martinez-Vaquero, Luis L A; Han, The Anh T A H; Lenaerts, Tom Promotion of cooperation through commitments in repeated games Miscellaneous 2014, (Conference: Spatial Human Cooperation: from theory to experiments and back(26-28 May 2014: Plon, Germany)). @misc{info:hdl:2013/243673, title = {Promotion of cooperation through commitments in repeated games}, author = {Luis L A Martinez-Vaquero and The Anh T A H Han and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243673}, year = {2014}, date = {2014-01-01}, note = {Conference: Spatial Human Cooperation: from theory to experiments and back(26-28 May 2014: Plon, Germany)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Huculeci, Radu; Cilia, Elisa; Buts, Lieven; Houben, Klaartje; van Nuland, Nico A J; Lenaerts, Tom Unravelling the Intra-Protein Communication Pathway within the Fyn SH2 Domain Miscellaneous 2014, (Conference: Annual VIB Seminar(29-30 April 2014: Blankenberge, Belgium)). @misc{info:hdl:2013/243671, title = {Unravelling the Intra-Protein Communication Pathway within the Fyn SH2 Domain}, author = {Radu Huculeci and Elisa Cilia and Lieven Buts and Klaartje Houben and Nico A J van Nuland and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243671}, year = {2014}, date = {2014-01-01}, note = {Conference: Annual VIB Seminar(29-30 April 2014: Blankenberge, Belgium)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Vranken, Wim; Lenaerts, Tom; Cilia, Elisa Deriving networks of reciprocal amino acid influences on dynamics and secondary structure Miscellaneous 2014, (Conference: BeNeLux Bioinformatics Conference(8-9 December 2014: Luxembourg)). @misc{info:hdl:2013/243675, title = {Deriving networks of reciprocal amino acid influences on dynamics and secondary structure}, author = {Wim Vranken and Tom Lenaerts and Elisa Cilia}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243675}, year = {2014}, date = {2014-01-01}, note = {Conference: BeNeLux Bioinformatics Conference(8-9 December 2014: Luxembourg)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Raimondi, Daniele; Lenaerts, Tom; Rooman, Marianne; Vranken, Wim Decriminative sequence alignments for the prediction of snps and indels functions effects on proteins Miscellaneous 2014, (Conference: Benelux Bioinformatics Conference(8-9 December 2014: Luxembourg)). @misc{info:hdl:2013/243672, title = {Decriminative sequence alignments for the prediction of snps and indels functions effects on proteins}, author = {Daniele Raimondi and Tom Lenaerts and Marianne Rooman and Wim Vranken}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243672}, year = {2014}, date = {2014-01-01}, note = {Conference: Benelux Bioinformatics Conference(8-9 December 2014: Luxembourg)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Han, The Anh T A H; Pereira, Luís Moniz; Lenaerts, Tom; Santos, Francisco C Learning to Recognize Intentions Resolves Cooperation dilemma Miscellaneous 2014, (Conference: 23rd annual Belgian-Dutch Conference on Machine Learning(6 June 2014: Brussels, Belgium)). @misc{info:hdl:2013/243668, title = {Learning to Recognize Intentions Resolves Cooperation dilemma}, author = {The Anh T A H Han and Luís Moniz Pereira and Tom Lenaerts and Francisco C Santos}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243668}, year = {2014}, date = {2014-01-01}, note = {Conference: 23rd annual Belgian-Dutch Conference on Machine Learning(6 June 2014: Brussels, Belgium)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Conard, Ashley M; Cilia, Elisa; Lenaerts, Tom Determining the winning SH3 coalition: how cooperative game theory reveals the importance of domain residues in peptide binding Miscellaneous 2014, (Conference: Benelux Bioinformatics Conference(9th: 8-9 December 2014: Luxembourg)). @misc{info:hdl:2013/243676, title = {Determining the winning SH3 coalition: how cooperative game theory reveals the importance of domain residues in peptide binding}, author = {Ashley M Conard and Elisa Cilia and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243676}, year = {2014}, date = {2014-01-01}, note = {Conference: Benelux Bioinformatics Conference(9th: 8-9 December 2014: Luxembourg)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Cilia, Elisa; Pancsa, Rita; Tompa, Peter; Lenaerts, Tom; Vranken, Wim F Dynamine: a web-server for predicting protein dynamics from sequence Miscellaneous 2014, (Conference: The 13th European Conference on Computational Biology(7-10 September 2014: Strasbourg, France)). @misc{info:hdl:2013/243677, title = {Dynamine: a web-server for predicting protein dynamics from sequence}, author = {Elisa Cilia and Rita Pancsa and Peter Tompa and Tom Lenaerts and Wim F Vranken}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243677}, year = {2014}, date = {2014-01-01}, note = {Conference: The 13th European Conference on Computational Biology(7-10 September 2014: Strasbourg, France)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Cilia, Elisa; Raimondi, Daniele; Lenaerts, Tom; Vranken, Wim Applying dynamics-based interaction potentials in a residue network Miscellaneous 2014, (Conference: 13th European Conference on Computational Biology(7-10 September 2014: Strasbourg, France)). @misc{info:hdl:2013/243678, title = {Applying dynamics-based interaction potentials in a residue network}, author = {Elisa Cilia and Daniele Raimondi and Tom Lenaerts and Wim Vranken}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243678}, year = {2014}, date = {2014-01-01}, note = {Conference: 13th European Conference on Computational Biology(7-10 September 2014: Strasbourg, France)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Boes, Olivier; Lenaerts, Tom Improving the Needleman-Wunsch algorithm with the Dynamine predictor. Informatique Masters Thesis 2014, (Language of publication: fr). @mastersthesis{info:hdl:2013/243957, title = {Improving the Needleman-Wunsch algorithm with the Dynamine predictor. Informatique}, author = {Olivier Boes and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243957}, year = {2014}, date = {2014-01-01}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {mastersthesis} } |
Wintjens, Florian; Lenaerts, Tom Machine Learning: Coalition-based naive bayesian classification. Informatique Masters Thesis 2014, (Language of publication: fr). @mastersthesis{info:hdl:2013/243955, title = {Machine Learning: Coalition-based naive bayesian classification. Informatique}, author = {Florian Wintjens and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243955}, year = {2014}, date = {2014-01-01}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {mastersthesis} } |
Estievenart, Quentin; Lenaerts, Tom Predicting secondary structure with dynamine. Informatique Masters Thesis 2014, (Language of publication: fr). @mastersthesis{info:hdl:2013/243956, title = {Predicting secondary structure with dynamine. Informatique}, author = {Quentin Estievenart and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243956}, year = {2014}, date = {2014-01-01}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {mastersthesis} } |
Reggiani, Claudio; Le Borgne, Yann-Aël; Pozzolo, Andrea Dal; Olsen, Catharina; Bontempi, Gianluca Minimum Redundancy Maximum Relevance: MapReduce implementation using Apache Hadoop Miscellaneous 2014, (Conference: Benelearn 2014). @misc{info:hdl:2013/184925, title = {Minimum Redundancy Maximum Relevance: MapReduce implementation using Apache Hadoop}, author = {Claudio Reggiani and Yann-A{\"e}l Le Borgne and Andrea Dal Pozzolo and Catharina Olsen and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/184925}, year = {2014}, date = {2014-01-01}, note = {Conference: Benelearn 2014}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
2013 |
Olsen, Catharina Causal inference and prior integration in bioinformatics using information theory PhD Thesis 2013, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/209401, title = {Causal inference and prior integration in bioinformatics using information theory}, author = {Catharina Olsen}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/209401/1/ba9583ce-51e9-4718-b438-fb816d60aea4.txt}, year = {2013}, date = {2013-01-01}, abstract = {An important problem in bioinformatics is the reconstruction of gene regulatory networks from expression data. The analysis of genomic data stemming from high-throughput technologies such as microarray experiments or RNA-sequencing faces several difficulties. The first major issue is the high variable to sample ratio which is due to a number of factors: a single experiment captures all genes while the number of experiments is restricted by the experiment’s cost, time and patient cohort size. The second problem is that these data sets typically exhibit high amounts of noise. Another important problem in bioinformatics is the question of how the inferred networks’ quality can be evaluated. The current best practice is a two step procedure. In the first step, the highest scoring interactions are compared to known interactions stored in biological databases. The inferred network passes this quality assessment if there is a large overlap with the known interactions. In this case, a second step is carried out in which unknown but high scoring and thus promising new interactions are validated ’by hand’ via laboratory experiments. Unfortunately when integrating prior knowledge in the inference procedure, this validation procedure would be biased by using the same information in both the inference and the validation. Therefore, it would no longer allow an independent validation of the resulting network. The main contribution of this thesis is a complete computational framework that uses experimental knock down data in a cross-validation scheme to both infer and validate directed networks. Its components are i) a method that integrates genomic data and prior knowledge to infer directed networks, ii) its implementation in an R/Bioconductor package and iii) a web application to retrieve prior knowledge from PubMed abstracts and biological databases. To infer directed networks from genomic data and prior knowledge, we propose a two step procedure: First, we adapt the pairwise feature selection strategy mRMR to integrate prior knowledge in order to obtain the network’s skeleton. Then for the subsequent orientation phase of the algorithm, we extend a criterion based on interaction information to include prior knowledge. The implementation of this method is available both as part of the prior retrieval tool Predictive Networks and as a stand-alone R/Bioconductor package named predictionet. Furthermore, we propose a fully data-driven quantitative validation of such directed networks using experimental knock-down data: We start by identifying the set of genes that was truly affected by the perturbation experiment. The rationale of our validation procedure is that these truly affected genes should also be part of the perturbed gene’s childhood in the inferred network. Consequently, we can compute a performance score}, } |
Bontempi, Gianluca; Taieb, Souhaib Ben Statistical foundations of machine learning Book Otexts, Online Books, 2013, (Language of publication: fr). @book{info:hdl:2013/223362, title = {Statistical foundations of machine learning}, author = {Gianluca Bontempi and Souhaib Ben Taieb}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/223362}, year = {2013}, date = {2013-01-01}, publisher = {Otexts, Online Books}, series = {Otexts}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {book} } |
Olsen, Catharina; Haibe-Kains, Benjamin; Quackenbush, John; Bontempi, Gianluca On the Integration of Prior Knowledge in the Inference of Regulatory Networks. Book Chapter In: World Scientific, 2013, (Language of publication: fr). @inbook{info:hdl:2013/223360, title = {On the Integration of Prior Knowledge in the Inference of Regulatory Networks.}, author = {Catharina Olsen and Benjamin Haibe-Kains and John Quackenbush and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/223360}, year = {2013}, date = {2013-01-01}, publisher = {World Scientific}, series = {Biological Data Mining and Its Applications in Healthcare}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {inbook} } |
Olsen, Catharina Causal inference and prior integration in bioinformatics using information theory PhD Thesis 2013, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/209401b, title = {Causal inference and prior integration in bioinformatics using information theory}, author = {Catharina Olsen}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/209401/1/ba9583ce-51e9-4718-b438-fb816d60aea4.txt}, year = {2013}, date = {2013-01-01}, abstract = {An important problem in bioinformatics is the reconstruction of gene regulatory networks from expression data. The analysis of genomic data stemming from high-throughput technologies such as microarray experiments or RNA-sequencing faces several difficulties. The first major issue is the high variable-to-sample ratio, which is due to a number of factors: a single experiment captures all genes while the number of experiments is restricted by the experiment's cost, time and patient cohort size. The second problem is that these data sets typically exhibit high amounts of noise. Another important problem in bioinformatics is the question of how the inferred networks' quality can be evaluated. The current best practice is a two-step procedure. In the first step, the highest scoring interactions are compared to known interactions stored in biological databases. The inferred network passes this quality assessment if there is a large overlap with the known interactions. In this case, a second step is carried out in which unknown but high scoring and thus promising new interactions are validated 'by hand' via laboratory experiments. Unfortunately, when integrating prior knowledge in the inference procedure, this validation procedure would be biased by using the same information in both the inference and the validation. Therefore, it would no longer allow an independent validation of the resulting network. The main contribution of this thesis is a complete computational framework that uses experimental knock-down data in a cross-validation scheme to both infer and validate directed networks. Its components are i) a method that integrates genomic data and prior knowledge to infer directed networks, ii) its implementation in an R/Bioconductor package and iii) a web application to retrieve prior knowledge from PubMed abstracts and biological databases. To infer directed networks from genomic data and prior knowledge, we propose a two-step procedure: First, we adapt the pairwise feature selection strategy mRMR to integrate prior knowledge in order to obtain the network's skeleton. Then, for the subsequent orientation phase of the algorithm, we extend a criterion based on interaction information to include prior knowledge. The implementation of this method is available both as part of the prior retrieval tool Predictive Networks and as a stand-alone R/Bioconductor package named predictionet. Furthermore, we propose a fully data-driven quantitative validation of such directed networks using experimental knock-down data: We start by identifying the set of genes that was truly affected by the perturbation experiment. The rationale of our validation procedure is that these truly affected genes should also be among the perturbed gene's children in the inferred network. Consequently, we can compute a performance score} } |
Govorun, Maria Pension and health insurance, phase-type modeling PhD Thesis 2013, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/209447, title = {Pension and health insurance, phase-type modeling}, author = {Maria Govorun}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/209447/1/ffb25fee-0dd2-47b6-85b3-5b9db8df6d5a.txt}, year = {2013}, date = {2013-01-01}, abstract = {Phase-type models have long been used in several scientific fields to describe systems that can be characterized by different states. These models are well known in queueing theory, economics and insurance. The thesis focuses on various applications of phase-type models in insurance and shows their advantages. In particular, the model of Lin and Liu (2007) is of interest because it describes the ageing process of the human body. An individual's lifetime follows a phase-type distribution, and the states of this model represent health states. The fact that the model links health states to the individual's age makes it very useful in insurance. The main results of the thesis are new models and methods in pension insurance and health insurance that rely on the phase-type assumption to describe an individual's lifetime. In pension insurance, the goal is to estimate the profitability of a pension fund. To this end, we build a "profit-test" model, which requires modelling several characteristics. We describe the evolution of the fund's participants by adapting the ageing model to multiple causes of exit. Estimating future profits requires determining the contribution values for each health state, as well as the seniority and initial health state of each participant. This allows us to obtain the distribution of future profits and to develop methods for estimating longevity risk and the risk of market changes. In addition, we assume that the decrease in mortality rates for pensioners affects future profits more than that for active participants. Therefore, to assess the impact of changes in health on profitability, we model the profits arising from pensioners separately. In health insurance, we use the phase-type model to compute the distribution of the present value of future health costs. We develop recursive algorithms that evaluate this distribution over a short period, using continuous-time fluid models, and over the individual's whole lifetime, by constructing discrete-time models. The three discrete-time models correspond to different assumptions about the costs: in the first model, health costs are assumed to be independent and identically distributed and not to depend on the individual's ageing; in the other two models, costs are assumed to depend on the individual's health state.} } |
Cilia, Elisa; Pancsa, Rita; Tompa, Peter; Lenaerts, Tom; Vranken, Wim From protein sequence to dynamics and disorder with DynaMine. Journal Article In: Nature communications, 4 , pp. 2741, 2013, (DOI: 10.1038/ncomms3741). @article{info:hdl:2013/186425, title = {From protein sequence to dynamics and disorder with DynaMine.}, author = {Elisa Cilia and Rita Pancsa and Peter Tompa and Tom Lenaerts and Wim Vranken}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/186425}, year = {2013}, date = {2013-01-01}, journal = {Nature communications}, volume = {4}, pages = {2741}, abstract = {Protein function and dynamics are closely related; however, accurate dynamics information is difficult to obtain. Here based on a carefully assembled data set derived from experimental data for proteins in solution, we quantify backbone dynamics properties on the amino-acid level and develop DynaMine--a fast, high-quality predictor of protein backbone dynamics. DynaMine uses only protein sequence information as input and shows great potential in distinguishing regions of different structural organization, such as folded domains, disordered linkers, molten globules and pre-structured binding motifs of different sizes. It also identifies disordered regions within proteins with an accuracy comparable to the most sophisticated existing predictors, without depending on prior disorder knowledge or three-dimensional structural information. DynaMine provides molecular biologists with an important new method that grasps the dynamical characteristics of any protein of interest, as we show here for human p53 and E1A from human adenovirus 5.}, note = {DOI: 10.1038/ncomms3741}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Traulsen, Arne; Lenaerts, Tom; Pacheco, Jorge M J M; Dingli, David On the dynamics of neutral mutations in a mathematical model for a homogeneous stem cell population. Journal Article In: Journal of the Royal Society, Interface / the Royal Society, 10 (79), pp. 20120810, 2013, (DOI: 10.1098/rsif.2012.0810). @article{info:hdl:2013/138049, title = {On the dynamics of neutral mutations in a mathematical model for a homogeneous stem cell population.}, author = {Arne Traulsen and Tom Lenaerts and Jorge M J M Pacheco and David Dingli}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/138049}, year = {2013}, date = {2013-01-01}, journal = {Journal of the Royal Society, Interface / the Royal Society}, volume = {10}, number = {79}, pages = {20120810}, abstract = {The theory of the clonal origin of cancer states that a tumour arises from one cell that acquires mutation(s) leading to the malignant phenotype. It is the current belief that many of these mutations give a fitness advantage to the mutant population, allowing it to expand and eventually leading to disease. However, mutations that lead to such a clonal expansion need not give a fitness advantage and may in fact be neutral, or almost neutral, with respect to fitness. Such mutant clones can be eliminated or expand stochastically, leading to a malignant phenotype (disease). Mutations in haematopoietic stem cells give rise to diseases such as chronic myeloid leukaemia (CML) and paroxysmal nocturnal haemoglobinuria (PNH). Although neutral drift often leads to clonal extinction, disease is still possible, and in this case, it has important implications both for the incidence of disease and for therapy, as it may be more difficult to eliminate neutral mutations with therapy. We illustrate the consequences of such dynamics, using CML and PNH as examples. These considerations have implications for many other tumours as well.}, note = {DOI: 10.1098/rsif.2012.0810}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Han, The Anh T A H; Pereira, Luís Moniz; Santos, Francisco C; Lenaerts, Tom Good agreements make good friends. Journal Article In: Scientific reports, 3 , pp. 2695, 2013, (DOI: 10.1038/srep02695). @article{info:hdl:2013/155962, title = {Good agreements make good friends.}, author = {The Anh T A H Han and Luís Moniz Pereira and Francisco C Santos and Tom Lenaerts}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/155962/1/PMC3776200.pdf}, year = {2013}, date = {2013-01-01}, journal = {Scientific reports}, volume = {3}, pages = {2695}, abstract = {When starting a new collaborative endeavor, it pays to establish upfront how strongly your partner commits to the common goal and what compensation can be expected in case the collaboration is violated. Diverse examples in biological and social contexts have demonstrated the pervasiveness of making prior agreements on posterior compensations, suggesting that this behavior could have been shaped by natural selection. Here, we analyze the evolutionary relevance of such a commitment strategy and relate it to the costly punishment strategy, where no prior agreements are made. We show that when the cost of arranging a commitment deal lies within certain limits, substantial levels of cooperation can be achieved. Moreover, these levels are higher than those achieved by simple costly punishment, especially when one insists on sharing the arrangement cost. Not only do we show that good agreements make good friends, but agreements based on shared costs also result in even better outcomes.}, note = {DOI: 10.1038/srep02695}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Haibe-Kains, Benjamin; Desmedt, Christine; Leo, Angelo Di; Azambuja, Evandro; Larsimont, Denis; Selleslags, Jean; Delaloge, Suzette; Duhem, Caroline; Kains, Jean-Pierre; Carly, Birgit; Maerevoet, Marie; Vindevoghel, Anita; Rouas, Ghizlane; Lallemand, Françoise; Durbecq, Virginie; Cardoso, Fatima; Salgado, Roberto; Rovere, Rodrigo Kraft; Bontempi, Gianluca; Michiels, Stefan; Buyse, Marc; Nogaret, Jean-Marie; Qi, Yuan; Symmans, William Fraser; Pusztai, Lajos; D'Hondt, Veronique; Piccart-Gebhart, Martine; Sotiriou, Christos Genome-wide gene expression profiling to predict resistance to anthracyclines in breast cancer patients Journal Article In: Genomics Data, 1 , pp. 7-10, 2013, (DOI: 10.1016/j.gdata.2013.09.001). @article{info:hdl:2013/177704, title = {Genome-wide gene expression profiling to predict resistance to anthracyclines in breast cancer patients}, author = {Benjamin Haibe-Kains and Christine Desmedt and Angelo Di Leo and Evandro Azambuja and Denis Larsimont and Jean Selleslags and Suzette Delaloge and Caroline Duhem and Jean-Pierre Kains and Birgit Carly and Marie Maerevoet and Anita Vindevoghel and Ghizlane Rouas and Fran{\c{c}}oise Lallemand and Virginie Durbecq and Fatima Cardoso and Roberto Salgado and Rodrigo Kraft Rovere and Gianluca Bontempi and Stefan Michiels and Marc Buyse and Jean-Marie Nogaret and Yuan Qi and William Fraser Symmans and Lajos Pusztai and Veronique D'Hondt and Martine Piccart-Gebhart and Christos Sotiriou}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/177704/1/Elsevier_161331.pdf}, year = {2013}, date = {2013-01-01}, journal = {Genomics Data}, volume = {1}, pages = {7-10}, abstract = {Validated biomarkers predictive of response/resistance to anthracyclines in breast cancer are currently lacking. The neoadjuvant Trial of Principle (TOP) study, in which patients with estrogen receptor (ER)-negative tumors were treated with anthracycline (epirubicin) monotherapy, was specifically designed to evaluate the predictive value of topoisomerase II-alpha (TOP2A) and develop a gene expression signature to identify those patients who do not benefit from anthracyclines. Here we describe in detail the contents and quality controls for the gene expression and clinical data associated with the study published by Desmedt and colleagues in the Journal of Clinical Oncology in 2011 (Desmedt et al., 2011). We also provide R code to easily access the data and perform the quality controls and basic analyses relevant to this dataset. © 2013 The Authors.}, note = {DOI: 10.1016/j.gdata.2013.09.001}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Bonnechere, Bruno; Wermenbol, Vanessa; Dan, Bernard; Salvia, Patrick; Le Borgne, Yann-Aël; Bontempi, Gianluca; Vansummeren, Stijn; Sholukha, Victor; Moiseev, Fedor; Jansen, Bart; Rooze, Marcel; Jan, Serge Van Sint Management and interpretation of medical data related to cerebral palsy: the ICT4 Rehab project Journal Article In: European journal of paediatric neurology, 17 (1), pp. 32, 2013, (Language of publication: na). @article{info:hdl:2013/151908, title = {Management and interpretation of medical data related to cerebral palsy: the ICT4 Rehab project}, author = {Bruno Bonnechere and Vanessa Wermenbol and Bernard Dan and Patrick Salvia and Yann-A{\"e}l Le Borgne and Gianluca Bontempi and Stijn Vansummeren and Victor Sholukha and Fedor Moiseev and Bart Jansen and Marcel Rooze and Serge Van Sint Jan}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/151908}, year = {2013}, date = {2013-01-01}, journal = {European journal of paediatric neurology}, volume = {17}, number = {1}, pages = {32}, note = {Language of publication: na}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Dedeurwaerder, Sarah; Defrance, Matthieu; Bizet, Martin; Calonne, Emilie; Bontempi, Gianluca; Fuks, François A comprehensive overview of Infinium Human Methylation450 data processing Journal Article In: Briefings in bioinformatics, 15 (6), pp. 929-941, 2013, (DOI: 10.1093/bib/bbt054). @article{info:hdl:2013/186990, title = {A comprehensive overview of Infinium Human Methylation450 data processing}, author = {Sarah Dedeurwaerder and Matthieu Defrance and Martin Bizet and Emilie Calonne and Gianluca Bontempi and Fran{\c{c}}ois Fuks}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/186990}, year = {2013}, date = {2013-01-01}, journal = {Briefings in bioinformatics}, volume = {15}, number = {6}, pages = {929-941}, abstract = {Infinium HumanMethylation450 beadarray is a popular technology to explore DNA methylomes in health and disease, and there is a current explosion in the use of this technique. Despite experience acquired from gene expression microarrays, analyzing Infinium Methylation arrays appeared more complex than initially thought and several difficulties have been encountered, as those arrays display specific features that need to be taken into consideration during data processing. Here, we review several issues that have been highlighted by the scientific community, and we present an overview of the general data processing scheme and an evaluation of the different normalization methods available to date to guide the 450K users in their analysis and data interpretation.}, note = {DOI: 10.1093/bib/bbt054}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Papillon-Cavanagh, Simon; Jay, Nicolas De; Hachem, Nehme; Olsen, Catharina; Bontempi, Gianluca; Aerts, Hugo J W L; Quackenbush, John; Haibe-Kains, Benjamin Comparison and validation of genomic predictors for anticancer drug sensitivity. Journal Article In: Journal of the American Medical Informatics Association, 20 (4), pp. 597-602, 2013, (DOI: 10.1136/amiajnl-2012-001442). @article{info:hdl:2013/145203, title = {Comparison and validation of genomic predictors for anticancer drug sensitivity.}, author = {Simon Papillon-Cavanagh and Nicolas De Jay and Nehme Hachem and Catharina Olsen and Gianluca Bontempi and Hugo J W L Aerts and John Quackenbush and Benjamin Haibe-Kains}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/145203}, year = {2013}, date = {2013-01-01}, journal = {Journal of the American Medical Informatics Association}, volume = {20}, number = {4}, pages = {597-602}, abstract = {An enduring challenge in personalized medicine lies in selecting the right drug for each individual patient. While testing of drugs on patients in large trials is the only way to assess their clinical efficacy and toxicity, we dramatically lack resources to test the hundreds of drugs currently under development. Therefore the use of preclinical model systems has been intensively investigated as this approach enables response to hundreds of drugs to be tested in multiple cell lines in parallel.}, note = {DOI: 10.1136/amiajnl-2012-001442}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Lerman, Liran; Medeiros, Stéphane Fernandes; Veshchikov, Nikita; Meuter, Cédric; Bontempi, Gianluca; Markowitch, Olivier Semi-Supervised Template Attack Journal Article In: Lecture Notes in Computer Science, 7864 , pp. 184-199, 2013, (DOI: 10.1007/978-3-642-40026-1_12). @article{info:hdl:2013/147304, title = {Semi-Supervised Template Attack}, author = {Liran Lerman and Stéphane Fernandes Medeiros and Nikita Veshchikov and Cédric Meuter and Gianluca Bontempi and Olivier Markowitch}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/147304}, year = {2013}, date = {2013-01-01}, journal = {Lecture Notes in Computer Science}, volume = {7864}, pages = {184-199}, abstract = {Side channel attacks take advantage of information leakages in cryptographic devices. Template attacks form a family of side channel attacks which is reputed to be extremely effective. This kind of attack assumes that the attacker fully controls a cryptographic device before attacking a similar one. In this paper, we propose to relax this assumption by generalizing the template attack using a method based on a semi-supervised learning strategy. The effectiveness of our proposal is confirmed by software simulations, by experiments on an 8-bit microcontroller and by a comparison to a template attack as well as to two supervised machine learning methods.}, note = {DOI: 10.1007/978-3-642-40026-1_12}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Jay, Nicolas De; Papillon-Cavanagh, Simon; Olsen, Catharina; El-Hachem, N; Bontempi, Gianluca; Haibe-Kains, Benjamin mRMRe: an R package for parallelized mRMR ensemble feature selection Journal Article In: Bioinformatics, 2013, (DOI: 10.1093/bioinformatics/btt383). @article{info:hdl:2013/155448, title = {mRMRe: an R package for parallelized mRMR ensemble feature selection}, author = {Nicolas De Jay and Simon Papillon-Cavanagh and Catharina Olsen and N El-Hachem and Gianluca Bontempi and Benjamin Haibe-Kains}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/155448}, year = {2013}, date = {2013-01-01}, journal = {Bioinformatics}, abstract = {Motivation: Feature selection is one of the main challenges in analyzing high-throughput genomic data. Minimum redundancy maximum relevance (mRMR) is a particularly fast feature selection method for finding a set of both relevant and complementary features. Here we describe the mRMRe R package, in which the mRMR technique is extended by using an ensemble approach in order to better explore the feature space and build more robust predictors. To deal with the computational complexity of the ensemble approach, the main functions of the package are implemented and parallelized in C using the OpenMP API. Results: Our ensemble mRMR implementations outperform the classical mRMR approach in terms of prediction accuracy. They identify genes more relevant to the biological context and may lead to richer biological interpretations. The parallelized functions included in the package show significant gains in terms of run-time speed when compared to previously released packages. Availability: The R package mRMRe is available on CRAN and is provided open source under the Artistic-2.0 License. The code used to generate all the results reported in this application note is available from Supplementary File 1. Contact: bhaibeka@ircm.qc.ca. Supplementary Information: Supplementary information is available at Bioinformatics online.}, note = {DOI: 10.1093/bioinformatics/btt383}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
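As a rough illustration of the mRMR idea summarised in the abstract above (choose features that are relevant to the target while penalising redundancy with features already chosen), the following Python sketch performs a plain greedy selection. It is not the mRMRe package's code: the function names are hypothetical, there is no ensemble step or parallelism, and mutual information is approximated from the Pearson correlation under a Gaussian assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant variable carries no information
    return sxy / math.sqrt(sxx * syy)


def gaussian_mi(x, y):
    """Assumed stand-in for mutual information: -0.5 * log(1 - rho^2)."""
    r2 = min(pearson(x, y) ** 2, 1 - 1e-12)  # cap to keep the log finite
    return -0.5 * math.log(1 - r2)


def mrmr_select(features, target, k):
    """Greedy mRMR: features maps a name to its list of values."""
    remaining = list(features)
    relevance = {f: gaussian_mi(features[f], target) for f in remaining}
    selected = []
    while remaining and len(selected) < k:
        def score(f):
            if not selected:
                return relevance[f]
            # Mean redundancy with the features chosen so far.
            redundancy = sum(
                gaussian_mi(features[f], features[s]) for s in selected
            ) / len(selected)
            return relevance[f] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With two nearly identical features and one less correlated feature, the second pick is the less redundant one rather than the duplicate, which is the behaviour the relevance-minus-redundancy score is meant to produce.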
Bontempi, Gianluca; Taieb, Souhaib Ben; Le Borgne, Yann-Aël Machine learning strategies for time series forecasting Journal Article In: Lecture Notes in Business Information Processing, 138 LNBIP , pp. 62-77, 2013, (DOI: 10.1007/978-3-642-36318-4_3). @article{info:hdl:2013/167761, title = {Machine learning strategies for time series forecasting}, author = {Gianluca Bontempi and Souhaib Ben Taieb and Yann-A{\"e}l Le Borgne}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/167761}, year = {2013}, date = {2013-01-01}, journal = {Lecture Notes in Business Information Processing}, volume = {138 LNBIP}, pages = {62-77}, abstract = {The increasing availability of large amounts of historical data and the need of performing accurate forecasting of future behavior in several scientific and applied domains demands the definition of robust and efficient techniques able to infer from observations the stochastic dependency between past and future. The forecasting domain has been influenced, from the 1960s on, by linear statistical methods such as ARIMA models. More recently, machine learning models have drawn attention and have established themselves as serious contenders to classical statistical models in the forecasting community. This chapter presents an overview of machine learning techniques in time series forecasting by focusing on three aspects: the formalization of one-step forecasting problems as supervised learning tasks, the discussion of local learning techniques as an effective tool for dealing with temporal data and the role of the forecasting strategy when we move from one-step to multiple-step forecasting. 
© 2013 Springer-Verlag.}, note = {DOI: 10.1007/978-3-642-36318-4_3}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Lopes, Miguel; Bontempi, Gianluca Experimental assessment of static and dynamic algorithms for gene regulation inference from time series expression data Journal Article In: Frontiers in Genetics, 4 (DEC), 2013, (DOI: 10.3389/fgene.2013.00303). @article{info:hdl:2013/168524, title = {Experimental assessment of static and dynamic algorithms for gene regulation inference from time series expression data}, author = {Miguel Lopes and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/168524}, year = {2013}, date = {2013-01-01}, journal = {Frontiers in Genetics}, volume = {4}, number = {DEC}, abstract = {Accurate inference of causal gene regulatory networks from gene expression data is an open bioinformatics challenge. Gene interactions are dynamical processes and consequently we can expect that the effect of any regulation action occurs after a certain temporal lag. However such lag is unknown a priori and temporal aspects require specific inference algorithms. In this paper we aim to assess the impact of taking into consideration temporal aspects on the final accuracy of the inference procedure. In particular we will compare the accuracy of static algorithms, where no dynamic aspect is considered, to that of fixed lag and adaptive lag algorithms in three inference tasks from microarray expression data. Experimental results show that network inference algorithms that take dynamics into account perform consistently better than static ones, once the considered lags are properly chosen. However, no individual algorithm stands out in all three inference tasks, and the challenging nature of network inference tasks is evidenced, as a large number of the assessed algorithms does not perform better than random. 
© 2013 Lopes and Bontempi.}, note = {DOI: 10.3389/fgene.2013.00303}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Lerman, Liran; Markowitch, Olivier; Bontempi, Gianluca; Taieb, Souhaib Ben A time series approach for profiling attack Journal Article In: Lecture notes in computer science, 8204 , pp. 75-94, 2013, (DOI: 10.1007/978-3-642-41224-0_7). @article{info:hdl:2013/183221, title = {A time series approach for profiling attack}, author = {Liran Lerman and Olivier Markowitch and Gianluca Bontempi and Souhaib Ben Taieb}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/183221}, year = {2013}, date = {2013-01-01}, journal = {Lecture notes in computer science}, volume = {8204}, pages = {75-94}, abstract = {The goal of a profiling attack is to challenge the security of a cryptographic device in the worst case scenario. Though template attack is reputed as the strongest power analysis attack, its effectiveness is strongly dependent on the validity of the Gaussian assumption. This led recently to the appearance of nonparametric approaches, often based on machine learning strategies. Though these approaches outperform template attack, they tend to neglect the potential source of information available in the temporal dependencies between power values. In this paper, we propose an original multi-class profiling attack that takes into account the temporal dependence of power traces. The experimental study shows that the time series analysis approach is competitive and often better than static classification alternatives. © 2013 Springer-Verlag.}, note = {DOI: 10.1007/978-3-642-41224-0_7}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Pozzolo, Andrea Dal; Caelen, Olivier; Waterschoot, Serge; Bontempi, Gianluca Racing for unbalanced methods selection Journal Article In: Lecture notes in computer science, 8206 LNCS , pp. 24-31, 2013, (DOI: 10.1007/978-3-642-41278-3_4). @article{info:hdl:2013/168898, title = {Racing for unbalanced methods selection}, author = {Andrea Dal Pozzolo and Olivier Caelen and Serge Waterschoot and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/168898}, year = {2013}, date = {2013-01-01}, journal = {Lecture notes in computer science}, volume = {8206 LNCS}, pages = {24-31}, abstract = {State-of-the-art classification algorithms suffer when the data is skewed towards one class. This led to the development of a number of techniques to cope with unbalanced data. However, as confirmed by our experimental comparison, no technique appears to work consistently better in all conditions. We propose to use a racing method to select adaptively the most appropriate strategy for a given unbalanced task. The results show that racing is able to adapt the choice of the strategy to the specific nature of the unbalanced problem and to select rapidly the most appropriate strategy without compromising the accuracy. © 2013 Springer-Verlag.}, note = {DOI: 10.1007/978-3-642-41278-3_4}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Jan, Serge Van Sint; Wermenbol, Vanessa; Bogaert, Patrick Van; Desloovere, Kaat; Degelaen, Marc; Dan, Bernard; Salvia, P; Bonnechere, Bruno; Leborgne, Yann-Aël; Bontempi, Gianluca; Vansummeren, Stijn; Sholukha, Victor; Moiseev, Fedor; Rooze, Marcel Recherche intégrée relative à l’appareil musculosquelettique : application à la prise en charge clinique de l’infirmité motrice cérébrale (IMC) – le projet ICT4Rehab Journal Article In: Médecine, 29 (5), pp. 529-536, 2013, (Language of publication: na). @article{info:hdl:2013/185849, title = {Recherche intégrée relative \`a l’appareil musculosquelettique : application \`a la prise en charge clinique de l’infirmité motrice cérébrale (IMC) – le projet ICT4Rehab}, author = {Serge Van Sint Jan and Vanessa Wermenbol and Patrick Van Bogaert and Kaat Desloovere and Marc Degelaen and Bernard Dan and P Salvia and Bruno Bonnechere and Yann-A{\"e}l Leborgne and Gianluca Bontempi and Stijn Vansummeren and Victor Sholukha and Fedor Moiseev and Marcel Rooze}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/185849}, year = {2013}, date = {2013-01-01}, journal = {Médecine}, volume = {29}, number = {5}, pages = {529-536}, note = {Language of publication: na}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Lerat, Jean-Sébastien; Han, The Anh T A H; Lenaerts, Tom Evolution of Common-Pool Resources and Social Welfare in Structured Populations Inproceedings In: Joint Conference on Artificial Intelligence: Twenty-Third International Joint Conference on Artificial Intelligence, pp. 2848-2854, Association for the Advancement of Artificial Intelligence, 2013, (Conference: (August 3-9 2013: Beijing)). @inproceedings{info:hdl:2013/149489, title = {Evolution of Common-Pool Resources and Social Welfare in Structured Populations}, author = {Jean-Sébastien Lerat and The Anh T A H Han and Tom Lenaerts}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/149489/1/IJCAI13-419.pdf}, year = {2013}, date = {2013-01-01}, booktitle = {Joint Conference on Artificial Intelligence: Twenty-Third International Joint Conference on Artificial Intelligence}, pages = {2848-2854}, publisher = {Association for the Advancement of Artificial Intelligence}, series = {Agent-Based and Multiagent Systems}, abstract = {The Common-pool resource (CPR) game is a social dilemma where agents have to decide how to consume a shared CPR. Either they each take their cut, completely destroying the CPR, or they restrain themselves, gaining less immediate profit but sustaining the resource and future profit. When no consumption takes place the CPR simply grows to its carrying capacity. As such, this dilemma provides a framework to study the evolution of social consumption strategies and the sustainability of resources, whose size adjusts dynamically through consumption and their own implicit population dynamics. The present study provides for the first time a detailed analysis of the evolutionary dynamics of consumption strategies in finite populations, focusing on the interplay between the resource levels and preferred consumption strategies. We show analytically which restrained consumers survive in relation to the growth rate of the resources and how this affects the resources' carrying capacity. 
Second, we show that population structures affect the sustainability of the resources and social welfare in the population. Current results provide an initial insight into the complexity of the CPR game, showing potential for a variety of different studies in the context of social welfare and resource sustainability.}, note = {Conference: (August 3-9 2013: Beijing)}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Han, The Anh T A H; Pereira, Luís Moniz; Santos, F C; Lenaerts, Tom Why is it so hard to say sorry: evolution of apology with commitment in the iterated Prisoner’s Dilemma Inproceedings In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2013, (Language of publication: na). @inproceedings{info:hdl:2013/155970, title = {Why is it so hard to say sorry: evolution of apology with commitment in the iterated Prisoner’s Dilemma}, author = {The Anh T A H Han and Luís Moniz Pereira and F C Santos and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/155970}, year = {2013}, date = {2013-01-01}, booktitle = {Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI)}, note = {Language of publication: na}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Bonnechere, Bruno; Wermenbol, Vanessa; Dan, Bernard; Salvia, Patrick; Le Borgne, Yann-Aël; Bontempi, Gianluca; Vansummeren, Stijn; Sholukha, Victor; Moiseev, Fedor; Jansen, Bart; Rooze, Marcel; Jan, Serge Van Sint Management and interpretation of medical data related to Cerebral Palsy : the ICT4Rehab project. Inproceedings In: EPNS, 2013, (Conference: (2013: Brussels, Belgium)). @inproceedings{info:hdl:2013/151891, title = {Management and interpretation of medical data related to Cerebral Palsy : the ICT4Rehab project.}, author = {Bruno Bonnechere and Vanessa Wermenbol and Bernard Dan and Patrick Salvia and Yann-A{\"e}l Le Borgne and Gianluca Bontempi and Stijn Vansummeren and Victor Sholukha and Fedor Moiseev and Bart Jansen and Marcel Rooze and Serge Van Sint Jan}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/151891}, year = {2013}, date = {2013-01-01}, booktitle = {EPNS}, note = {Conference: (2013: Brussels, Belgium)}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Le Borgne, Yann-Aël; Bontempi, Gianluca; Bonnechere, Bruno; Salvia, Patrick; Degelaen, Marc; Jan, Serge Van Sint Data mining toolbox for gait analysis in children with cerebral palsy Inproceedings In: ISB, 2013, (Conference: (2013: Natal, Brazil)). @inproceedings{info:hdl:2013/151884, title = {Data mining toolbox for gait analysis in children with cerebral palsy}, author = {Yann-A{\"e}l Le Borgne and Gianluca Bontempi and Bruno Bonnechere and Patrick Salvia and Marc Degelaen and Serge Van Sint Jan}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/151884}, year = {2013}, date = {2013-01-01}, booktitle = {ISB}, note = {Conference: (2013: Natal, Brazil)}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Jan, Serge Van Sint; Wermenbol, Vanessa; Dan, Bernard; Salvia, Patrick; Bonnechere, Bruno; Le Borgne, Yann-Aël; Bontempi, Gianluca; Vansummeren, Stijn; Sholukha, Victor; Moiseev, Fedor; Jansen, Bart; Rooze, Marcel Développement d'instruments partagés pour la gestion et l'interprétation de données relatives à la locomotion : le projet ICT4Rehab Inproceedings In: SOFAMEA, 2013, (Conference: (2013: Luxembourg, Luxembourg)). @inproceedings{info:hdl:2013/151876, title = {Développement d'instruments partagés pour la gestion et l'interprétation de données relatives \`a la locomotion : le projet ICT4Rehab}, author = {Serge Van Sint Jan and Vanessa Wermenbol and Bernard Dan and Patrick Salvia and Bruno Bonnechere and Yann-A{\"e}l Le Borgne and Gianluca Bontempi and Stijn Vansummeren and Victor Sholukha and Fedor Moiseev and Bart Jansen and Marcel Rooze}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/151876}, year = {2013}, date = {2013-01-01}, booktitle = {SOFAMEA}, note = {Conference: (2013: Luxembourg, Luxembourg)}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Le Borgne, Yann-Aël; Bonnechere, Bruno; Salvia, Patrick; Jan, Serge Van Sint; Bontempi, Gianluca Fouille de donnée pour l'analyse de la marche de patients atteints d'infirmité motrice cérébrale (ICT4Rehab) Inproceedings In: SOFAMEA, 2013, (Conference: (2013: Luxembourg, Luxembourg)). @inproceedings{info:hdl:2013/151877, title = {Fouille de donnée pour l'analyse de la marche de patients atteints d'infirmité motrice cérébrale (ICT4Rehab)}, author = {Yann-A{\"e}l Le Borgne and Bruno Bonnechere and Patrick Salvia and Serge Van Sint Jan and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/151877}, year = {2013}, date = {2013-01-01}, booktitle = {SOFAMEA}, note = {Conference: (2013: Luxembourg, Luxembourg)}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Gagliolo, Matteo; Lenaerts, Tom; Jacobs, Dirk A comparative analysis of the dynamics of interlocking directorates among immigrant organizations Miscellaneous 2013, (Conference: Belgian Social and Economic Network Research Meeting (BSEN) (3: 2013-10-03: Leuven)). @misc{info:hdl:2013/150673, title = {A comparative analysis of the dynamics of interlocking directorates among immigrant organizations}, author = {Matteo Gagliolo and Tom Lenaerts and Dirk Jacobs}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/150673/1/Gagliolo2013BSEN.pdf}, year = {2013}, date = {2013-01-01}, note = {Conference: Belgian Social and Economic Network Research Meeting (BSEN) (3: 2013-10-03: Leuven)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Gagliolo, Matteo; Lenaerts, Tom; Jacobs, Dirk A comparative analysis of the dynamics of interlock networks immigrant organizations Miscellaneous 2013, (Conference: ECPR General Conference (7: 2013-09-07: Bordeaux, France)). @misc{info:hdl:2013/150672, title = {A comparative analysis of the dynamics of interlock networks immigrant organizations}, author = {Matteo Gagliolo and Tom Lenaerts and Dirk Jacobs}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/150672/1/Gagliolo2013ECPR.pdf}, year = {2013}, date = {2013-01-01}, abstract = {Social capital is naturally embedded in social networks. In his famous work on the causal relationship between “bridging” social capital (e.g., associational life), trust, and civic behavior, Putnam [1993] did not investigate the structural aspect of such networks. Recently, the relationship between associational life and civicness of ethnic minority groups in Europe has been investigated [Fennema and Tillie, 2001; Jacobs et al., 2004; Vermeulen and Berger, 2008], without reaching uniform conclusions. In this work, simple structural properties of the network of interlocking directorates among ethnic associations are used as a proxy of the social capital of the corresponding minority group. We pursue this line further, arguing that more advanced models may consistently predict differences among the studied communities, and look at the structure of such networks, but also at the dynamics that produced it. Here we present results with a stochastic actor-based model, SIENA [Snijders et al., 2010], which estimates the effect of actor covariates and local structure on network evolution. We model the dynamics of the full two-mode network among directors and boards of voluntary associations, including the structural effects proposed by [Koskinen and Edling, 2012], and considering the political orientation of associations as a covariate. 
Using data from [Vermeulen and Berger, 2008], we compare the evolution of interlocks among Turkish associations in two European capitals, and explain the noticeable difference in structure by looking at statistically significant differences among the estimated effects. In the longer term we intend to relate the dynamics of these networks to the civic behavior of the corresponding communities.}, note = {Conference: ECPR General Conference (7: 2013-09-07: Bordeaux, France)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Lerat, Jean-Sébastien; Han, The Anh T A H; Lenaerts, Tom Joint Conference on Artificial Intelligence: Twenty-Third International Joint Conference on Artificial Intelligence Miscellaneous 2013, (Conference: IJCAI (23: 3-7/08/2013: Beijing)). @misc{info:hdl:2013/150431, title = {Joint Conference on Artificial Intelligence: Twenty-Third International Joint Conference on Artificial Intelligence}, author = {Jean-Sébastien Lerat and The Anh T A H Han and Tom Lenaerts}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/150431/1/IJCAI13-Poster-JS.pdf}, year = {2013}, date = {2013-01-01}, abstract = {The Common-pool resource (CPR) game is a social dilemma where agents have to decide how to consume a shared CPR. Either they each take their cut, completely destroying the CPR, or they restrain themselves, gaining less immediate profit but sustaining the resource and future profit. When no consumption takes place the CPR simply grows to its carrying capacity. As such, this dilemma provides a framework to study the evolution of social consumption strategies and the sustainability of resources, whose size adjusts dynamically through consumption and their own implicit population dynamics. The present study provides for the first time a detailed analysis of the evolutionary dynamics of consumption strategies in finite populations, focusing on the interplay between the resource levels and preferred consumption strategies. We show analytically which restrained consumers survive in relation to the growth rate of the resources and how this affects the resources' carrying capacity. Second, we show that population structures affect the sustainability of the resources and social welfare in the population. 
Current results provide an initial insight into the complexity of the CPR game, showing potential for a variety of different studies in the context of social welfare and resource sustainability.}, note = {Conference: IJCAI (23: 3-7/08/2013: Beijing)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Gagliolo, Matteo; Lenaerts, Tom; Jacobs, Dirk A comparative analysis of the dynamics of interlocks among immigrant organizations Miscellaneous 2013, (Conference: Sunbelt, Social Networks Conference of the International Network for Social Network Analysis (INSNA) (XXXIII: 2013-05-23: Hamburg, Germany)). @misc{info:hdl:2013/150671, title = {A comparative analysis of the dynamics of interlocks among immigrant organizations}, author = {Matteo Gagliolo and Tom Lenaerts and Dirk Jacobs}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/150671/1/Gagliolo2013Sunbelt.pdf}, year = {2013}, date = {2013-01-01}, note = {Conference: Sunbelt, Social Networks Conference of the International Network for Social Network Analysis (INSNA) (XXXIII: 2013-05-23: Hamburg, Germany)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Cilia, Elisa; Pancsa, Rita; Tompa, Peter; Lenaerts, Tom; Vranken, Wim F DynaMine: Sequence-based Protein Backbone Dynamics and Disorder Prediction Miscellaneous 2013, (Conference: 21st Annual International Conference on Intelligent Systems for Molecular Biology and the 12th European Conference on Computational Biology(19-23 July 2013: Berlin, Germany)). @misc{info:hdl:2013/243660, title = {DynaMine: Sequence-based Protein Backbone Dynamics and Disorder Prediction}, author = {Elisa Cilia and Rita Pancsa and Peter Tompa and Tom Lenaerts and Wim F Vranken}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243660}, year = {2013}, date = {2013-01-01}, note = {Conference: 21st Annual International Conference on Intelligent Systems for Molecular Biology and the 12th European Conference on Computational Biology(19-23 July 2013: Berlin, Germany)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Huculeci, Radu; Cilia, Elisa; Buts, Lieven; Houben, Klaartje; van Nuland, Nico A J; Lenaerts, Tom Unravelling the intra-protein communication pathway within the Fyn SH2 domain Miscellaneous 2013, (Conference: 12th Meeting of Young Belgian Magnetic Resonance Scientist(2-3 December 2013: Blankenberge, Belgium)). @misc{info:hdl:2013/243665, title = {Unravelling the intra-protein communication pathway within the Fyn SH2 domain}, author = {Radu Huculeci and Elisa Cilia and Lieven Buts and Klaartje Houben and Nico A J van Nuland and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243665}, year = {2013}, date = {2013-01-01}, note = {Conference: 12th Meeting of Young Belgian Magnetic Resonance Scientist(2-3 December 2013: Blankenberge, Belgium)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Cilia, Elisa; Vuister, Geerten W; Lenaerts, Tom Accurate prediction of peptide-induced dynamical changes within the second PDZ domain of PTP1e Miscellaneous 2013, (Conference: 21st Annual International Conference on Intelligent Systems for Molecular Biology and the 12th European Conference on Computational Biology(19-23 July 2013: Berlin, Germany)). @misc{info:hdl:2013/243659, title = {Accurate prediction of peptide-induced dynamical changes within the second PDZ domain of PTP1e}, author = {Elisa Cilia and Geerten W Vuister and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243659}, year = {2013}, date = {2013-01-01}, note = {Conference: 21st Annual International Conference on Intelligent Systems for Molecular Biology and the 12th European Conference on Computational Biology(19-23 July 2013: Berlin, Germany)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Cilia, Elisa; Pancsa, Rita; Tompa, Peter; Lenaerts, Tom; Vranken, Wim DynaMine: From protein sequence to dynamics and disorder Miscellaneous 2013, (Conference: 8th Benelux Bioinformatics Conference(9-10 December 2013: Brussels, Belgium)). @misc{info:hdl:2013/243662, title = {DynaMine: From protein sequence to dynamics and disorder}, author = {Elisa Cilia and Rita Pancsa and Peter Tompa and Tom Lenaerts and Wim Vranken}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243662}, year = {2013}, date = {2013-01-01}, note = {Conference: 8th Benelux Bioinformatics Conference(9-10 December 2013: Brussels, Belgium)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Huculeci, Radu; Cilia, Elisa; Buts, Lieven; Houben, Klaartje; van Nuland, Nico A J; Lenaerts, Tom Mapping intra-protein communication – The Fyn SH2 snap-lock mechanisms Miscellaneous 2013, (Conference: 8th Benelux Bioinformatics Conference(9-10 December 2013: Brussels, Belgium)). @misc{info:hdl:2013/243661, title = {Mapping intra-protein communication – The Fyn SH2 snap-lock mechanisms}, author = {Radu Huculeci and Elisa Cilia and Lieven Buts and Klaartje Houben and Nico A J van Nuland and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243661}, year = {2013}, date = {2013-01-01}, note = {Conference: 8th Benelux Bioinformatics Conference(9-10 December 2013: Brussels, Belgium)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Han, The Anh T A H; Pereira, Luís Moniz; Santos, Francisco C; Lenaerts, Tom Why is it so hard to say sorry ? Miscellaneous 2013, (Conference: 25th Benelux Artificial Intelligence Conference(7-8 November 2013: Delft, the Netherlands)). @misc{info:hdl:2013/243663, title = {Why is it so hard to say sorry ?}, author = {The Anh T A H Han and Luís Moniz Pereira and Francisco C Santos and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243663}, year = {2013}, date = {2013-01-01}, note = {Conference: 25th Benelux Artificial Intelligence Conference(7-8 November 2013: Delft, the Netherlands)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Conard, Ashley M; Cilia, Elisa; Lenaerts, Tom Cooperative game theory and feature selection to describe dependencies between amino acid within the SH3 Fyn protein domain Miscellaneous 2013, (Conference: The Grace Hopper Celebration (GHC) of Women in Computing(2-5 October 2013: Minneapolis, US)). @misc{info:hdl:2013/243664, title = {Cooperative game theory and feature selection to describe dependencies between amino acid within the SH3 Fyn protein domain}, author = {Ashley M Conard and Elisa Cilia and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243664}, year = {2013}, date = {2013-01-01}, note = {Conference: The Grace Hopper Celebration (GHC) of Women in Computing(2-5 October 2013: Minneapolis, US)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Rubio, Lucia; Huculeci, Radu; Vanwetswinkel, Sophie; Buts, Lieven; Lenaerts, Tom; van Nuland, Nico A J Unravelling the allosteric effects of SHP2’s SH2 domains in the through NMR relaxation experiments Miscellaneous 2013, (Conference: 12th Meeting of Young Belgian Magnetic Resonance Scientist(2-3 December 2013: Blankenberge, Belgium)). @misc{info:hdl:2013/243666, title = {Unravelling the allosteric effects of SHP2’s SH2 domains in the through NMR relaxation experiments}, author = {Lucia Rubio and Radu Huculeci and Sophie Vanwetswinkel and Lieven Buts and Tom Lenaerts and Nico A J van Nuland}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243666}, year = {2013}, date = {2013-01-01}, note = {Conference: 12th Meeting of Young Belgian Magnetic Resonance Scientist(2-3 December 2013: Blankenberge, Belgium)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Antoine, Dewilde A D; Lenaerts, Tom Infering Conditional Behaviors Using Probabilistic Models. Artificial Intelligence Masters Thesis 2013, (Language of publication: en). @mastersthesis{info:hdl:2013/150430, title = {Infering Conditional Behaviors Using Probabilistic Models. Artificial Intelligence}, author = {Dewilde A D Antoine and Tom Lenaerts}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/150430/1/MasterThesis201213-1.pdf}, year = {2013}, date = {2013-01-01}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {mastersthesis} } |
Marillet, Simon; Lenaerts, Tom Evaluation of intra-protein co-evolution prediction methods. Informatique Masters Thesis 2013, (Language of publication: fr). @mastersthesis{info:hdl:2013/243954, title = {Evaluation of intra-protein co-evolution prediction methods. Informatique}, author = {Simon Marillet and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243954}, year = {2013}, date = {2013-01-01}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {mastersthesis} } |
Olsen, Catharina; Bontempi, Gianluca; Quackenbush, John; Haibe-Kains, Benjamin Data-driven validation of gene regulatory networks using knock-down data Miscellaneous 2013, (Conference: 8th Benelux Bioinformatics Conference (BBC13)). @misc{info:hdl:2013/155454, title = {Data-driven validation of gene regulatory networks using knock-down data}, author = {Catharina Olsen and Gianluca Bontempi and John Quackenbush and Benjamin Haibe-Kains}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/155454}, year = {2013}, date = {2013-01-01}, note = {Conference: 8th Benelux Bioinformatics Conference (BBC13)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Vion, Jérémy; Lenaerts, Tom The effects of matching algorithms on the level of fairness in the ultimatum game. Informatique Masters Thesis 2013, (Language of publication: fr). @mastersthesis{info:hdl:2013/243953, title = {The effects of matching algorithms on the level of fairness in the ultimatum game. Informatique}, author = {Jérémy Vion and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243953}, year = {2013}, date = {2013-01-01}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {mastersthesis} } |
Olsen, Catharina; Bontempi, Gianluca; Quackenbush, John; Haibe-Kains, Benjamin Data-driven validation of gene regulatory networks using knock-down data Miscellaneous 2013, (Conference: 8th Benelux Bioinformatics Conference (BBC13)). @misc{info:hdl:2013/245104, title = {Data-driven validation of gene regulatory networks using knock-down data}, author = {Catharina Olsen and Gianluca Bontempi and John Quackenbush and Benjamin Haibe-Kains}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/245104}, year = {2013}, date = {2013-01-01}, note = {Conference: 8th Benelux Bioinformatics Conference (BBC13)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Lerat, Jean-Sébastien; Lenaerts, Tom Exploring Common-pool resources with natural resource dynamics. Informatique Masters Thesis 2013, (Language of publication: fr). @mastersthesis{info:hdl:2013/243952, title = {Exploring Common-pool resources with natural resource dynamics. Informatique}, author = {Jean-Sébastien Lerat and Tom Lenaerts}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/243952}, year = {2013}, date = {2013-01-01}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {mastersthesis} } |
Trepo, Eric; Nahon, Pierre; Bontempi, Gianluca; Valenti, Luca; Falleti, Edmondo; Nischalke, Hans Dieter; Hamza, Samia; Corradini, Stefano Ginanni; Burza, Maria Antonella; Guyot, Erwan; Donati, Benedetta; Spengler, Ulrich; Hillon, Patrick; Toniutto, Pierluigi; Henrion, Jean; Mathurin, Philippe; Moreno, Christophe; Romeo, Stefano; Deltenre, Pierre Association between the PNPLA3 (rs738409 C>G) variant and hepatocellular carcinoma: evidence from a meta-analysis of individual participant data Miscellaneous 2013, (Conference: The Liver Meeting, American Association for the Study of Liver Diseases(2013: Washington, USA)). @misc{info:hdl:2013/225572, title = {Association between the PNPLA3 (rs738409 C>G) variant and hepatocellular carcinoma: evidence from a meta-analysis of individual participant data}, author = {Eric Trepo and Pierre Nahon and Gianluca Bontempi and Luca Valenti and Edmondo Falleti and Hans Dieter Nischalke and Samia Hamza and Stefano Ginanni Corradini and Maria Antonella Burza and Erwan Guyot and Benedetta Donati and Ulrich Spengler and Patrick Hillon and Pierluigi Toniutto and Jean Henrion and Philippe Mathurin and Christophe Moreno and Stefano Romeo and Pierre Deltenre}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/225572}, year = {2013}, date = {2013-01-01}, note = {Conference: The Liver Meeting, American Association for the Study of Liver Diseases(2013: Washington, USA)}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
Pini, Giovanni Towards autonomous task partitioning in swarm robotics: experiments with foraging robots PhD Thesis 2013, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/209469, title = {Towards autonomous task partitioning in swarm robotics: experiments with foraging robots}, author = {Giovanni Pini}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/209469/1/7a3641f3-3f2f-441d-b343-77a304e6d1f9.txt}, year = {2013}, date = {2013-01-01}, abstract = {In this thesis, we propose an approach to achieve autonomous task partitioning in swarms of robots. Task partitioning is the process by which tasks are decomposed into sub-tasks and it is often an advantageous way of organizing work in groups of individuals. Therefore, it is interesting to study its application to swarm robotics, in which groups of robots are deployed to collectively carry out a mission. The capability of partitioning tasks autonomously can enhance the flexibility of swarm robotics systems because the robots can adapt the way they decompose and perform their work depending on specific environmental conditions and goals. So far, few studies have been presented on the topic of task partitioning in the context of swarm robotics. Additionally, in all the existing studies, there is no separation between the task partitioning methods and the behavior of the robots and often task partitioning relies on characteristics of the environments in which the robots operate. This limits the applicability of these methods to the specific contexts for which they have been built. The work presented in this thesis represents the first steps towards a general framework for autonomous task partitioning in swarms of robots. We study task partitioning in foraging, since foraging abstracts practical real-world problems. The approach we propose in this thesis is therefore studied in experiments in which the goal is to achieve autonomous task partitioning in foraging. 
However, in the proposed approach, the task partitioning process relies upon general, task-independent concepts and we are therefore confident that it is applicable in other contexts. We identify two main capabilities that the robots should have: i) being capable of selecting whether to employ task partitioning and ii) defining the sub-tasks of a given task. We propose and study algorithms that endow a swarm of robots with these capabilities.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
2012 |
Le Borgne, Yann-Aël; Bontempi, Gianluca Prediction-Based Data Collection in Wireless Sensor Networks Book Chapter In: CRC Press, 2012, (Language of publication: fr). @inbook{info:hdl:2013/223375, title = {Prediction-Based Data Collection in Wireless Sensor Networks}, author = {Yann-A{\"e}l Le Borgne and Gianluca Bontempi}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/223375}, year = {2012}, date = {2012-01-01}, publisher = {CRC Press}, series = {Intelligent Sensor Networks: The Integration of Sensor Networks, Signal Processing and Machine Learning}, note = {Language of publication: fr}, keywords = {}, pubstate = {published}, tppubtype = {inbook} } |
Biggelaar, Olivier Van Den Distributed spectrum sensing and interference management for cognitive radios with low capacity control channels PhD Thesis 2012, (Funder: Universite Libre de Bruxelles). @phdthesis{info:hdl:2013/209612, title = {Distributed spectrum sensing and interference management for cognitive radios with low capacity control channels}, author = {Olivier Van Den Biggelaar}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/209612/2/ec0b174d-348d-4a91-932c-8938e8731323.txt}, year = {2012}, date = {2012-01-01}, abstract = {Cognitive radios have been proposed as a new technology to counteract the spectrum scarcity issue and increase the spectral efficiency. In cognitive radios, the sparse assigned frequency bands are opened to secondary users, provided that interference induced on the primary licensees is negligible. Cognitive radios are established in two steps: the radios firstly sense the available frequency bands by detecting the presence of primary users and secondly communicate using the bands that have been identified as not in use by the primary users. In this thesis we investigate how to improve the efficiency of cognitive radio networks when multiple cognitive radios cooperate to sense the spectrum or control their interferences. A major challenge in the design of cooperating devices lies in the need for exchange of information between these devices. Therefore, in this thesis we identify three specific types of control information exchange whose efficiency can be improved. Specifically, we first study how cognitive radios can efficiently exchange sensing information with a coordinator node when the reporting channels are noisy. Then, we propose distributed learning algorithms allowing to allocate the primary network sensing times and the secondary transmission powers within the secondary network. Both distributed allocation algorithms minimize the need for information exchange compared to centralized allocation algorithms.}, note = {Funder: Universite Libre de Bruxelles}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Cilia, Elisa; Vuister, Geerten W; Lenaerts, Tom Accurate prediction of the dynamical changes within the second PDZ domain of PTP1e. Journal Article In: PLoS computational biology, 8 (11), pp. e1002794, 2012, (DOI: 10.1371/journal.pcbi.1002794). @article{info:hdl:2013/138318, title = {Accurate prediction of the dynamical changes within the second PDZ domain of PTP1e.}, author = {Elisa Cilia and Geerten W Vuister and Tom Lenaerts}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/138318/1/PMC3510070.pdf}, year = {2012}, date = {2012-01-01}, journal = {PLoS computational biology}, volume = {8}, number = {11}, pages = {e1002794}, abstract = {Experimental NMR relaxation studies have shown that peptide binding induces dynamical changes at the side-chain level throughout the second PDZ domain of PTP1e, identifying as such the collection of residues involved in long-range communication. Even though different computational approaches have identified subsets of residues that were qualitatively comparable, no quantitative analysis of the accuracy of these predictions was thus far determined. Here, we show that our information theoretical method produces quantitatively better results with respect to the experimental data than some of these earlier methods. Moreover, it provides a global network perspective on the effect experienced by the different residues involved in the process. We also show that these predictions are consistent within both the human and mouse variants of this domain. 
Together, these results improve the understanding of intra-protein communication and allostery in PDZ domains, underlining at the same time the necessity of producing similar data sets for further validation of these kinds of methods.}, note = {DOI: 10.1371/journal.pcbi.1002794}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Segbroeck, Sven Van; Pacheco, J M; Lenaerts, Tom; Santos, F C Emergence of fairness in repeated group interactions. Journal Article In: Physical review letters, 108 (15), pp. 158104, 2012, (Language of publication: en). @article{info:hdl:2013/133921, title = {Emergence of fairness in repeated group interactions.}, author = {Sven Van Segbroeck and J M Pacheco and Tom Lenaerts and F C Santos}, url = {http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/133921}, year = {2012}, date = {2012-01-01}, journal = {Physical review letters}, volume = {108}, number = {15}, pages = {158104}, abstract = {Often groups need to meet repeatedly before a decision is reached. Hence, most individual decisions will be contingent on decisions taken previously by others. In particular, the decision to cooperate or not will depend on one's own assessment of what constitutes a fair group outcome. Making use of a repeated N-person prisoner's dilemma, we show that reciprocation towards groups opens a window of opportunity for cooperation to thrive, leading populations to engage in dynamics involving both coordination and coexistence, and characterized by cycles of cooperation and defection. Furthermore, we show that this process leads to the emergence of fairness, whose level will depend on the dilemma at stake.}, note = {Language of publication: en}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Huculeci, Radu; Buts, Lieven; Lenaerts, Tom; van Nuland, Nico A J; Garcia-Pino, Abel Purification, crystallization and preliminary X-ray diffraction analysis of the Fyn SH2 domain and its complex with a phosphotyrosine peptide. Journal Article In: Acta Crystallographica. Section F: Structural Biology and Crystallization Communications Online, 68 (Pt 3), pp. 359-364, 2012, (DOI: 10.1107/S1744309112004186). @article{info:hdl:2013/138059, title = {Purification, crystallization and preliminary X-ray diffraction analysis of the Fyn SH2 domain and its complex with a phosphotyrosine peptide.}, author = {Radu Huculeci and Lieven Buts and Tom Lenaerts and Nico A J van Nuland and Abel Garcia-Pino}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/138059/1/PMC3310552.pdf}, year = {2012}, date = {2012-01-01}, journal = {Acta Crystallographica. Section F: Structural Biology and Crystallization Communications Online}, volume = {68}, number = {Pt 3}, pages = {359-364}, abstract = {SH2 domains are widespread protein-binding modules that recognize phosphotyrosines and play central roles in intracellular signalling pathways. The SH2 domain of the human protein tyrosine kinase Fyn has been expressed, purified and crystallized in the unbound state and in complex with a high-affinity phosphotyrosine peptide. X-ray data were collected to a resolution of 2.00 Å for the unbound form and 1.40 Å for the protein in complex with the phosphotyrosine peptide.}, note = {DOI: 10.1107/S1744309112004186}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Haibe-Kains, Benjamin; Desmedt, Christine; Loi, Sherene; Culhane, Aedin C; Bontempi, Gianluca; Quackenbush, John; Sotiriou, Christos A three-gene model to robustly identify breast cancer molecular subtypes. Journal Article In: Journal of the National Cancer Institute, 104 (4), pp. 311-325, 2012, (DOI: 10.1093/jnci/djr545). @article{info:hdl:2013/135461, title = {A three-gene model to robustly identify breast cancer molecular subtypes.}, author = {Benjamin Haibe-Kains and Christine Desmedt and Sherene Loi and Aedin C Culhane and Gianluca Bontempi and John Quackenbush and Christos Sotiriou}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/135461/1/130.JNatlCancerInst.-BHK-2012.pdf}, year = {2012}, date = {2012-01-01}, journal = {Journal of the National Cancer Institute}, volume = {104}, number = {4}, pages = {311-325}, abstract = {Single sample predictors (SSPs) and Subtype classification models (SCMs) are gene expression-based classifiers used to identify the four primary molecular subtypes of breast cancer (basal-like, HER2-enriched, luminal A, and luminal B). SSPs use hierarchical clustering, followed by nearest centroid classification, based on large sets of tumor-intrinsic genes. SCMs use a mixture of Gaussian distributions based on sets of genes with expression specifically correlated with three key breast cancer genes (estrogen receptor [ER], HER2, and aurora kinase A [AURKA]). The aim of this study was to compare the robustness, classification concordance, and prognostic value of these classifiers with those of a simplified three-gene SCM in a large compendium of microarray datasets.}, note = {DOI: 10.1093/jnci/djr545}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Vaccaro, Alfredo A; Bontempi, Gianluca; Taieb, Souhaib Ben; Villacci, Domenico D Adaptive local learning techniques for multiple-step-ahead wind speed forecasting Journal Article In: Electric power systems research, 83 (1), pp. 129-135, 2012, (DOI: 10.1016/j.epsr.2011.10.008). @article{info:hdl:2013/109023, title = {Adaptive local learning techniques for multiple-step-ahead wind speed forecasting}, author = {Alfredo A Vaccaro and Gianluca Bontempi and Souhaib Ben Taieb and Domenico D Villacci}, url = {https://dipot.ulb.ac.be/dspace/bitstream/2013/109023/1/Elsevier_89123.pdf}, year = {2012}, date = {2012-01-01}, journal = {Electric power systems research}, volume = {83}, number = {1}, pages = {129-135}, abstract = {A massive deployment of wind energy in power systems is expected in the near future. However, a still open issue is how to integrate wind generators into existing electrical grids by limiting their side effects on network operations and control. In order to attain this objective, accurate short and medium-term wind speed forecasting is required. This paper discusses and compares a physical (white-box) model (namely a limited-area non hydrostatic model developed by the European consortium for small-scale modeling) with a family of local learning techniques (black-box) for short and medium term forecasting. Also, an original model integrating machine learning techniques with physical knowledge modeling (grey-box) is proposed. A set of experiments on real data collected from a set of meteorological sensors located in the south of Italy supports the methodological analysis and assesses the potential of the different forecasting approaches. © 2011 Elsevier B.V. All rights reserved.}, note = {DOI: 10.1016/j.epsr.2011.10.008}, keywords = {}, pubstate = {published}, tppubtype = {article} } |