2013
-
A gradient boosting approach to the Kaggle load forecasting competition.
by: Souhaib Ben Taieb
Machine Learning Group (MLG), Belgium
When:
28 March 2013 at 12:30 (until 13:30)
Where: La Plaine Campus, Building NO, 5th floor, Solvay Room
Université Libre de Bruxelles
Boulevard du Triomphe
1050 Bruxelles
Abstract:
We describe and analyze the approach used by Team TinTin (Souhaib Ben Taieb and Rob J. Hyndman) in the Load Forecasting track of the Kaggle Global Energy Forecasting Competition 2012. The competition involved a hierarchical load forecasting problem for a US utility with 20 geographical zones. Eight in-sample and one out-of-sample weeks for 21 separate time series need to be backcast and forecast, respectively. The electricity demand for the next day is forecasted using a separate model for each hourly period. We use component-wise gradient boosting to estimate each hourly model with univariate penalized regression splines as base learners. The models allow for the electricity demand to change with time-of-year, day-of-week, time-of-day, and on public holidays with the main predictors being current and past temperatures as well as past demand. Our model ranked fifth out of 105 participating teams.
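The component-wise boosting idea mentioned in the abstract can be sketched as follows. This is only an illustration under simplifying assumptions: it uses univariate linear base learners in place of the penalized regression splines used by the team, and the function name, step size `nu` and number of rounds are hypothetical choices, not values from the talk.

```python
import numpy as np

def componentwise_boost(X, y, n_rounds=200, nu=0.1):
    """Component-wise L2 boosting with univariate linear base learners.

    At each round, every predictor is fitted on its own to the current
    residual, and only the single best-fitting predictor is updated.
    Assumes no column of X is constant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    Xc = X - x_mean                        # centre each predictor
    col_ss = (Xc ** 2).sum(axis=0)         # per-column sum of squares
    intercept = y.mean()
    coef = np.zeros(X.shape[1])
    resid = y - intercept
    for _ in range(n_rounds):
        # least-squares slope of the residual on each single predictor
        slopes = Xc.T @ resid / col_ss
        # reduction in squared error achieved by each candidate learner
        gain = slopes ** 2 * col_ss
        j = int(np.argmax(gain))
        # shrunken update of the selected component only
        coef[j] += nu * slopes[j]
        resid -= nu * slopes[j] * Xc[:, j]
    # convert back to an intercept on the uncentred scale
    return intercept - x_mean @ coef, coef
```

Because only one component is updated per round, predictors that never help reduce the residual keep a zero coefficient, which is what gives this family of methods its built-in variable selection.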
-
Machine learning in DG Sanco - EC: database clean up, fraud detection and web semantic -- Slides
by: Philippe Loopuyt and Eric Ngantchjon
European Commission
When:
9 January 2013 at 3.15PM (until 4.15PM including questions)
Where: La Plaine Campus, Building NO, 8th floor, Rotule room
Université Libre de Bruxelles
Boulevard du Triomphe
1050 Bruxelles
Abstract:
In recent years, DG Sanco has used machine learning techniques in several projects. With the growing volume of data received from different actors and the large number of applications holding related information, there is a need to define common entities and to clean up databases containing duplicate records. An application has been developed that detects duplicate records and generates clusters of similar records. The algorithms use the Levenshtein distance as a text metric and K-Means for clustering. An expert system application has been developed to help the BIP (Border Inspection Point) improve the random control of trade in animals and products within Europe and between MS and third countries. Based on historical data on fraudulent consignments, patterns are found and predictive models are built to check future consignments. The predictive models are built with KXEN, machine learning software based on Vapnik-Chervonenkis theory. The objective of the semantic web project is to publish public data in a semantic format; in addition, more and more external users are asking for an API that lets them retrieve the public data automatically, avoiding the long process of manually updating their own systems. The semantic web meets these requirements efficiently: it is a global concept that allows links between distributed data spread over the web, and there are protocols for querying and accessing these data. One of the big challenges of the semantic web, however, is to find a way to set the links with unstructured data automatically (or at least semi-automatically), which leads to the use of techniques such as text mining, taxonomies and ontologies.
Speaker:
Philippe Loopuyt, Head of Unit Information Systems, Directorate General Health and Consumers, European Commission. Eric Ngantchjon graduated in Electrical (Telecommunication) Engineering in 1998 (Polytech-Mons) and in Statistics-Operations Research in 2008 (ULB). He is an experienced Software Architect with a strong background in applied statistics. He developed business-critical solutions for inventory control, statistical analysis, risk assessment, fraud detection, the semantic web and text mining.
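As a rough illustration of the duplicate-detection idea in the abstract above, the Levenshtein distance can be computed with a short dynamic program and used to flag near-identical records. The helper `near_duplicates`, its case-insensitive comparison and the threshold `max_dist` are hypothetical choices for this sketch; the clustering step (K-Means in the talk) is not shown.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a                      # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))       # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                       # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def near_duplicates(records, max_dist=2):
    """Return pairs of records whose lower-cased text differs by at most
    max_dist edits (a simple all-pairs comparison, quadratic in len(records))."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if levenshtein(records[i].lower(), records[j].lower()) <= max_dist:
                pairs.append((records[i], records[j]))
    return pairs
```

The all-pairs loop is fine for small tables; a real database would normally need some form of blocking or indexing first to avoid comparing every record with every other.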
-
Applications of Machine Learning and Soft Computing techniques to human behaviour inference and chemical sensors modelling -- Slides
by: Manuel Pegalajar Cuéllar,
Department of Computer Science and Artificial Intelligence,
University of Granada (Spain)
When:
3 December 2012 at 2PM (until 3PM including questions)
Where: La Plaine Campus, Building NO, 8th floor, Rotule room
Université Libre de Bruxelles
Boulevard du Triomphe
1050 Bruxelles
Abstract:
Machine Learning and Soft Computing are two key areas in contemporary Artificial Intelligence. Their techniques have been applied to a large number of real applications, obtaining promising results and solving problems where other traditional methods have failed. This is especially the case when noisy data is present. In this talk, we show our current advances in two areas: disposable optical sensor modelling and, as our main emphasis, human behaviour inference and recognition.
Regarding the first area, we describe different ML techniques used, ranging from classic non-linear regression to neural networks, expert systems and multi-objective optimisation to minimise the size of the sensor. In relation to the second area, we show our approaches for adaptive models of human behaviours using discrete response sensors, and discuss the recent work with active sensors such as cameras and accelerometry sensing in order to learn about locomotion habits and gestures. In the end, we also discuss open problems and some possible ways of collaboration with MLG.
Speaker:
Manuel P. Cuéllar graduated in Computer Engineering in 2003. He finished his PhD on time series prediction, parameter identification and neural networks in 2006. He is currently an Associate Professor with the Department of Computer Science and Artificial Intelligence at the University of Granada (Spain). His main interests are neural and social networks, evolutionary optimisation and fuzzy systems, although he has also worked in multivariate image analysis and real-time control tasks. His current work encompasses different research areas including real-time learning, chemical parameter identification, medical imaging, development of disposable optical sensors and intelligent systems for ambient assisted living. He has contributed more than 40 papers to conferences and research journals and has collaborated in 6 research projects under competitive application from the government of Spain.
2012
-
The role of Self Organizing Dynamic Agents for Decentralized Optimization in Smart Grids -- Slides
by: Prof. Alfredo Vaccaro,
Electric Power Systems at the Department of Engineering,
University of Sannio
When:
September 13, 2012 at 16:00 (until 17:00 including questions)
Where: La Plaine Campus, Building NO, 8th floor, Rotule room
Université Libre de Bruxelles
Boulevard du Triomphe
1050 Bruxelles
Abstract:
In this talk we propose a decentralized and self-organizing solution framework aimed at solving Optimal Power Flow (OPF) problems in a distributed scenario. In particular, we will demonstrate that, under some hypotheses, the solution of the OPF can be obtained by computing proper weighted averages of the variables of interest. To compute these global quantities we propose the deployment of a network of dynamic agents solving a distributed average consensus problem. This bio-inspired solution strategy exhibits several advantages over traditional client-server paradigms: it requires less network bandwidth and less computation time, and it is easy to extend and reconfigure. These features make the overall computing architecture highly scalable, self-organizing and distributed, and thus a potential candidate for addressing economic dispatch analysis in smart grids.
Speaker:
Alfredo Vaccaro (M'01, SM'09) received the M.Sc. degree with honours in Electronic Engineering in 1998 from the University of Salerno, Salerno, Italy. From 1999 to 2002, he was an Assistant Researcher at the University of Salerno, Department of Electrical and Electronic Engineering. Since March 2002, he has been an Assistant Professor in electric power systems at the Department of Engineering of the University of Sannio, Benevento, Italy. His special fields of interest include soft computing and interval-based methods applied to power system analysis and advanced control architectures for diagnostics and protection of distribution networks. Prof. Vaccaro is an Associate Editor and member of the Editorial Boards of IET Renewable Power Generation, the International Journal of Renewable Energy Technology, the International Journal of Reliability and Safety and the International Journal on Power System Optimization. He is the Director of the bureau of the Research Centre on Pure and Applied Mathematics at the University of Sannio and the Rector Delegate for Technological Innovations.
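The distributed average consensus problem mentioned in the abstract above can be illustrated with a textbook discrete-time update, in which each agent repeatedly moves its value towards those of its neighbours. This is a generic sketch, not the speaker's protocol: the function name, the ring topology in the usage note and the step size `eps` are assumptions made for the example.

```python
import numpy as np

def average_consensus(values, neighbours, eps, n_iter):
    """Synchronous average consensus on an undirected graph.

    values     : initial value held by each agent
    neighbours : dict mapping agent index -> list of neighbour indices
                 (assumed symmetric: j in neighbours[i] iff i in neighbours[j])
    eps        : step size; should be smaller than 1 / (maximum degree)
    """
    x = np.asarray(values, dtype=float).copy()
    for _ in range(n_iter):
        new_x = x.copy()
        for i, nbrs in neighbours.items():
            # each agent only uses the values of its direct neighbours
            new_x[i] += eps * sum(x[j] - x[i] for j in nbrs)
        x = new_x
    return x
```

For example, with five agents on a ring (`neighbours = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}`) and `eps = 0.3`, every agent's value approaches the mean of the initial values without any agent ever seeing the whole network.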
-
Credal classification -- Slides
by: Dr. Giorgio Corani,
Dalle Molle Institute for Artificial Intelligence (IDSIA),
Switzerland
When:
June 11, 2012 at 11:00 (until 12:00 including questions)
Where: La Plaine Campus, Building NO, 8th floor, Rotule room
Université Libre de Bruxelles
Boulevard du Triomphe
1050 Bruxelles
Abstract:
Bayesian networks are important tools for uncertain reasoning in AI. Typically, they are based on a single prior distribution, which is updated through a likelihood yielding a posterior. They are often used for classification. Credal networks generalize Bayesian networks, letting prior probabilities vary in a set (e.g., an interval). This provides a more realistic model of expert knowledge and returns more robust inferences. Credal classifiers, being based on a set of priors, can identify prior-dependent instances, in which the most probable class varies with the prior. On such instances, credal classifiers return a set of classes (indeterminate classification) rather than a single one, thus preserving reliability. Extensive experiments show that traditional Bayesian classifiers undergo a severe drop in accuracy on prior-dependent instances, on which credal classifiers instead preserve reliability thanks to indeterminate classifications.
Speaker:
G. Corani obtained his PhD in Information Engineering at Politecnico di Milano in 2005. During his PhD he spent a visiting period at the Machine Learning Group of the Université Libre de Bruxelles. Since 2006 he has been a researcher at IDSIA, Switzerland. His research interests include probabilistic graphical models, data mining and imprecise probabilities.
-
Cooperative decision-making in cell regulation
by: Kim van Roey,
Gibson Team, European Molecular Biology Laboratory (EMBL) Heidelberg,
Meyerhofstraße 1, 69117 Heidelberg, Germany
When:
March 30, 2012 at 12:30 (until 14:00 including questions)
Where: La Plaine Campus, Building NO, 5th floor, Solvay room
Université Libre de Bruxelles
Boulevard du Triomphe
1050 Bruxelles
Abstract:
Cells must continuously monitor and integrate the variety of signals they perceive in order to generate appropriate responses. This requires reliable and robust signal transduction, which is mediated by an intricate and interlinked network of pathways and processes that are tightly regulated. Assembly of the dynamic macromolecular complexes that modulate these pathways often depends on multiple transient, low-affinity interactions that are context-dependent, highly cooperative and easily tuneable. Such interactions provide the dynamic plasticity that is required for proper cell signalling and underlie the ability of proteins to act as switchable regulatory modules. This raises the question, how do proteins integrate available information to correctly make decisions? This talk addresses the role of intrinsically disordered protein regions, and more specifically short linear motifs, in cooperative decision-making and briefly introduces our current efforts to computationally describe cooperative interactions.
-
Structure, Unstructure and Alternative Splicing
by Dr. Philip Kim, The Donnelly Centre for Cellular and Biomolecular Research, Banting and Best Department of Medical Research, Departments of Molecular Genetics and Computer Science, University of Toronto
Wednesday 22 February 2012 from 16:00 to 17:30 (+ questions) - ULB La Plaine campus, Building NO, Salle Solvay (5th floor in the rotule)
Abstract:
Many protein interactions, in particular those in signaling networks, are mediated by peptide recognition domains. These recognize short, linear amino acid stretches on the surface of their cognate partners with high specificity. Residues in these stretches are usually assumed to contribute independently to binding, which has led to a simplified understanding of protein interactions. Conversely, in large binding peptide data sets different residue positions display highly significant correlations for many domains in three distinct families (PDZ, SH3 and WW). These correlation patterns reveal a widespread occurrence of multiple binding specificities and give novel structural insights into protein interactions. For example, a new binding mode of PDZ domains can be predicted and structurally rationalized for DLG1 PDZ1.
While protein structure is very important for peptide binding domains, the regions they bind are usually unstructured (intrinsically disordered). These regions are widespread, especially in proteomes of higher eukaryotes, and have been associated with a plethora of different cellular functions. Aside from general importance for signaling networks, they are also important for such diverse processes as protein folding or DNA binding. Leveraging knowledge from systems biology can help to structure the phenomenon. Strikingly, disorder can be partitioned into three biologically distinct phenomena: regions where disorder is conserved but with quickly evolving amino acid sequences (“flexible disorder”), regions of conserved disorder with also highly conserved amino acid sequence (“constrained disorder”) and, lastly, non-conserved disorder. I will also introduce new efforts to map protein interactions affected by alternative splicing.
by Dr. Piers D. Nash, The Ben May Department for Cancer Research, The University of Chicago, Chicago, IL, USA
Tuesday 14 February 2012 from 11:00 to 12:30 (+ questions) - room: D.005 (VUB campus, building D, lower floor)
Abstract: Modular protein interaction domains (PIDs), such as the SH2 domain, are a common feature of many proteins, particularly those involved in cellular signal transduction. The SH2 domain recognizes phosphotyrosine-modified peptide sequences, and in doing so couples tyrosine kinases to downstream signaling networks. We have examined the evolution of SH2 domains and find that they expanded rapidly with the emergence of multicellularity, with subsequent expansions concomitant with leaps in organismal complexity within the animal lineage. Increasing connectivity within and between SH2 proteins may underlie more highly interconnected and robust signal transduction networks. Yet the rapid evolutionary expansion of SH2 domains comes at some cost to selectivity so that the extant SH2 domains explore only a small region of the available peptide ligand sequence space. The ability of PIDs to nucleate highly selective interactions is essential for signal fidelity yet relies on limited peptide sequence information. For instance, SH2 domains may appear to have simple binding motifs characterized by a few residues surrounding a phosphotyrosine (e.g. pY-X-X-P/L). We have recently shown that by reading both permissive and non-permissive residues and longer regions of adjacent sequence, the SH2 domain is able to make use of a wider information channel to prescribe selective interactions. This results in a complex language for SH2 domain-peptide interactions in which the SH2 domain is readily able to distinguish physicochemically similar amino acids. Thus, despite evolutionary constraints, individual SH2 domains have distinct recognition profiles and exhibit a remarkable degree of selectivity.
Speaker: Dr. Piers D. Nash is a world-renowned scientist investigating protein-protein interactions involved in signal transduction, and the molecular mechanisms by which cells respond to external cues. After completing a postdoctoral position in the lab of Tony Pawson in Toronto, he became Assistant Professor in The Ben May Department for Cancer Research and a Scientist of the Comprehensive Cancer Center at The University of Chicago. His current work focuses on understanding the SH2 domain at a systems level and investigating the role of ubiquitination in controlling endocytosis and modulating signal transduction.
2011
- "Ensemble learning for real-world classification."
by Nima Hatami, Department of Electrical and Electronic Engineering, University of Cagliari, Italy -- Slides
Wednesday 18 November 2012 from 12:30 to 13:30 (+questions) - room: NO7.07 - NO building
Abstract: Most real-world classification problems are too complicated to be tackled by a single expert. An alternative approach is to use an ensemble of experts, inspired by the divide-and-conquer principle, which has proven to be efficient in many of these cases. A complex problem is first divided into simpler sub-problems, each of which is assigned to an expert. The final solution, obtained by consensus of the experts, has been shown to be more effective and efficient. This talk will cover the application of different multiple-classifier systems to some real-world classification problems, e.g. gene expression cancer classification, face recognition and text categorization.
- "Representing Cooperative Interactions in Bioinformatics."
Thursday 22 December 2011 from 12:30 to 13:30 (+questions) - room: NO6.07 - NO building - Cancelled
Abstract: Cells must continuously monitor external and internal cues, integrate the variety of signals they perceive, and translate these inputs into proper outputs. This requires reliable and robust signal transduction, which is mediated by intricate and interlinked networks of pathways and processes that are tightly regulated. Assembly of the dynamic macromolecular complexes that modulate these pathways depends on multiple transient, low-affinity interactions, many of which are regulated by post-translational modifications. These distinct binding events are highly cooperative, affecting each other either positively or negatively. Such cooperative interactions provide the dynamic plasticity that is required for proper cell signaling. However, despite the central importance of cooperativity in these systems, it is missing from all current formalisms for describing molecular interactions. This talk addresses our current efforts to computationally describe cooperative interactions.
- "Cartification: from Similarities to Itemset Frequencies."
by Bart Goethals, Professor, Department of Mathematics and Computer Science, University of Antwerp, Belgium -- Slides
Thursday 17 November 2011 from 14:30 to 15:30 (+questions) - Rotule NO8 - NO building
Abstract: Suppose we are given a multi-dimensional dataset. For every point in the dataset, we create a transaction, or cart, in which we store the k-nearest neighbors of that point for one of the given dimensions. The resulting collection of carts can then be used to mine frequent itemsets; that is, sets of points that are frequently seen together in some dimensions. Experimentation shows that finding clusters, outliers, cluster centers, or even subspace clustering becomes easy on the cartified dataset using state-of-the-art techniques in mining interesting itemsets.
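The cart construction described in the abstract can be sketched in a few lines. The sketch below makes some assumptions that the abstract does not state: each point is counted among its own nearest neighbours, distance within a dimension is the absolute difference, and the mining step is reduced to counting frequent pairs rather than general itemsets. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
import numpy as np

def cartify(data, k):
    """Build one cart per (dimension, point): the indices of the k points
    that are closest to that point when only that dimension is considered."""
    data = np.asarray(data, dtype=float)
    carts = []
    for d in range(data.shape[1]):
        col = data[:, d]
        for i in range(len(col)):
            # stable sort so that ties are broken by point index
            order = np.argsort(np.abs(col - col[i]), kind="stable")
            carts.append(frozenset(int(j) for j in order[:k]))
    return carts

def frequent_pairs(carts, min_support):
    """Count how often each pair of points shares a cart and keep the
    pairs seen together in at least min_support carts."""
    counts = Counter()
    for cart in carts:
        counts.update(combinations(sorted(cart), 2))
    return {pair: c for pair, c in counts.items() if c >= min_support}
```

Points that fall in the same cluster tend to appear in each other's carts across several dimensions, so their pairs reach a high support, while pairs of unrelated points rarely co-occur.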
- "Unraveling networks of co-regulated genes on the sole basis of genome sequences."
by Sylvain Brohée, Post-doc, ULB -- Slides
Wednesday 21 September 2011 from 14:00 to 15:00 (+questions) - Rotule NO8 - NO building
Abstract: With the growing number of available microbial genome sequences, regulatory signals can now be revealed as conserved motifs in promoters of orthologous genes (phylogenetic footprints). A next challenge is to unravel genome-scale regulatory networks. Using as sole input genome sequences, we predicted cis-regulatory elements for each gene of the yeast Saccharomyces cerevisiae by discovering over-represented motifs in the promoters of their orthologs in 19 Saccharomycetes species. We then linked all genes displaying similar motifs in their promoter regions and inferred a co-regulation network including 56919 links between 3171 genes. Comparison with annotated regulons highlights the high predictive value of the method: a majority of the top-scoring predictions correspond to already known co-regulations. We also show that this inferred network is as accurate as a co-expression network built from hundreds of transcriptome microarray experiments. Furthermore, we experimentally validated 14 among 16 new functional links between orphan genes and known regulons. This approach can be readily applied to unravel gene regulatory networks from hundreds of microbial genomes for which no other information is available except the sequence. Long-term benefits can easily be perceived when considering the exponential increase of new genome sequences.
- "Predicting structured-output from protein sequence"
by Andrea Passerini, Assistant Professor, Università degli Studi di Trento
Friday 2 September 2011 from 14:00 to 15:00 (+questions) - NO7.07 - NO building
Abstract: Recent advances in high-throughput sequencing techniques are drastically increasing the amount of biological sequences available for further study. On the other hand, experimentally determining their three-dimensional structure is an expensive and time-consuming process. In this scenario, automatic approaches to sequence analysis are crucial in order to fill this gap and devise information on their biological function.
I will present machine learning techniques for predicting protein structural features from sequence. The talk will focus on challenging problems where the desired output is a discrete structure, e.g. a graph connecting certain residues in the sequence. I will first discuss the prediction of disulphide bridges, i.e. covalent bonds between pairs of cysteines, which help stabilize protein 3D structure and have a relevant structural and functional role. This task can be effectively addressed with a nearest-neighbour approach in the space of candidate configurations. I will then introduce the problem of metal binding site prediction, whose characteristics prevent the application of this method. I will present a search-based structured-output technique relying on an online strategy that learns to discriminate between correct and incorrect moves. The advantages and drawbacks of these algorithms will be discussed together with their applicability to other structured-output problems.
- "Efficient prediction of patterns for context-aware embedded systems"
by Yves Vanrompay, researcher in the Embedded and Ubiquitous Systems taskforce of the Distrinet research group in the department of computer science of the Katholieke Universiteit Leuven
Monday 11 April 2011 from 14:00 to 15:00 (+questions) - Rotule NO8 - NO building
- "Machine learning and Web Mining"
by Doru Tanasa, full time faculty at the International University of Monaco, part-time R&D engineer for Up&Net.
Thursday 7 April 2011 from 14:00 to 15:00 (+questions) - Rotule NO8 - NO building
- "Beyond Space For Spatial Networks"
by Renaud Lambiotte, Imperial College, UK
Friday 25 February 2011 from 14:00 to 15:00 (+questions) - A2.122 - A building
- "Automatic Recognition of Multiparty Human Interactions using Dynamic Bayesian Networks"
by Alfred Dielmann, Research Scientist at Institut Telecom ParisTech, Paris, France
Thursday 24 February 2011 from 14:00 to 15:00 (+questions) - Rotule NO8 - NO building
- "Statistical and relational learning for understanding enzyme function"
by Elisa Cilia, University of Trento, Italy
Friday 4 February 2011 from 12:00 to 13:00 (+questions) - Rotule NO8 - NO building
- "Predictive Network Inference in Colon Cancer"
by Catharina Olsen, PhD student, MLG, ULB
Thursday 27 January 2011 from 12:30 to 13:30 (+questions) - NO7.08 - NO building
2010
- "Automated analysis of biological oscillator models using mode decomposition"
by Tomasz Konopka, Postdoc, Service de Biosystèmes, Biomodélisation et Bioprocédés (3Bio)
Thursday 28 October 2010 from 14:00 to 15:00 (+questions) - Rotule NO8 - NO building
- "Statistical issues in the development of clinically useful biomarkers in oncology from microarrays"
by Stefan Michiels (Bordet, ULB)
Friday 22 October 2010 from 14:00 to 15:00 (+questions) - Rotule NO8 - NO building
- "Deep Web mining and knowledge mining using machine learning"
by Lu Jiang (Erasmus Mundus Exchange Program fellowship)
Friday 01 October 2010 from 15:00 to 16:00 (+questions) - Rotule NO8 - NO building
- "The Bag-of-Frames approach to music genre classification: Challenges and Limitations"
by Miguel Lopes (PhD student, Universidade do Porto)
Wednesday 25 August 2010 from 16:30 to 17:30 (+questions) - Rotule NO8 - NO building
- "Solving Non-Convex Lasso Type Problems With DC Programming"
by Romain Herault (INSA de Rouen, France)
Thursday March 4, 2010, 12:30PM (+questions) - room: NO7.08 - NO building
Abstract: We propose a novel algorithm for addressing the variable selection (or sparsity recovery) problem using non-convex penalties. A generic framework based on DC programming is presented and leads to an iterative weighted lasso-type problem. We then show that many existing approaches for solving such non-convex problems are particular cases of our algorithm. We also provide some empirical evidence that our algorithm outperforms existing ones. Based on the article:
G. Gasso, A. Rakotomamonjy, S. Canu, Recovering sparse signals with non-convex penalties and DC programming, IEEE Trans. Signal Processing, Vol 57, no.12, pp 4686-4698, 2009.
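The "iterative weighted lasso-type problem" mentioned in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: it assumes a log penalty of the form lam * log(eps + |b|), whose linearisation gives weights lam / (eps + |b|), it solves each weighted lasso with plain coordinate descent, and it starts from an ordinary lasso fit. All function names and default values are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Scalar soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def weighted_lasso(X, y, weights, n_sweeps=200):
    """Coordinate descent for 0.5 * ||y - X b||^2 + sum_j weights[j] * |b_j|.
    Assumes no column of X is identically zero."""
    p = X.shape[1]
    b = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0)
    resid = y.astype(float)              # residual y - X b, with b = 0 initially
    for _ in range(n_sweeps):
        for j in range(p):
            # correlation of column j with the residual that excludes b_j
            rho = X[:, j] @ resid + col_ss[j] * b[j]
            new_bj = soft_threshold(rho, weights[j]) / col_ss[j]
            resid += X[:, j] * (b[j] - new_bj)
            b[j] = new_bj
    return b

def reweighted_lasso_log_penalty(X, y, lam=1.0, eps=0.1, n_outer=5):
    """Outer loop: linearise the concave log penalty at the current
    estimate, then solve the resulting weighted lasso."""
    b = weighted_lasso(X, y, np.full(X.shape[1], lam))   # plain lasso start
    for _ in range(n_outer):
        weights = lam / (eps + np.abs(b))   # small coefficients get large weights
        b = weighted_lasso(X, y, weights)
    return b
```

Large coefficients receive small weights and are therefore shrunk less than under the ordinary lasso, while coefficients near zero are penalised more heavily and tend to be set exactly to zero.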
- "Game Tree Search Strategies for Computer Poker"
by Boris Iolis (ULB graduate student)
Thursday 14 January 2010 from 14:00 to 15:00 (+questions) - Rotule NO8 - NO building
2009
- "Biomarker selection from microarray data: a transfer learning approach"
by Pierre Dupont (UCL, Belgium)
Friday 13 November 2009 from 14:30 to 15:30 (+questions) - Rotule NO8 - NO building
- "Network Inference based on Information Theory Applied to Microarray Data"
by Patrick E. Meyer (ULB, Machine Learning Group)
Thursday 5 November 2009 from 12:30 to 13:30 (+questions) - Rotule NO8 - NO building
- "The role of cooperative sensor networks in smart grids"
by Alfredo Vaccaro (U. Sannio, Italy)
30 April 2009 from 15:30 to 16:30 (+questions) - Rotule NO8 - NO building
2008
- "Exploratory Analysis of Functional Data via Clustering and Segmentation"
by Fabrice Rossi (Telecom ParisTech)
Tuesday 16 December 2008 from 14:00 to 15:00 (+questions) - Rotule NO8 - NO building
- "Stochastic self-similar processes and large scale structures"
by Marta Chinnici, PhD (University of Napoli "Federico II")
Monday 10 November 2008 from 10:30 to 11:30 (+questions) - Salle de séminaire NO8 - NO building
- "Data mining with SAS"
by Hadrien Polastro (ULB graduate student)
Wednesday 1 October 2008 from 10:00 to 11:30 - Rotule NO8 - NO building
- Computer Science Department Seminar
"Distributed Indexing and Querying in Sensor Networks using Statistical Models"
by Arnab Bhattacharya from Indian Institute of Technology, Kanpur
Thursday 17 July 2008 from 12:30 @ rotule NO8 (NO building)
- "q-Nested Partial Correlation Graphs for Genetic Network Inference"
by Kevin Kontos from ULB/MLG
Friday 4 July 2008 @ 11:00 - Rotule NO8 - NO building
- "Proposals for Trustable Visualization of High-Dimensional Data"
by Abhilash Miranda and Gianluca Bontempi from ULB/MLG
Tuesday 01 May 2008 @ Indian Institute of Technology, Kanpur
- "Trends in Dimensionality Reduction for Visualization"
by Abhilash Miranda from ULB/MLG
Thursday 13 March 2008 @ 14:30 - Rotule NO8 - NO building
- "Customer Intelligence and Data Mining"
by Martine George from ING Belgium
Tuesday 12 February 2008 @ 12:00 - UB4.136 - Solbosch
- "An Introduction to Entropy Estimation"
by Catharina Olsen from ULB/MLG
Wednesday 6 February 2008 @ 14:30 - Rotule NO8 - NO building
- "Spatial Data Mining: Exemples d'Application à la Détection/Prédiction du Changement"
by Hussein Atoui from ULB/MLG
Tuesday 29 January 2008 @ 14:00 - Rotule NO8 - NO building
- "Hierarchical Visualization using Mixture of PCAs"
by Abhilash Miranda from ULB/MLG
Tuesday 22 January 2008 @ 14:30 - Rotule NO8 - NO building