The Junior Research Group "Exploratory Data Analysis", headed by Jilles Vreeken.

Exploratory Data Analysis
Vision and Research Strategy
Currently, we are investigating statistical and information-theoretic techniques for identifying informative local structures, such as patterns, in large graphs as well as in large collections of real-valued data. We also study how to efficiently mine good data descriptions directly from rich data, and develop well-founded approaches for meaningfully comparing and validating exploratory data analysis results.
Composition of Group
Our group was founded in October 2013 in the Cluster of Excellence. Currently, it consists of Dr. Jilles Vreeken (head), Dr. Mario Boley (postdoc), and the PhD students Kailash Budhathoki, Sebastian Dalleiger, Jonas Fischer, Janis Kalofolias, David Kaltenpoth, Panagiotis Mandros, and Alexander Marx.
We are always looking for PhD candidates, postdocs, and HiWis (student research assistants) with a background and interest in data mining, machine learning, statistics, and/or mathematics.

Dr. Jilles Vreeken
Jilles Vreeken is the head of the Independent Research Group on Exploratory Data Analysis within the Cluster of Excellence.
Publications
- Mario Boley and Bryan R. Goldsmith and Luca M. Ghiringhelli and Jilles Vreeken Identifying consistent statements about numerical data with dispersion-corrected subgroup discovery In: Data Min. Knowl. Discov., 2017
- Janis Kalofolias and Mario Boley and Jilles Vreeken Efficiently Discovering Locally Exceptional Yet Globally Representative Subgroups In: 2017 IEEE International Conference on Data Mining, ICDM 2017, New Orleans, LA, USA, November 18-21, 2017, 2017
- Alexander Marx and Jilles Vreeken Telling Cause from Effect Using MDL-Based Local and Global Regression In: 2017 IEEE International Conference on Data Mining, ICDM 2017, New Orleans, LA, USA, November 18-21, 2017, 2017
- Kailash Budhathoki and Jilles Vreeken MDL for Causal Inference on Discrete Data In: 2017 IEEE International Conference on Data Mining, ICDM 2017, New Orleans, LA, USA, November 18-21, 2017, 2017
- Panagiotis Mandros and Mario Boley and Jilles Vreeken Discovering Reliable Approximate Functional Dependencies In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13 - 17, 2017, 2017
- Roel Bertens and Jilles Vreeken and Arno Siebes Efficiently Discovering Unexpected Pattern-Co-Occurrences In: Proceedings of the 2017 SIAM International Conference on Data Mining, Houston, Texas, USA, April 27-29, 2017, 2017
- Kailash Budhathoki and Jilles Vreeken Correlation by Compression In: Proceedings of the 2017 SIAM International Conference on Data Mining, Houston, Texas, USA, April 27-29, 2017, 2017
- Robert Pienta and Minsuk Kahng and Zhiyuan Lin and Jilles Vreeken and Partha P. Talukdar and James Abello and Ganesh Parameswaran and Duen Horng Chau FACETS: Adaptive Local Exploration of Large Graphs In: Proceedings of the 2017 SIAM International Conference on Data Mining, Houston, Texas, USA, April 27-29, 2017, 2017
- Apratim Bhattacharyya and Jilles Vreeken Efficiently Summarising Event Sequences with Rich Interleaving Patterns In: Proceedings of the 2017 SIAM International Conference on Data Mining, Houston, Texas, USA, April 27-29, 2017, 2017
- Kumaripaba Athukorala and Dorota Glowacka and Giulio Jacucci and Antti Oulasvirta and Jilles Vreeken Is exploratory search different? A comparison of information search behavior for exploratory and lookup tasks In: JASIST, 2016
- Kailash Budhathoki and Jilles Vreeken Causal Inference by Compression In: IEEE 16th International Conference on Data Mining, ICDM 2016, December 12-15, 2016, Barcelona, Spain, 2016
- Roel Bertens and Jilles Vreeken and Arno Siebes Keeping it Short and Simple: Summarising Complex Event Sequences with Multivariate Patterns In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13-17, 2016, 2016
- Polina Rozenshtein and Aristides Gionis and B. Aditya Prakash and Jilles Vreeken Reconstructing an Epidemic Over Time In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13-17, 2016, 2016
- Hoang Vu Nguyen and Jilles Vreeken Flexibly Mining Better Subgroups In: Proceedings of the 2016 SIAM International Conference on Data Mining, Miami, Florida, USA, May 5-7, 2016, 2016
- Hoang Vu Nguyen and Panagiotis Mandros and Jilles Vreeken Universal Dependency Analysis In: Proceedings of the 2016 SIAM International Conference on Data Mining, Miami, Florida, USA, May 5-7, 2016, 2016
- Hoang Vu Nguyen and Jilles Vreeken Linear-time Detection of Non-linear Changes in Massively High Dimensional Time Series In: Proceedings of the 2016 SIAM International Conference on Data Mining, Miami, Florida, USA, May 5-7, 2016, 2016
- Hoang Vu Nguyen and Emmanuel Müller and Jilles Vreeken and Klemens Böhm Erratum to: Unsupervised interaction-preserving discretization of multivariate data In: Data Min. Knowl. Discov., 2015
- Arthur Zimek and Jilles Vreeken The blind men and the elephant: on meeting the problem of multiple truths in data from clustering and pattern mining perspectives In: Machine Learning, 2015
- Danai Koutra and U. Kang and Jilles Vreeken and Christos Faloutsos Summarizing and understanding large graphs In: Statistical Analysis and Data Mining, 2015
- Hoang Vu Nguyen and Jilles Vreeken Non-parametric Jensen-Shannon Divergence In: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2015, Porto, Portugal, September 7-11, 2015, Proceedings, Part II, 2015
- Kailash Budhathoki and Jilles Vreeken The Difference and the Norm - Characterising Similarities and Differences Between Databases In: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2015, Porto, Portugal, September 7-11, 2015, Proceedings, Part II, 2015
- Sanjar Karaev and Pauli Miettinen and Jilles Vreeken Getting to Know the Unknown Unknowns: Destructive-Noise Resistant Boolean Matrix Factorization In: Proceedings of the 2015 SIAM International Conference on Data Mining, Vancouver, BC, Canada, April 30 - May 2, 2015, 2015
- Shashidhar Sundareisan and Jilles Vreeken and B. Aditya Prakash Hidden Hazards: Finding Missing Nodes in Large Graph Epidemics In: Proceedings of the 2015 SIAM International Conference on Data Mining, Vancouver, BC, Canada, April 30 - May 2, 2015, 2015
- Jilles Vreeken Causal Inference by Direction of Information In: Proceedings of the 2015 SIAM International Conference on Data Mining, Vancouver, BC, Canada, April 30 - May 2, 2015, 2015
- B. Aditya Prakash, Jilles Vreeken & Christos Faloutsos Efficiently Spotting the Starting Points of an Epidemic in a Large Graph In: Knowledge and Information Systems, 2014
- Nguyen, Hoang-Vu and Müller, Emmanuel and Vreeken, Jilles and Böhm, Klemens Multivariate Maximal Correlation Analysis In: Proceedings of the International Conference on Machine Learning (ICML), 2014
- Nguyen, Hoang-Vu and Müller, Emmanuel and Vreeken, Jilles and Böhm, Klemens Unsupervised Interaction-Preserving Discretization of Multivariate Data In: Data Mining and Knowledge Discovery, 2014
- Danai Koutra, U Kang, Jilles Vreeken & Christos Faloutsos VoG: Summarizing and Understanding Large Graphs In: Proceedings of the SIAM International Conference on Data Mining (SDM), 2014
- B. Aditya Prakash and Jilles Vreeken and Christos Faloutsos Efficiently spotting the starting points of an epidemic in a large graph In: Knowl. Inf. Syst., 2014
- Hoang Vu Nguyen and Emmanuel Müller and Jilles Vreeken and Klemens Böhm Unsupervised interaction-preserving discretization of multivariate data In: Data Min. Knowl. Discov., 2014
- Hao Wu and Jilles Vreeken and Nikolaj Tatti and Naren Ramakrishnan Uncovering the plot: detecting surprising coalitions of entities in multi-relational schemas In: Data Min. Knowl. Discov., 2014
- Pauli Miettinen and Jilles Vreeken MDL4BMF: Minimum Description Length for Boolean Matrix Factorization In: TKDD, 2014
- Kumaripaba Athukorala and Antti Oulasvirta and Dorota Glowacka and Jilles Vreeken and Giulio Jacucci Narrow or Broad?: Estimating Subjective Specificity in Exploratory Search In: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014, Shanghai, China, November 3-7, 2014, 2014
- Erdal Kuzey and Jilles Vreeken and Gerhard Weikum A Fresh Look on Knowledge Bases: Distilling Named Events from News In: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014, Shanghai, China, November 3-7, 2014, 2014
- Hoang Vu Nguyen and Emmanuel Müller and Jilles Vreeken and Pavel Efros and Klemens Böhm Multivariate Maximal Correlation Analysis In: Proceedings of the 31st International Conference on Machine Learning, ICML 2014, Beijing, China, 21-26 June 2014, 2014
- Danai Koutra and U. Kang and Jilles Vreeken and Christos Faloutsos VOG: Summarizing and Understanding Large Graphs In: Proceedings of the 2014 SIAM International Conference on Data Mining, Philadelphia, Pennsylvania, USA, April 24-26, 2014, 2014
- Kumaripaba Athukorala and Antti Oulasvirta and Dorota Glowacka and Jilles Vreeken and Giulio Jacucci Interaction Model to Predict Subjective-Specificity of Search Results In: Posters, Demos, Late-breaking Results and Workshop Proceedings of the 22nd Conference on User Modeling, Adaptation, and Personalization (UMAP2014), Aalborg, Denmark, July 7-11, 2014, 2014
- Kumaripaba Athukorala and Antti Oulasvirta and Dorota Glowacka and Jilles Vreeken and Giulio Jacucci Supporting Exploratory Search Through User Modeling In: Posters, Demos, Late-breaking Results and Workshop Proceedings of the 22nd Conference on User Modeling, Adaptation, and Personalization (UMAP2014), Aalborg, Denmark, July 7-11, 2014, 2014
- Jilles Vreeken and Nikolaj Tatti Interesting Patterns In: Frequent Pattern Mining, 2014
- Matthijs van Leeuwen and Jilles Vreeken Mining and Using Sets of Patterns through Compression In: Frequent Pattern Mining, 2014
- Arthur Zimek and Ira Assent and Jilles Vreeken Frequent Pattern Mining Algorithms for Data Clustering In: Frequent Pattern Mining, 2014
- Arno Siebes and Jilles Vreeken and Matthijs van Leeuwen Frequent Pattern Mining In: Proceedings of the Sixth SIAM International Conference on Data Mining, April 20-22, 2006, Bethesda, MD, USA, 2014
- Michael Mampaey & Jilles Vreeken Summarizing Categorical Data by Clustering Attributes In: Data Mining and Knowledge Discovery, 2013
- Leman Akoglu, Jilles Vreeken, Hanghang Tong, Duen Horng Chau, Nikolaj Tatti & Christos Faloutsos Mining Connection Pathways for Marked Nodes in Large Graphs In: Proceedings of the SIAM International Conference on Data Mining (SDM), 2013
- Hoang-Vu Nguyen, Emmanuel Müller, Jilles Vreeken, Fabian Keller & Klemens Böhm CMI: An Information-Theoretic Contrast Measure for Enhancing Subspace Cluster and Outlier Detection In: Proceedings of the SIAM International Conference on Data Mining (SDM), 2013
- Jan Ramon, Pauli Miettinen & Jilles Vreeken Detecting Bicliques in GF[q] In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'13), 2013
- Kleanthis-Nikolaos Kontonasios, Jilles Vreeken & Tijl De Bie Maximum Entropy Models for Iteratively Identifying Subjectively Interesting Structure in Real-Valued Data In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'13), 2013
- Emin Aksehirli, Bart Goethals, Emmanuel Müller & Jilles Vreeken Mining Frequently Co-occurring Object Sets over Multiple Neighborhoods In: Proceedings of the IEEE International Conference on Data Mining (ICDM), 2013
- Leman Akoglu, Emmanuel Müller & Jilles Vreeken Proceedings of the ACM SIGKDD Workshop on Outlier Detection and Description (ODD'13), 2013
- Duen Horng Chau, Jilles Vreeken, Matthijs van Leeuwen & Christos Faloutsos Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics (IDEA'13), 2013
- Michael Mampaey, Jilles Vreeken & Nikolaj Tatti Summarizing Data Succinctly with the Most Informative Itemsets In: Transactions on Knowledge Discovery from Data, 2013
- Michael Mampaey and Jilles Vreeken Summarizing categorical data by clustering attributes In: Data Min. Knowl. Discov., 2013
- Emin Aksehirli and Bart Goethals and Emmanuel Müller and Jilles Vreeken Cartification: A Neighborhood Preserving Transformation for Mining High Dimensional Data In: ICDM, 2013
- Kleanthis-Nikolaos Kontonasios and Jilles Vreeken and Tijl De Bie Maximum Entropy Models for Iteratively Identifying Subjectively Interesting Structure in Real-Valued Data In: ECML/PKDD (2), 2013
- Jan Ramon and Pauli Miettinen and Jilles Vreeken Detecting Bicliques in GF[q] In: ECML/PKDD (1), 2013
- Leman Akoglu and Duen Horng Chau and Christos Faloutsos and Nikolaj Tatti and Hanghang Tong and Jilles Vreeken Mining Connection Pathways for Marked Nodes in Large Graphs In: SDM, 2013
- Hoang Vu Nguyen and Emmanuel Müller and Jilles Vreeken and Fabian Keller and Klemens Böhm CMI: An Information-Theoretic Contrast Measure for Enhancing Subspace Cluster and Outlier Detection In: SDM, 2013
- Geoffrey I. Webb and Jilles Vreeken Efficient Discovery of the Most Interesting Associations In: Frequent Pattern Mining, 2013
- Nikolaj Tatti & Jilles Vreeken Comparing Apples and Oranges - Measuring Differences between Exploratory Data Mining Results In: Data Mining and Knowledge Discovery, 2012
- Leman Akoglu, Hanghang Tong, Jilles Vreeken & Christos Faloutsos CompreX: Compression based Anomaly Detection In: Proceedings of ACM Conference on Information and Knowledge Management (CIKM'12), 2012
- B. Aditya Prakash, Jilles Vreeken & Christos Faloutsos Spotting Culprits in Epidemics: How many and Which ones? In: Proceedings of the IEEE International Conference on Data Mining (ICDM'12), 2012
- Koen Smets & Jilles Vreeken Slim: Directly Mining Descriptive Patterns In: Proceedings of the SIAM International Conference on Data Mining (SDM'12), 2012
- Nikolaj Tatti & Jilles Vreeken The Long and the Short of It: Summarizing Event Sequences with Serial Episodes In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'12), 2012
- Nikolaj Tatti & Jilles Vreeken Discovering Descriptive Tile Trees by Fast Mining of Optimal Geometric Subtiles In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'12), 2012
- Leman Akoglu, Jilles Vreeken, Hanghang Tong, Duen Horng Chau & Christos Faloutsos Mining and Visualizing Connection Pathways in Large Information Networks In: Proceedings of the Workshop on Information in Networks (WIN'12), 2012
- Duen Horng Chau, Leman Akoglu, Jilles Vreeken, Hanghang Tong & Christos Faloutsos TourViz: Interactive Visualization of Connection Pathways in Large Graphs In: Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'12), 2012
- Duen Horng Chau, Leman Akoglu, Jilles Vreeken, Hanghang Tong & Christos Faloutsos Interactively and Visually Exploring Tours of Marked Nodes in Large Graphs In: Proceedings of IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM'12), 2012
- Jilles Vreeken & Nikolaj Tatti Summarising Event Sequences with Serial Episodes In: Proceedings of the fifth Workshop on Information Theoretic Methods in Science and Engineering (WITMSE'12), 2012
- Hao Wu, Michael Mampaey, Nikolaj Tatti, Jilles Vreeken, M. Shahriar Hossain & Naren Ramakrishnan Where Do I Start? Algorithmic Strategies to Guide Intelligence Analysts In: Proceedings of the ACM SIGKDD Workshop on Intelligence and Security Informatics (ISI-KDD'12), 2012
- Jilles Vreeken, Charles X. Ling, Arno Siebes, Mohammed J. Zaki, Jeffrey Xu Yu, Bart Goethals, Geoffrey I. Webb, Xindong Wu Proceedings of the 12th IEEE International Conference on Data Mining Workshops (ICDMW'12) In: IEEE, 2012
- Jilles Vreeken, Matthijs van Leeuwen, Siegfried Nijssen, Nikolaj Tatti, Anton Dries & Bart Goethals Proceedings of the ECML PKDD Workshop on Instant Interactive Data Mining (IID), 2012
- Nikolaj Tatti and Jilles Vreeken Comparing apples and oranges: measuring differences between exploratory data mining results In: Data Min. Knowl. Discov., 2012
- Michael Mampaey and Jilles Vreeken and Nikolaj Tatti Summarizing data succinctly with the most informative itemsets In: TKDD, 2012
- Duen Horng Chau and Leman Akoglu and Jilles Vreeken and Hanghang Tong and Christos Faloutsos Interactively and Visually Exploring Tours of Marked Nodes in Large Graphs In: ASONAM, 2012
- Leman Akoglu and Hanghang Tong and Jilles Vreeken and Christos Faloutsos Fast and reliable anomaly detection in categorical data In: CIKM, 2012
- B. Aditya Prakash and Jilles Vreeken and Christos Faloutsos Spotting Culprits in Epidemics: How Many and Which Ones? In: ICDM, 2012
- Nikolaj Tatti and Jilles Vreeken The long and the short of it: summarising event sequences with serial episodes In: KDD, 2012
- Duen Horng Chau and Leman Akoglu and Jilles Vreeken and Hanghang Tong and Christos Faloutsos TourViz: interactive visualization of connection pathways in large graphs In: KDD, 2012
- Nikolaj Tatti and Jilles Vreeken Discovering Descriptive Tile Trees - By Mining Optimal Geometric Subtiles In: ECML/PKDD (1), 2012
- Koen Smets and Jilles Vreeken Slim: Directly Mining Descriptive Patterns In: SDM, 2012
- 12th IEEE International Conference on Data Mining Workshops, ICDM Workshops, Brussels, Belgium, December 10, 2012 In: ICDM Workshops, 2012
- Jilles Vreeken, Matthijs van Leeuwen & Arno Siebes Krimp: Mining Itemsets that Compress In: Data Min. Knowl. Discov., 2011
- Kleanthis-Nikolaos Kontonasios, Jilles Vreeken & Tijl De Bie Maximum Entropy Modelling for Assessing Results on Real-Valued Data In: Proceedings of the IEEE International Conference on Data Mining (ICDM'11), 2011
- Bart Goethals, Sandy Moens & Jilles Vreeken MIME: A Framework for Interactive Visual Pattern Mining In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'11), 2011
- Jilles Vreeken & Arthur Zimek When Pattern met Subspace Cluster, a Relationship Story In: Proceedings of the 2nd Workshop on Discovering, Summarizing and Using Multiple Clusterings (MultiClust'11), 2011
- Bart Goethals, Sandy Moens & Jilles Vreeken MIME: A Framework for Interactive Visual Pattern Mining In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'11), 2011
- Nikolaj Tatti & Jilles Vreeken Comparing Apples and Oranges: Measuring Differences between Data Mining Results In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'11), 2011
- Koen Smets & Jilles Vreeken The Odd One Out: Identifying and Characterising Anomalies In: Proceedings of the SIAM International Conference on Data Mining (SDM'11), 2011
- Michael Mampaey, Nikolaj Tatti & Jilles Vreeken Tell Me What I Need to Know: Succinctly Summarizing Data with Itemsets In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'11), 2011
- Jilles Vreeken and Matthijs van Leeuwen and Arno Siebes Krimp: mining itemsets that compress In: Data Min. Knowl. Discov., 2011
- Pauli Miettinen & Jilles Vreeken Model Order Selection for Boolean Matrix Factorization In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'11), 2011
- Kleanthis-Nikolaos Kontonasios and Jilles Vreeken and Tijl De Bie Maximum Entropy Modelling for Assessing Results on Real-Valued Data In: ICDM, 2011
- Pauli Miettinen and Jilles Vreeken Model order selection for boolean matrix factorization In: KDD, 2011
- Michael Mampaey and Nikolaj Tatti and Jilles Vreeken Tell me what I need to know: succinctly summarizing data with itemsets In: KDD, 2011
- Bart Goethals and Sandy Moens and Jilles Vreeken MIME: a framework for interactive visual pattern mining In: KDD, 2011
- Jilles Vreeken and Arthur Zimek When Pattern Met Subspace Cluster In: MultiClust@ECML/PKDD, 2011
- Nikolaj Tatti and Jilles Vreeken Comparing Apples and Oranges - Measuring Differences between Data Mining Results In: ECML/PKDD (3), 2011
- Bart Goethals and Sandy Moens and Jilles Vreeken MIME: A Framework for Interactive Visual Pattern Mining In: ECML/PKDD (3), 2011
- Koen Smets and Jilles Vreeken The Odd One Out: Identifying and Characterising Anomalies In: SDM, 2011
- Jilles Vreeken Making pattern mining useful In: SIGKDD Explorations, 2010