Workshops

Monday 24th September
MUSE: Mining Ubiquitous and Social Environments
NFMCP: New Frontiers in Mining Complex Patterns
Silver: The Silver Lining – learning from unexpected results
IID: Instant Interactive Data Mining
LDSSB: Learning and Discovery in Symbolic Systems Biology
Friday 28th September
SDAD: Sentiment Discovery from Affective Data
ALRA: Active Learning in Real-world Applications
I-Pat: Mining and exploiting interpretable local patterns
COMMPER: Community Mining and People Recommenders
CoLISD: Collective Learning and Inference on Structured Data
Discovery Challenge: Third Challenge on Large Scale Hierarchical Text Classification

Important Deadlines

Some workshops have extended their paper submission deadline. More information is available on the websites of the individual workshops.

Deadline for submissions: June 29, 2012
Author notification: July 20, 2012
Camera-ready papers due: August 3, 2012
Workshops take place: September 24, 2012 and September 28, 2012

Details about the submission process for each workshop can be found at the corresponding website.

 

Monday Workshops (24th September 2012)

 

MUSE: Mining Ubiquitous and Social Environments

Martin Atzmueller and Andreas Hotho

Website

The goal of this workshop is to promote an interdisciplinary forum for researchers working in the fields of ubiquitous computing, social web, Web 2.0, and social networks who are interested in utilizing data mining in a ubiquitous setting. The workshop seeks contributions adopting state-of-the-art mining algorithms on ubiquitous social data. Papers combining aspects of the two fields are especially welcome. In short, we want to accelerate the process of identifying the power of advanced data mining operating on data collected in ubiquitous and social environments, as well as the process of advancing data mining through lessons learned in analyzing these new data.

 

NFMCP: New Frontiers in Mining Complex Patterns

Annalisa Appice, Michelangelo Ceci, Corrado Loglisci, Giuseppe Manco, Elio Masciari and Zbigniew Ras

Website

NFMCP aims to bring together researchers and practitioners of data mining interested in exploring emerging technologies and applications where complex patterns in expressive languages are principally extracted from new prominent data sources like blogs, event or log data, biological data, spatio-temporal data, social networks, mobility data, sensor data and streams, and so on. We are interested in advanced techniques which preserve the informative richness of data and allow us to efficiently and effectively identify complex information units present in such data.

 

Silver: The Silver Lining – learning from unexpected results

Joaquin Vanschoren and Wouter Duivesteijn

Website

This workshop is dedicated to the proposition that insight often begins with unexpected results. Unexpected results chart the boundaries of our knowledge: they identify errors, reveal false assumptions, and force us to dig deeper. Unfortunately, this process is rarely mentioned in the machine learning and data mining discourse. Indeed, there exists a publication bias that favors (incremental) successes over novel discoveries of why some ideas, while intuitive and plausible, do not work. With this workshop, we want to give a voice to unexpected results that deserve wider dissemination: thoroughly conducted studies that follow a plausible idea that did not achieve the aspired results, but instead taught us novel lessons; studies showing that well-known (successful) methods will not work under certain conditions, highlighting remaining weaknesses and new avenues of research; and stories that focus on how a successful method was discovered after one or several failed attempts.

 

IID: Instant Interactive Data Mining

Jilles Vreeken, Nikolaj Tatti, Bart Goethals, Anton Dries, Matthijs van Leeuwen, Siegfried Nijssen

Website

At IID’12 we will discuss data mining techniques that allow users to interactively explore their data, receiving near-instant updates to every requested refinement. While instant mining and stream mining start from different perspectives and operate under different constraints, there is a significant overlap in techniques, and developments in either setting can have a significant impact on the other. Therefore, this workshop aims to bring together researchers interested in instant and adaptive data mining methods, whether for use in interactive systems or in the processing of large streams of evolving data.

 

LDSSB: Learning and Discovery in Symbolic Systems Biology

Oliver Ray and Katsumi Inoue

Website

Symbolic Systems Biology is a rapidly emerging field involving the application of formal logic-based methods to Systems Biology. Recently a spectrum of such approaches have begun to demonstrate their utility in modelling and analysing a variety of biological phenomena. Examples include Boolean logic, classical logic, modal logics, hybrid logic, rewriting logic, computational logics, constraint programming, formal methods, process calculi, graphical models, and many more. The primary aim of this workshop is to explore how machine learning and knowledge discovery techniques can be used within such formalisms to help learn and revise biological models. A secondary aim is to investigate how symbolic methods can be combined with numerical techniques in order to better handle noise and uncertainty in the real world.

 

Friday Workshops (28th September 2012)

 

SDAD: Sentiment Discovery from Affective Data

Mohamed Medhat Gaber, Mihaela Cocea, Stephan Weibelzahl, Ernestina Menasalvas and Cyril Labbe

Website

The current expansion of social media leads to masses of affective data related to people’s emotions, sentiments and opinions. Knowledge discovery from such data has emerged as an area of research in the past few years, with a number of potential applications of paramount importance to business organisations, individual users and governments. Data mining and machine learning techniques are used to discover knowledge from various types of affective data such as ratings, text or browsing data. Although research in this area has grown considerably in recent years, knowledge discovery from affective data is still in its infancy, with many open issues and challenges that often require interdisciplinary approaches. This workshop aims to bring together researchers in this area to present their latest work, to discuss the challenges in the field and to identify where our efforts, as a research community, should focus.

 

ALRA: Active Learning in Real-world Applications

Laurent Candillier, Max Chevalier and Vincent Lemaire

Website

Machine learning comprises methods and algorithms that allow a model to learn a behavior from examples. Active learning comprises methods that select the examples used to build a training dataset for the predictive model. All of these strategies aim to use as small a set of examples as possible and to select the most informative examples.

When designing active learning algorithms for real-world data, some specific issues are raised. The main ones are scalability and practicability. Methods must be able to handle high volumes of data, and the process for labeling new examples by an expert must be optimized.

We encourage papers that describe real-world applications of active learning. The industrial context, the main difficulties encountered and the original solution developed should be described. Contributions on the associated Nomao challenge (http://www.nomao.com/labs/challenge), which proposes such a practical application of active learning, will also be welcome.

 

I-Pat: Mining and exploiting interpretable local patterns

Henrik Grosskreutz, Stefan Rüping and Nikos Karacapilidis

Website

Local patterns, like itemsets, correlations, contrast sets or subgroups, stand out from other data mining tools by their descriptive nature, which makes them directly interpretable by end users like clinicians, fraud experts or analysts. In this workshop, we wish to investigate typical use cases and key requirements for the successful usage of local pattern mining in applications where, alongside the statistical performance of models, the understandability and interestingness of the models are key success factors.

 

COMMPER: Community Mining and People Recommenders

Panagiotis Papapetrou, Jaakko Hollmen and Luiz Augusto Pizzato

Website

Data mining and knowledge discovery in social networks have advanced significantly over the past several years, due to the availability of a large variety of offline and online social network systems. The focus of COMMPER 2012 is on social networks, with special emphasis on community mining and people recommenders. Community mining involves topics such as the analysis of scientific communities and collaboration networks, including bibliometrics, and the formation of teams. People recommenders cover all topics where recommender systems are used to enable connections among users; such systems can be found on all types of social networks, such as photo sharing websites, expert search, mentoring systems and online dating.

 

CoLISD: Collective Learning and Inference on Structured Data

Balaraman Ravindran, Kristian Kersting, Sriraam Natarajan, S. Shivashankar

Website

Classical ML techniques assume the data to be i.i.d., but real-world data is inherently relational and can generally be represented using graphs or some variants of them. The importance of modelling structured data is evident from its increasing presence: the WWW, social networks, organizational networks, images, protein sequences, relational data, etc. This field has recently been receiving a lot of attention in the community under different themes, depending on the problem addressed and the nature of the solution. Variants include iterative classification, structured prediction, relational learning, etc. While there are other issues such as learning the network structure, CoLISD focuses on the within-network learning and inference tasks, with special emphasis on collective inference.

 

Discovery Challenge: Third Challenge on Large Scale Hierarchical Text Classification

Ion Androutsopoulos, Thierry Artieres, Patrick Gallinari, Eric Gaussier, Aris Kosmopoulos, George Paliouras, Ioannis Partalas

Website

This year’s discovery challenge hosts the third edition of the successful PASCAL challenges on large scale hierarchical text classification. The challenge comprises three tracks and is based on two large datasets created from the ODP web directory (DMOZ) and Wikipedia. The datasets are multi-class, multi-label and hierarchical. The number of categories ranges roughly between 13,000 and 325,000 and the number of documents between 380,000 and 2,400,000.