MEKA Software: A Multi-label Extension to the WEKA Framework

This software provides an open source implementation of the 'pruned sets' and 'classifier chains' methods for multi-label classification. These methods were developed as part of Jesse Read's PhD thesis in the Machine Learning Group at the University of Waikato. See these publications:

Jesse Read. Scalable Multi-label Classification. PhD Thesis, University of Waikato, Hamilton, New Zealand. (2010)

Jesse Read, Bernhard Pfahringer, Geoff Holmes, Eibe Frank. Classifier Chains for Multi-label Classification. In Proc. of 20th European Conference on Machine Learning (ECML 2009). Bled, Slovenia, September 2009.

Jesse Read, Bernhard Pfahringer, Geoff Holmes. Multi-label Classification using Ensembles of Pruned Sets. In Proc. of IEEE International Conference on Data Mining (ICDM 2008). Pisa, Italy, December 2008.

Website

Book “Knowledge Discovery from Data Streams” by João Gama

This book covers the fundamentals of data stream mining and describes important applications, such as TCP/IP traffic, GPS data, sensor networks, and customer click streams. It also addresses several challenges of data mining in the future, when stream mining will be at the core of many applications. These challenges involve designing useful and efficient data mining solutions applicable to real-world problems. In the appendix, the author includes examples of publicly available software and online data sets. This practical, up-to-date book focuses on the new requirements of the next generation of data mining. Although the concepts presented in the text are mainly about data streams, they are also valid for other areas of machine learning and data mining.

https://www.crcpress.com/product/isbn/9781439826119

PAKDD 2011 Tutorial: Handling Concept Drift: Importance, Challenges and Solutions

Tutorial at PAKDD discussing concept drift, and MOA as open source software for dealing with concept drift.

Abstract: In the real world, data often arrives in streams and evolves over time. Concept drift in supervised learning means that the underlying distribution of the data is changing. As a result, predictions might become less accurate as time passes, or opportunities to improve the accuracy might be missed. Therefore, learning models need to adapt to changes quickly and accurately. The proposed tutorial aims to provide a unifying view on basic and applied concept drift research in data mining and related areas. In the first part we will introduce the problem of concept drift, and discuss why changes appear in supervised learning and the motivation for handling them. We will give an overview of the types of application tasks that are available. In the second part we will present available approaches and techniques to handle concept drift, and discuss evaluation issues and open source software. In the third part we will reflect on the past, present and future of concept drift research and outline future research directions. We will focus on the link between research scenarios and application needs.

Presenters:

  • Albert Bifet, University of Waikato, New Zealand
  • João Gama, University of Porto, Portugal
  • Mykola Pechenizkiy, Eindhoven University of Technology, the Netherlands
  • Indrė Žliobaitė, Eindhoven University of Technology, the Netherlands

Tutorial website

The moa, NZ national symbol

The fame of the moa, and the fact that its size made it a world-beater, briefly gave it the status of national symbol in the 19th century. In the 1890s, New Zealand was ‘the land of the moa’, and of 103 entries for a new national coat of arms in 1906–8, 28 included moa. Moa also featured on commercial logos, and in cartoons to represent New Zealand. Its iconic status did not last, however, and the moa was soon replaced by the kiwi.

The moa and the lion. The fame of the moa’s size briefly turned it into a national symbol. This postcard was issued in 1905 to represent the extraordinary success of the New Zealand All Black rugby team during its tour of England that year.

More information.

 

Cooperative Cars

MOA is used as a data stream mining framework in the Cooperative Cars (CoCar) Project, a joint project between Ericsson in Aachen and Fraunhofer FIT. The CoCar project aims at basic research on C2C and C2I communication for future cooperative vehicle applications using cellular mobile communication technologies. Five partners from the telecommunications and automotive industries are developing platform-independent communication protocols and innovative system components. These will be prototyped, implemented and validated in selected applications. Innovation perspectives and potential future network enhancements of cellular systems for supporting cooperative, intelligent vehicles will be identified and demonstrated.

https://dbis.rwth-aachen.de/cms/projects/CoCar