New release of MOA 12.08

We’ve made a new release of MOA 12.08. The new features of this release are: new rule classification methods: VFDR Rules, from "Learning Decision Rules from Data Streams", IJCAI 2011, J. Gama, P. Kosina; migrated to proper maven project; NaiveBayesMultinomial and SGD updated with adaptive DoubleVector for weights; new multilabel classifiers: Scalable and efficient […]

CFP – Data Streams Track – ACM SAC 2013

ACM SAC 2013: The 28th Annual ACM Symposium on Applied Computing, Coimbra, Portugal, March 18-22, 2013. DATA STREAMS TRACK. CALL FOR PAPERS. The rapid development of information science and technology in general, and the growth in complexity and volume of data in particular, has introduced new challenges for the research community. […]

Summer School on Massive Data Mining, August 8-10, 2012

August 8-10, 2012, IT University of Copenhagen, Denmark. The summer school is aimed at PhD students and young researchers both from the algorithms community and the data mining community. A typical participant will be working in a group that aims at publishing in algorithms conferences such as ESA and SODA, and/or in data mining conferences […]

Big Data Mining (BigMine-12)

Call for Papers: Big Data Mining (BigMine-12). 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications (BigMine-12), a KDD2012 Workshop. KDD2012 Conference Dates: August 12-16, 2012. BigMine-12 Workshop Date: Aug 12, 2012. Beijing, China. Key dates: Papers due: May 9, 2012. Acceptance notification: May 23, 2012. Workshop Final Paper […]

New release of MOA 12.03

We’ve made a new release of MOA 12.03. The new features of this release are: new measure graphic visualization for classification. Classifiers are now in subpackages: classifiers.tree, classifiers.bayes, classifiers.functions, classifiers.meta, classifiers.drift,… HoeffdingTree, HoeffdingTreeNB, and HoeffdingTreeNBAdaptive are now a single classifier: HoeffdingTree, with an option to select how to do the classification at leaves. By default, […]

PRICAI 2012 Special Session on Scalable Big Data Mining, September 3-7, 2012, Kuching, Sarawak, Malaysia. CALL FOR PAPERS. Data have become a torrent flowing in many important areas. Big data refers to datasets whose size is beyond the ability of current state-of-the-art analytic tools. Streaming data is a specific approach to deal with big data that is evolving and […]

Upcoming Conference: “Machine-Learning with Real-time & Streaming Applications”

FIRST CONFERENCE ANNOUNCEMENT: From Data to Knowledge: Machine-Learning with Real-time & Streaming Applications, May 7-11, 2012, on the campus of the University of California, Berkeley. CONFIRMED INVITED SPEAKERS: Olfa Nasraoui (Louisville), Petros Drineas (RPI), Muthu Muthukrishnan (Rutgers), Alex Szalay (Johns Hopkins), David Bader (Georgia Tech), Eamonn Keogh (UC Riverside), Joao […]

IBLStreams (Instance Based Learner on Streams for Regression and Classification)

IBLStreams (Instance Based Learner on Streams) is an instance-based learning algorithm for classification and regression problems on data streams by Ammar Shaker, Eyke Hüllermeier and Jürgen Beringer. The method is able to handle large streams with low requirements in terms of memory and computational power. Moreover, it provides mechanisms for adapting to concept drift […]

New release of MOA 11.10

We’ve made a new release of MOA 11.10. The new features of this release are: new active classification methods: ActiveClassifier; Cluster Mapping Measure (CMM); cleanup of Clustering Setup Panel; export fix for FileStream based clusterings; screenshot button: filename option; wrapper for Weka Clustering algorithms. You find the download link for this release on the […]

Pocket Data Mining Project using MOA

Pocket Data Mining (PDM) is a new term describing collaborative mining of streaming data in mobile and distributed computing environments by researchers Frederic Stahl, Mohamed Medhat Gaber, Max Bramer, and Philip S. Yu. With sheer amounts of data streams now available for subscription on our smart mobile phones, the potential of using this data […]

CFP – Data Streams Track – ACM SAC 2012

DATA STREAMS TRACK. ACM SAC 2012: The 27th Annual ACM Symposium on Applied Computing, Trento University, Italy, March 20-23, 2012. CALL FOR PAPERS. The rapid development of information science and technology in general, and the growth in complexity and volume of data in particular, has introduced new challenges for the research community. […]


HaCDAIS 2011: The 2nd International Workshop on Handling Concept Drift in Adaptive Information Systems. CALL FOR PAPERS. In the real world, data is often non-stationary. In predictive analytics, machine learning and data mining, the phenomenon of unexpected change in underlying data over time is known as concept drift. Changes in underlying data might […]

New release of MOA 11.5

We’ve made a new release of MOA 11.5. The new features of this release are: new classification methods for text and sparse data: NaiveBayesMultinomial, SGD (Stochastic Gradient Descent), and SPegasos; new classification methods: LimAttClassifier, LimAttHoeffdingTree, LimAttHoeffdingTreeNB, LimAttHoeffdingTreeNBAdaptive, Perceptron; new chunk classification and evaluation methods: EvaluateInterleavedChunks, AccuracyUpdatedEnsemble, AccuracyWeightedEnsemble; new regression evaluation methods. Now it is […]

ADMIRE project

MOA is used in the Advanced Data Mining and Integration Research for Europe (ADMIRE) project. The aim of the project is to create an advanced, distributed data analysis platform, where one of the major goals is to provide data stream processing capability. MOA has been very helpful during the development of one of […]

Book “Adaptive Stream Mining: Pattern Learning and Mining from Evolving Data Streams”

By Albert Bifet. This book addresses the design of learning algorithms for mining time-changing data streams. It introduces new contributions on several different aspects of the problem, identifying research opportunities and increasing the scope for applications. It also includes an in-depth study of stream mining and a theoretical analysis of proposed methods and algorithms. The […]

pHMM4weka: Profile Hidden Markov Models (PHMMs) for binary protein classification for WEKA

This Java software implements Profile Hidden Markov Models (PHMMs) for binary protein classification for the WEKA workbench. Standard PHMMs and newly introduced binary PHMMs are used. In addition the software allows propositionalisation of PHMMs. This software was developed by Stefan Mutter during his PhD at the Machine Learning Group at University of Waikato. His thesis investigated similarity […]

The ClusTree: indexing micro-clusters for anytime stream mining

Knowledge and Information Systems, 2010 by Philipp Kranen, Ira Assent, Corinna Baldauf, and Thomas Seidl. Abstract: Clustering streaming data requires algorithms that are capable of updating clustering results for the incoming data. As data is constantly arriving, time for processing is limited. Clustering has to be performed in a single pass over the incoming data […]

KeplerWeka: a module for Kepler providing the functionality of WEKA

KeplerWeka is a module for the open-source scientific workflow system Kepler, providing the full functionality of the WEKA Machine Learning workbench. It is developed by Peter Reutemann at the Machine Learning Group of the University of Waikato. The latest release of KeplerWeka is integrated into the new Kepler 2.x build framework. Kepler is designed to help scientists, analysts, and […]

Third Edition “Data Mining: Practical Machine Learning Tools and Techniques”

By Ian H. Witten, Eibe Frank and Mark A. Hall. Data Mining: Practical Machine Learning Tools and Techniques offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. This highly anticipated third edition of the most acclaimed work on […]

S4: Distributed Stream Computing Platform from Yahoo!

S4 is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous unbounded streams of data. S4 was initially developed to personalize search advertising products at Yahoo!, which operate at a rate of thousands of events per second. MapReduce excels at batch jobs, but is hard to apply to stream […]