Machine Learning for Data Streams


MOA is the most popular open source framework for data stream mining, with a very active and growing community. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation. Like the related WEKA project, MOA is written in Java, but it scales to more demanding problems. A book on MOA has been published by MIT Press.

MOA can be extended with new mining algorithms, stream generators, and evaluation measures. The goal is to provide a benchmark suite for the stream mining community.
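To make the three extension points concrete (a learner, a stream generator, and an evaluation measure), here is a minimal, library-free sketch in Python. It does not use MOA's Java API: the names `MajorityClassLearner`, `biased_stream`, and `prequential_accuracy` are hypothetical and chosen for illustration only. The evaluation shown is the test-then-train (prequential) scheme commonly used for data streams, which is an assumption about what one might plug in, not a description of MOA's internals.

```python
import random
from collections import Counter


class MajorityClassLearner:
    """Hypothetical learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        if not self.counts:
            return None  # nothing learned yet, so no prediction
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        # Update the model incrementally with a single labeled instance.
        self.counts[y] += 1


def biased_stream(n, p_positive=0.8, seed=1):
    """Hypothetical generator: yields (features, label) pairs one at a time."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.random()]
        y = 1 if rng.random() < p_positive else 0
        yield x, y


def prequential_accuracy(learner, stream):
    """Test-then-train: predict each instance first, then learn from it."""
    correct = total = 0
    for x, y in stream:
        if learner.predict(x) == y:
            correct += 1
        learner.learn(x, y)
        total += 1
    return correct / total if total else 0.0


accuracy = prequential_accuracy(MajorityClassLearner(), biased_stream(1000))
print(f"prequential accuracy: {accuracy:.3f}")
```

Because each instance is seen once and then discarded, any of the three pieces can be swapped independently: a different learner only needs `predict` and `learn`, a different generator only needs to yield `(x, y)` pairs, and a different measure only needs to change what the loop accumulates.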

News: Launch of CapyMOA, a new fast streaming library in Python.

CapyMOA is a machine learning library tailored for data streams. It features a Python API tightly integrated with MOA (stream learners), PyTorch (neural networks), and scikit-learn (machine learning), and provides a fast Python interface to state-of-the-art algorithms for data streams.

https://capymoa.org/

New MOA release: 24.07 (July 2024)

Citing MOA

If you want to refer to MOA in a publication, please cite the following JMLR paper:
Albert Bifet, Geoff Holmes, Richard Kirkby, Bernhard Pfahringer (2010); MOA: Massive Online Analysis; Journal of Machine Learning Research 11: 1601-1604

Related Open Source Software

      • CapyMOA, a fast new library for online machine learning in Python.
      • RIVER, a framework for stream mining in Python.
      • streamDM for Spark Streaming, a new framework for Spark.
      • Apache SAMOA, a framework for distributed stream mining that can be used with Apache Flink, Apache Storm, S4, or Samza.
      • streamDM C++, a framework in C++ for data stream mining.
      • ADAMS, a flexible workflow engine suited to maintaining real-world, complex knowledge workflows built on MOA.
      • The MEKA project provides an open source implementation of methods for multi-label classification and evaluation.