scikit-multiflow: machine learning for data streams in Python

scikit-multiflow is an open-source machine learning package for streaming data. It extends the scientific tools available in the Python ecosystem. scikit-multiflow is intended for streaming data applications where data is continuously generated and must be processed and analyzed on the go. Data samples are not stored, so learning methods are exposed to new data only once.
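The single-pass constraint described above can be illustrated with a minimal sketch in plain Python. This is not scikit-multiflow's API: the names `synthetic_stream`, `OnlinePerceptron`, `predict_one`, `learn_one`, and `prequential_accuracy` are hypothetical and chosen only for illustration, and the data is a made-up linearly separable concept. Each sample is first used to test the model, then to update it, and is then discarded.

```python
import random
from typing import Iterator, List, Tuple

Sample = Tuple[List[float], int]


def synthetic_stream(n_samples: int, seed: int = 0) -> Iterator[Sample]:
    """Yield (features, label) pairs one at a time; nothing is stored."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        # Hypothetical concept for illustration: label is 1 above the line x0 + x1 = 0.
        y = 1 if x[0] + x[1] > 0 else 0
        yield x, y


class OnlinePerceptron:
    """Illustrative incremental linear classifier (not a scikit-multiflow class)."""

    def __init__(self, n_features: int, learning_rate: float = 0.1) -> None:
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_one(self, x: List[float]) -> int:
        activation = self.bias + sum(w * v for w, v in zip(self.weights, x))
        return 1 if activation > 0 else 0

    def learn_one(self, x: List[float], y: int) -> None:
        # Update only on a mistake; error is -1, 0, or 1.
        error = y - self.predict_one(x)
        if error != 0:
            step = self.learning_rate * error
            self.weights = [w + step * v for w, v in zip(self.weights, x)]
            self.bias += step


def prequential_accuracy(stream: Iterator[Sample], model: OnlinePerceptron) -> Tuple[int, float]:
    """Test-then-train loop: predict on each sample, then learn from it once."""
    n_seen = 0
    n_correct = 0
    for x, y in stream:
        if model.predict_one(x) == y:  # test first, on data the model has not seen
            n_correct += 1
        model.learn_one(x, y)          # then train; the sample is not kept
        n_seen += 1
    accuracy = n_correct / n_seen if n_seen else 0.0
    return n_seen, accuracy


n_seen, accuracy = prequential_accuracy(synthetic_stream(2000), OnlinePerceptron(n_features=2))
print(f"samples seen: {n_seen}, prequential accuracy: {accuracy:.3f}")
```

Because the stream is a generator, memory use stays constant regardless of how many samples arrive, and the reported accuracy reflects performance on data the model had not yet learned from.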

You can follow the development of scikit-multiflow in the GitHub repository.

scikit-multiflow is part of the stream learning ecosystem. Other tools include MOA, the most popular open-source machine learning framework for data streams, and MEKA, an open-source implementation of methods for multi-label learning. Both MOA and MEKA are written in Java.

In Python, scikit-multiflow complements packages such as scikit-learn, whose primary focus is batch learning.

New release v0.5 is now available

This release adds support for delayed labels in supervised learning, along with new methods for classification, regression, and drift detection, among other changes.

See the highlights.