Hello! Welcome back to Class 2: Data stream
mining in Weka and MOA.
In this lesson, we're going to look at Weka's
MOA package.
Let's start.
What is MOA? MOA is open source software
that is specifically designed for mining data
streams.
The most important thing about MOA is that it
can handle evolving data streams, that is, data
streams that change over time, data streams with
concept drift.
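To make "concept drift" a bit more concrete, here is a minimal Python sketch.
It is not MOA code, and the names drifting_stream and accuracy are hypothetical.
It assumes a very simple, abrupt drift: the rule that assigns the class label
is reversed halfway through the stream, so a model learned before the drift
stops working afterwards.

```python
import random


def drifting_stream(n, drift_at, seed=1):
    """Yield (x, label) pairs; the labelling rule is reversed at position drift_at."""
    rng = random.Random(seed)
    for i in range(n):
        x = rng.random()
        # Before the drift the class is 1 when x > 0.5; afterwards the rule flips.
        label = int(x > 0.5) if i < drift_at else int(x <= 0.5)
        yield x, label


def accuracy(pairs, rule):
    """Fraction of (x, label) pairs for which rule(x) equals the label."""
    pairs = list(pairs)
    return sum(rule(x) == y for x, y in pairs) / len(pairs)


stream = list(drifting_stream(n=2000, drift_at=1000))


def old_rule(x):
    # Stands in for a model that was learned before the drift and never updated.
    return int(x > 0.5)


acc_before = accuracy(stream[:1000], old_rule)
acc_after = accuracy(stream[1000:], old_rule)
print(acc_before, acc_after)  # prints: 1.0 0.0
```

A static model is perfect before the drift and always wrong after it, which is
why stream learners need to keep adapting as the data changes.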
MOA has many methods for classification,
regression, clustering, frequent pattern mining,
outlier detection, and concept drift detection,
and it is very easy to use and very easy to extend.
MOA can be used alone, or it can be used with
Weka, ADAMS, or MEKA.
ADAMS is a very nice workflow engine where
you can develop your workflows, and MEKA is
software specifically for multi-label learning.
MOA runs on a single computer.
If you need to do data stream mining
on a cluster of computers, like the ones that
are used in the Hadoop ecosystem, then Apache
SAMOA could be the right tool for you.
SAMOA allows you to run stream mining jobs
on Apache Storm, Apache S4, Apache Samza,
and nowadays Apache Flink.
As you know, New Zealand is very famous for
its birds, and the weka is a native bird of New Zealand.
The moa is also a native bird of New Zealand.
Like the weka, it was flightless, but nowadays
it is extinct.
As you can see in these pictures, the moa
was a very large bird. If you compare it here
with a weka or a kiwi, you can see
that the moa was really large, and here it is
compared with the size of a human.
Let's see how to install the MOA package in
Weka.
We're going to go to Tools.
We're going to open the Package manager.
From there, we are going to look for the Massive
Online Analysis package.
Let's look for it.
We should go to the "m".
MassiveOnlineAnalysis.
Then we install it.
We say "Yes". OK.
And once it's installed, we go to the Explorer,
and then we can see the objects of MOA inside Weka.
So, there are two types.
One is the data generators and the other is
the classifiers.
For example, if we want to generate data now,
we can look at the MOA generator, and from
the MOA generator we can access all the data
stream generators inside MOA.
Okay. Let's choose one, for example.
I will generate some data.
The other important thing is that we can have
access to all the MOA classifiers.
So, we go inside classifiers, and inside meta we'll
find this MOA classifier. From this MOA
classifier, we can get access to all the classifiers
in MOA.
Here are all the classifiers in MOA.
Let's choose, for example, a meta classifier.
We can choose online bagging, which is called
OzaBag.
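To give a rough idea of what online bagging does, here is a minimal Python
sketch. It is not MOA's OzaBag implementation. It assumes the usual online
bagging idea of showing each incoming instance to each ensemble member k times,
with k drawn from a Poisson(1) distribution, and it uses a toy base learner.
The names NearestMean, poisson1, and OnlineBagging are hypothetical.

```python
import math
import random
from collections import Counter


class NearestMean:
    """Toy incremental base learner: one numeric feature, classes 0 and 1."""

    def __init__(self):
        self.total = {0: 0.0, 1: 0.0}
        self.count = {0: 0, 1: 0}

    def train(self, x, y, weight):
        # A weight of k is equivalent to presenting the instance k times.
        self.total[y] += weight * x
        self.count[y] += weight

    def predict(self, x):
        seen = [c for c in (0, 1) if self.count[c] > 0]
        if not seen:
            return 0  # arbitrary default before any training
        # Predict the class whose running mean is closest to x.
        return min(seen, key=lambda c: abs(x - self.total[c] / self.count[c]))


def poisson1(rng):
    """Draw from a Poisson distribution with mean 1 (Knuth's method)."""
    limit, k, p = math.exp(-1.0), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


class OnlineBagging:
    """Each instance is given to each learner k ~ Poisson(1) times."""

    def __init__(self, n_learners=10, seed=1):
        self.rng = random.Random(seed)
        self.learners = [NearestMean() for _ in range(n_learners)]

    def train(self, x, y):
        for learner in self.learners:
            k = poisson1(self.rng)
            if k > 0:
                learner.train(x, y, k)

    def predict(self, x):
        # Majority vote over the ensemble members.
        votes = Counter(learner.predict(x) for learner in self.learners)
        return votes.most_common(1)[0][0]


rng = random.Random(7)
ensemble = OnlineBagging()
for _ in range(500):
    x = rng.random()
    ensemble.train(x, int(x > 0.5))  # one pass, one instance at a time

print(ensemble.predict(0.9), ensemble.predict(0.1))
```

The Poisson(1) draw stands in for the resampling step of ordinary bagging,
which cannot be done on a stream because the full dataset is never available
at once.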
And that's all.
In this lesson, we have seen how to install
the MOA package in Weka.
We have seen that MOA is open source software
specifically designed for data stream mining.
It handles evolving data streams, and it has
many, many different methods for clustering,
classification, regression, frequent pattern
mining, outlier detection, and concept drift detection.
It is very easy to use and very easy to extend.
See you in the next lesson!
