Hello! Welcome back to data stream mining
in Weka and MOA.
In today's lesson, we're going to look at
the MOA interface.
MOA can be used in three different ways, using
the graphical user interface, the command
line, or the Java API.
Let's start with classification evaluation.
In the batch setting, we have two different types
of evaluation: holdout, when we have different
data for testing and training, or 10-fold
cross-validation, when we are using the same
data for testing and training.
In the incremental setting, we also have
two types: holdout evaluation
and prequential evaluation.
Let's look at these two types of evaluation.
In holdout evaluation, we train our model
one instance at a time and then, periodically,
we evaluate it using different instances
that are held out for testing.
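
To make the periodic holdout idea concrete, here is a minimal Python sketch. It is not MOA's implementation: the MajorityClass learner, the periodic_holdout function, and the choice to hold out the first test_size instances of the stream are all assumptions made for illustration.

```python
from itertools import islice


class MajorityClass:
    """Toy incremental learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = {}

    def learn(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict(self, x):
        if not self.counts:
            return None  # nothing learned yet
        return max(self.counts, key=self.counts.get)


def periodic_holdout(stream, learner, test_size, sample_frequency):
    """Train one instance at a time; every sample_frequency training
    instances, measure accuracy on a separate held-out test set."""
    stream = iter(stream)
    test_set = list(islice(stream, test_size))  # held out, never trained on
    if not test_set:
        raise ValueError("need at least one held-out test instance")
    results = []
    for trained, (x, y) in enumerate(stream, start=1):
        learner.learn(x, y)
        if trained % sample_frequency == 0:
            correct = sum(learner.predict(tx) == ty for tx, ty in test_set)
            results.append((trained, correct / len(test_set)))
    return results


# Tiny hand-made stream: the first 4 instances are held out for testing.
labels = [1, 1, 0, 1] + [0, 0, 1, 1, 1, 1]
stream = [((i,), y) for i, y in enumerate(labels)]
results = periodic_holdout(stream, MajorityClass(), test_size=4, sample_frequency=3)
print(results)
```

Each entry in results pairs the number of training instances seen so far with the accuracy on the held-out instances at that point.
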
In prequential evaluation, we use
the same data for
testing and training.
In that sense, we
test and train on every one of the instances
of the stream.
Every time a new instance arrives, first we
test and then we train.
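
The test-then-train order can be sketched in a few lines of Python. This is an illustrative sketch rather than MOA's code: the MajorityClass learner and the prequential_accuracy function are hypothetical names introduced here.

```python
class MajorityClass:
    """Toy incremental learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = {}

    def learn(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict(self, x):
        if not self.counts:
            return None  # nothing learned yet
        return max(self.counts, key=self.counts.get)


def prequential_accuracy(stream, learner):
    """Use every instance first for testing, then for training."""
    correct = total = 0
    for x, y in stream:
        correct += int(learner.predict(x) == y)  # first we test ...
        learner.learn(x, y)                      # ... and then we train
        total += 1
    return correct / total


# Five instances that all carry label 1.
stream = [((i,), 1) for i in range(5)]
accuracy = prequential_accuracy(stream, MajorityClass())
print(accuracy)
```

The first instance is tested before the learner has seen anything, so it counts as an error; the remaining four are predicted correctly. Every instance contributes to both testing and training, but never to training before it has been tested.
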
Let's look at the MOA interface.
First, we're going to download the software,
so let's go to the MOA webpage.
From the MOA webpage, we are going to
go to Downloads, and from there, we
are going to download the latest release.
Okay. Once MOA is downloaded,
we can run it from the bin folder: on
Windows using moa.bat and, otherwise, using
moa.sh.
Let's run it.
What we see is that we have several tabs.
One is for classification, another is for
regression, and there are others for clustering,
outliers, and concept drift.
Let's start with classification.
Let's run a task.
We're going to run an evaluation task.
Let's start with a holdout evaluation, with
EvaluatePeriodicHeldOutTest.
We need to specify the learner, in this case
it's going to be the HoeffdingTree.
What's the stream?
In this case, we're going to select the HyperplaneGenerator.
Okay, and then how many instances we want
to use for testing: in this case, we say that
we want to use 1,000 instances for testing and
we want to train on 1,000,000 instances.
And we want to see the results every 10,000
instances.
Okay. That's the definition of the task.
We see that it's specified here, EvaluatePeriodicHeldOutTest,
and then we run it.
We see that here, we have all the results.
And here we see that there's a plot of these
results, where we also have the different measures,
like accuracy and kappa.
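
As a rough illustration of how these two measures relate, here is a small Python sketch. It assumes that "kappa" means Cohen's kappa statistic, which compares the observed accuracy with the agreement expected by chance; the function name accuracy_and_kappa is hypothetical and this is not taken from MOA's source.

```python
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label lists."""
    n = len(y_true)
    # Observed agreement: the plain accuracy.
    p0 = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    pc = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    kappa = (p0 - pc) / (1 - pc) if pc < 1 else 0.0
    return p0, kappa


# A classifier that always predicts the majority class.
acc, kappa = accuracy_and_kappa([1, 1, 1, 0], [1, 1, 1, 1])
print(acc, kappa)
```

In this example the accuracy looks reasonable, but kappa is zero, because always predicting the majority class does no better than chance. That is why it helps to look at both measures on an unbalanced stream.
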
Okay, now let's run a prequential evaluation.
Again, we change the task.
We're going to change to EvaluatePrequential.
We're going to define the learner again;
in this case, it's going to be the HoeffdingTree.
Okay. Then the stream, we're going to select
the HyperplaneGenerator.
Okay, and then we're going to train on 1,000,000
instances, and we're going to look at the
results every 10,000 instances.
Now we run the task.
Here we see the results, and here we see the
evolution of these measures, and now the
nice thing is that we can compare both.
If we look at this, we see that one appears
in red and the other appears in blue.
We can take a look at that and we can also
zoom in to look at it in more detail.
Another way to use MOA is using the command
line.
We can reuse the command text that we had
in the graphical user interface, when we were
selecting the task that we wanted to
run.
We can use the same text, and we can put it
inside the command line.
What we are doing then is executing
the task using moa.DoTask.
Then, we only need to specify the
task, the learner that we want to
use, the stream that we want to use, and how
many instances we want to use.
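
As a sketch of what such a command might look like, the Python snippet below assembles the command line for the two tasks from this lesson. Several things here are assumptions: the build_moa_command helper is hypothetical, the moa.jar classpath is a placeholder (the jar location depends on the release, and MOA's own launch scripts may pass extra Java options), and the option letters (-l learner, -s stream, -n test size, -i number of training instances, -f sample frequency) should be checked against the task text that the graphical user interface shows.

```python
import shlex


def build_moa_command(task, classpath="moa.jar"):
    """Return the argument list that runs one task through moa.DoTask.

    classpath is a placeholder; point it at the MOA jar of your release.
    """
    return ["java", "-cp", classpath, "moa.DoTask", task]


# Task text as it would be copied from the graphical user interface.
holdout_task = (
    "EvaluatePeriodicHeldOutTest -l trees.HoeffdingTree "
    "-s generators.HyperplaneGenerator -n 1000 -i 1000000 -f 10000"
)
prequential_task = (
    "EvaluatePrequential -l trees.HoeffdingTree "
    "-s generators.HyperplaneGenerator -i 1000000 -f 10000"
)

for task in (holdout_task, prequential_task):
    # shlex.join quotes the task so the shell passes it as one argument.
    print(shlex.join(build_moa_command(task)))
```

The whole task text is passed as a single argument, which is why it is quoted in the printed command.
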
In this lesson, we have seen how to use the
MOA interface.
We know that there are three different ways to use it.
We have the graphical user interface, the
command line, and the Java API.
Also, we have seen the two types of evaluation
for incremental learning.
That is the holdout evaluation and the prequential
evaluation.
See you later!
