
English: 
Hello again! This is Lesson 1.4, and we're
going to look at the Knowledge Flow Interface.
The Knowledge Flow Interface is an alternative
to the Explorer, and it lets you lay out filters,
classifiers, and evaluators interactively
on a 2D canvas.
There are various other components like data
sources and visualization components, and
so on.
We have different kinds of connections between
the components, and a feature of the Knowledge
Flow Interface is that it can work incrementally
on potentially infinite data streams.
So, let's go ahead and set up a configuration
in the Knowledge Flow Interface.

English: 
I'll just start it up here.
I'm going to load an ARFF file with a DataSource
called an ARFF Loader.
I'm going to configure that - this is a right
click, Configure - to use the iris dataset,
which is here.
Then I'm going to need a Class Assigner to
assign the class.
That's here - Class Assigner.
I can make a connection, and I'm going to
make a Dataset connection to the Class Assigner.
Then I'm going to get a Cross-Validation Fold
Maker, because we're going to evaluate this
with cross-validation.
I'm going to connect up the dataset to the
Cross-Validation Fold Maker.
Then I'm going to get a classifier.
I'll use good old J48.
Here are all of the classifiers.

English: 
J48 is up here with the tree classifiers at
the end.
Let me put that there.
I'm going to connect both the Training Set
and the Test Set from the Cross-Validation
Fold Maker to J48.
I'm going to get a Classifier Performance
Evaluator in the Evaluation tab.
I'm going to connect the classifier - that
is the batch classifier produced by J48 -
to this, and I'm going to connect the output
to a Text Viewer.
Here's a Text Viewer, the textual output.
Then I'm going to start it all up.
I'm going to run it.
With my right-click here, I'm going to Start
Loading.
Let's have a look at this Text Viewer; right-click
to show the results.
Here we go.
These are the results that we've got.
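
As a rough illustration of what this flow computes, here is a minimal
Python sketch of cross-validation: a fold maker hands out one training
set and one test set per fold, a classifier is built on each training
set, and the evaluator pools the test results. This is not Weka's code.
The names make_folds, MajorityClassifier, and cross_validate are
hypothetical, the toy data is made up, the majority-class model is only
a stand-in for J48, and unlike Weka's component this sketch makes no
attempt to stratify the folds.

```python
import random
from collections import Counter


def make_folds(instances, k, seed=1):
    """Yield (train, test) pairs, one per fold."""
    shuffled = instances[:]
    random.Random(seed).shuffle(shuffled)
    for i in range(k):
        # Every k-th instance (starting at i) is held out for testing.
        test = shuffled[i::k]
        train = [x for j, x in enumerate(shuffled) if j % k != i]
        yield train, test


class MajorityClassifier:
    """Stand-in for a real learner: predicts the most common training class."""

    def fit(self, train):
        labels = Counter(label for _, label in train)
        self.label = labels.most_common(1)[0][0]
        return self

    def predict(self, features):
        return self.label


def cross_validate(instances, k=10):
    """Pool correct predictions over all test folds and return accuracy."""
    correct = 0
    for train, test in make_folds(instances, k):
        model = MajorityClassifier().fit(train)
        correct += sum(model.predict(f) == label for f, label in test)
    return correct / len(instances)


# Made-up toy data: (features, class label) pairs.
data = [((i,), "a") for i in range(12)] + [((i,), "b") for i in range(6)]
print(cross_validate(data, k=3))
```

Each instance lands in exactly one test fold, so the pooled accuracy
covers the whole dataset once, which is the figure the Text Viewer
summarizes.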

English: 
Well, we've seen these results before many
times, of course.
There are a lot of different things back on
my slide here.
This is what I've done.
Here's the configuration I set up.
Next, I'm going to add a Model Performance
Chart.
Let's find that.
That would be under Visualization.
Here's our Model Performance Chart.
I'm going to connect the Visualizable Error
to this.
Then I'm going to have a look at the output.
Let me just run this again (Start Loading).
Now I'm going to look at the output (Show
Chart).
Here -- well, you've seen this kind of chart
before.
I could plot, for example, the predicted class
against the actual class.

English: 
There are a lot of different things you could
do.
Back on the slide here: let's work with stream
data.
I'm going to take an ARFF Loader in stream
mode -- not loading a whole dataset, but a
single instance at a time.
We're going to use an updateable classifier,
an incremental evaluator, and look at a Strip
Chart.
We clear all of this over here.
Select "Data Source".
Let's get that ARFF loader going, and configure
it to use the iris data.
Then I'm going to take that to a Class Assigner,
which is in Evaluation.
This time I'm going to make an instance connection:
I'm just going to send a single instance along
here.
And I'm not going to make cross validation
folds; I'm going to take that straight to
an updateable classifier.
There's an updateable version of NaiveBayes.

English: 
Some classifiers are updateable and some aren't.
NaiveBayesUpdateable, let's use that one.
I'm going to connect that instance here to
the updateable NaiveBayes classifier.
Then I'm going to use an Incremental Classifier
Evaluator.
It's an incremental classifier connection that
I'm going to connect up to this.
Now I'm going to take the output from that
and put it on a Strip Chart.
Here's a Strip Chart.
Take the output here to the chart I picked
and put it there.
Okay.
Let's show the Strip Chart, which is blank
at the moment.
Then with my ARFF Loader, I will Start Loading.
You can see a little bit of output here.
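
To make the streaming idea concrete, here is a minimal Python sketch of
an updateable classifier evaluated one instance at a time. It assumes a
test-then-train scheme: each instance is first used to test the current
model and only then used to update it. This is not Weka's
implementation. UpdateableGaussianNB and test_then_train are
hypothetical names, the variance smoothing constant is an arbitrary
choice, and the running accuracy is only a stand-in for whatever the
evaluator actually sends to the Strip Chart.

```python
import math
from collections import defaultdict


class UpdateableGaussianNB:
    """Naive Bayes for numeric attributes that learns one instance at a time."""

    def __init__(self):
        self.count = defaultdict(int)  # class -> instances seen so far
        self.mean = {}                 # class -> running mean per attribute
        self.m2 = {}                   # class -> sum of squared deviations

    def update(self, x, label):
        if label not in self.mean:
            self.mean[label] = [0.0] * len(x)
            self.m2[label] = [0.0] * len(x)
        self.count[label] += 1
        n = self.count[label]
        for i, v in enumerate(x):
            # Welford's online update of mean and squared deviations.
            delta = v - self.mean[label][i]
            self.mean[label][i] += delta / n
            self.m2[label][i] += delta * (v - self.mean[label][i])

    def predict(self, x):
        if not self.count:
            return None  # nothing learned yet
        total = sum(self.count.values())
        best, best_score = None, -math.inf
        for label, n in self.count.items():
            score = math.log(n / total)  # log prior
            for i, v in enumerate(x):
                # Smoothed variance; the 1e-3 floor is an assumption.
                var = self.m2[label][i] / n + 1e-3
                diff = v - self.mean[label][i]
                score += -0.5 * math.log(2 * math.pi * var) - diff * diff / (2 * var)
            if score > best_score:
                best, best_score = label, score
        return best


def test_then_train(stream, model):
    """Yield the running accuracy after each instance."""
    correct = 0
    for seen, (x, label) in enumerate(stream, start=1):
        correct += model.predict(x) == label  # test on the instance first
        model.update(x, label)                # then learn from it
        yield correct / seen


# Made-up stream of two well-separated classes.
stream = [((0.0,), "a"), ((10.0,), "b"), ((0.1,), "a"),
          ((9.9,), "b"), ((0.2,), "a"), ((10.1,), "b")]
for accuracy in test_then_train(stream, UpdateableGaussianNB()):
    print(round(accuracy, 3))
```

The sequence of running accuracies is the kind of series a strip chart
would plot as instances arrive, and nothing in the loop needs more than
the current instance and the model's running statistics.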

English: 
I'm going to use a larger dataset.
I could configure this, of course, but the
simplest thing is to use a larger dataset.
Let me use the segment-challenge dataset and
start loading again.
Now we get this kind of output.
This shows you how the class probabilities
change for one class and for the other class
as we go through.
These are effectively learning curves in this
situation.
We've looked at the Knowledge Flow Interface.
The panels are broadly similar to the Explorer's
with some exceptions.
Evaluation is a separate panel, for example.
The facilities are broadly similar, as well,
with just a couple of notable exceptions.
We can deal incrementally with potentially
infinite datasets.
That's what we just did: the configuration
we just set up loaded from the file incrementally,
so the whole dataset was never held in memory
at once, as it would be in the Explorer.
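
The difference between incremental and all-at-once loading can be
sketched in a few lines of Python. The generator below hands over one
instance at a time from an ARFF source, so the caller never holds more
than the current line. It is a simplified illustration, not Weka's ARFF
Loader: stream_arff_instances is a hypothetical name, the sample data
is made up, and the parsing assumes plain comma-separated values with
no quoting and no sparse format.

```python
import io


def stream_arff_instances(lines):
    """Yield one instance (a list of strings) at a time from an ARFF source."""
    in_data = False
    for line in lines:
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if not in_data:
            # Header lines (@relation, @attribute) are skipped until @data.
            in_data = line.lower() == "@data"
            continue
        yield line.split(",")


# Made-up sample; an open file object could be passed in the same way.
sample = io.StringIO(
    "@relation toy\n"
    "@attribute length numeric\n"
    "@attribute class {a,b}\n"
    "@data\n"
    "1.5,a\n"
    "2.5,b\n"
)
for instance in stream_arff_instances(sample):
    print(instance)
```

Because the source is consumed lazily, the same loop works whether the
file has a hundred instances or keeps growing indefinitely.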

English: 
The Explorer loads everything into memory.
Also, you can look inside cross-validation
at the models for individual folds, which
is exactly what you're going to be doing in
the activity associated with this lesson.
Some people really like graphical interfaces
like this, and it's really good to know about
the Knowledge Flow Interface.
Off you go to the activity.
Good luck, and I'll see you in the next lesson.
Bye for now!
