
Hello again! In the last lesson, we looked
at training and testing.
We saw that we can evaluate a classifier on
an independent test set, or using a percentage split,
with a certain percentage of the dataset
used to train and the rest used for testing,
or -- and this is generally a very bad idea -- we
can evaluate it on the training set itself,
which gives misleadingly optimistic performance
figures.
In this lesson, we're going to look a little
bit more at training and testing.

In fact, what we're going to do is repeatedly
train and test using percentage split.
Now, in the last lesson, we saw that if you
simply repeat the training and testing, you
get the same result each time, because Weka
initializes the random number generator before
each run. That's deliberate: it makes sure you
get the same result if you repeat the experiment
again tomorrow.
But, there is a way of overriding that.
So, we will be using independent random numbers
on different occasions to produce a percentage
split of the dataset into a training and test
set.
I'm going to open the segment-challenge data
again.
That's what we used before.
Notice there are 1500 instances here;
that's quite a lot.
I'm going to go to Classify.
I'm going to choose J48, our standard method,
I guess.

I'm going to use a percentage split, and because
we've got 1500 instances, I'm going to choose
90% for training and just 10% for testing.
I reckon that 10% -- that's 150 instances -- for
testing is going to give us a reasonable estimate,
and we might as well train on as many as we
can to get the most accurate classifier.
I'm going to run this, and the accuracy figure
I get -- the same as in the last lesson --
is 96.6667%.
Now, that's quoted to misleadingly high precision.
I'm going to call it 96.7%, or 0.967.
And then, I'm going to do it again and see
how much that figure varies when the random
number generator is initialized differently
each time.
If I go to the More options menu, I get a
number of options here which are quite useful:

outputting the model, we're doing that;
outputting statistics;
we can output different evaluation measures;
we're doing the confusion matrix;
we're storing the predictions for visualization;
we can output the predictions if we want;
we can do a cost-sensitive evaluation;
and we can set the random seed for cross-validation
or percentage split.
That's set by default to 1.
I'm going to change that to 2, a different
random seed.
We could also output the source code for the
classifier if we wanted, but I just want to
change the random seed.
Then I want to run it again.
Before we got 0.967, and this time we get 0.94,
94%.
Quite different, you see.
If I were then to change this again to, say,
3, and run it again, again I get 94%.

If I change it again to 4 and run it again,
I get 96.7%.
Let's do one more.
Change it to 5, run it again, and now I get
95.3%.
Here's a table with these figures in it.
If we run it 10 times, we get this set of
results.
Given this set of experimental results, we
can calculate the mean and standard deviation.
The sample mean is the sum of all of these
error figures -- or these success rates, I should say -- 
divided by the number, 10 of
them.
That's 0.949, about 95%.
That's really what we would expect to get.
That's a better estimate than the 96.7% that
we started out with.
A more reliable estimate.
We can calculate the sample variance.
We take the deviation from the mean, we subtract
the mean from each of these numbers, we square that,
add them up, and we divide, not by n,
but by n - 1.
That might surprise you, perhaps.
The reason it's n - 1 is that we've actually
calculated the mean from this sample.
When the mean is calculated from the sample
itself, you need to divide by n - 1, which gives
a slightly larger variance estimate than dividing
by n would.
We take the square root of that, and in this
case, we get a standard deviation of 1.8%.
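Python's standard library computes exactly these two quantities, and `statistics.stdev` divides by n - 1, as just described. As a sketch, here are the five seed results quoted above (the lecture's 0.949 and 1.8% figures come from ten runs, only five of which appear in this transcript):

```python
import statistics

# Accuracies from the runs above: seeds 1-5 gave 96.7%, 94%, 94%, 96.7%, 95.3%.
accuracies = [0.967, 0.940, 0.940, 0.967, 0.953]

mean = statistics.mean(accuracies)
std = statistics.stdev(accuracies)   # sample std dev: divides by n - 1, not n

print(f"mean = {mean:.3f}, std = {std:.3f}")   # mean = 0.953, std = 0.014
```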
Now you can see that the real performance
of J48 on the segment-challenge dataset is
approximately 95% accuracy, plus or minus
approximately 2%.
Anywhere, let's say, between 93% and 97% accuracy.
These figures that you get, that Weka puts
out for you, are misleading.

You need to be careful how you interpret them,
because the result is certainly not 95.333%.
There's a lot of variation in these
figures.
Remember, the basic assumption is the training
and test sets are sampled independently from
an infinite population, and you should expect
a slight variation in results -- perhaps more
than just a slight variation in results.
You can estimate the variation in results
by setting the random-number seed and repeating
the experiment.
You can calculate the mean and the standard
deviation experimentally, which is what we
just did.
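The whole procedure, repeating a seeded split and then summarizing the accuracies with a mean and standard deviation, fits in a short loop. This sketch uses hypothetical synthetic data and a deliberately trivial majority-class predictor standing in for J48; it shows the shape of the experiment, not the segment-challenge result:

```python
import random
import statistics

def make_dataset(n=1500, seed=42):
    """Hypothetical two-class data: the label is True about 70% of the time."""
    rng = random.Random(seed)
    return [rng.random() < 0.7 for _ in range(n)]

def majority_class(train):
    """Trivial stand-in for J48: always predict the commoner training label."""
    return sum(train) >= len(train) / 2

labels = make_dataset()
accuracies = []
for split_seed in range(1, 11):                # ten runs, one seed per run
    rows = labels[:]
    random.Random(split_seed).shuffle(rows)    # seeded 90/10 percentage split
    train, test = rows[:1350], rows[1350:]
    pred = majority_class(train)
    accuracies.append(sum(y == pred for y in test) / len(test))

print(f"{statistics.mean(accuracies):.3f} +/- {statistics.stdev(accuracies):.3f}")
```

Each seed reshuffles before the 90/10 cut, so each run sees a different test set, and the spread of the ten accuracies estimates the variation that a single Weka run hides.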
Off you go now, and do the activity associated
with this lesson.
I'll see you in the next lesson.
Bye!
