
English: 
Hello! My name is Ian Witten, and I'm from 
the University of Waikato here in New
Zealand, and I want to tell you about our new
free, online course—Data Mining with Weka. 
We're overwhelmed by data in the world today.
Every time we check out an item at the supermarket,
every time we swipe our credit card,
every time we send an email, 
every time we type a keystroke on our keyboard, every time we 
make a phone call, send a text, or walk past
a security camera,
we all generate a little bit of data.
Data mining is about taking this raw data, 
and transforming it into something more useful, 
information, perhaps, or predictions,
predictions about what might happen next, 
predictions that can be used in the real world. 
Let me give you an example. 
You're standing at the supermarket checkout. 
Every item is recorded by the till and 
at the end, you give them your loyalty card, and 

Chinese: 
大家好！我叫Ian Witten，我来自新西兰的Waikato大学。
今天我向你们介绍一门新的免费的在线学习课程，
“Weka在数据挖掘中的运用”。
我们被当今世界各种各样的数据包围着。
当我们在超市购物时，
当我们使用信用卡时，
当我们发邮件时，
当我们在键盘上敲入字符时，
当我们打电话，发短讯，走过监视器时，
都会产生数据。数据挖掘就是从这些原始数据中
找出有用的信息，
以预测将来可能发生的事，
在现实中运用这些预测。
我举个例子吧，
假如你正站在超市的收银台前，
收银机记录下了你买的每一样东西，
然后，你出示优惠卡， 

English: 
they'll give you 2% off, 
and you'll give them your name and address 
and access to all sorts of other data, 
demographic data, about people like you, and so on.
You've got lots of bargains today; it's
been pretty good.
Thanks to those coupons they sent you in the mail
last week, you've been able to stock up 
on some items that you wouldn't normally 
buy, but you've bought today, because 
you got some money off. And next week, 
they'll send you some more coupons. 
They'll take this data, 
they'll analyze it, they'll include data from thousands or 
millions of people like you. 
They'll do little experiments, 
to find out: if they reduce the price of an item just a little bit,
are you going to buy more of it?
These coupons are a mechanism for 
individual prices—prices set just for you. 
Everyone benefits: you get a bargain, 
and everyone loves a bargain; the supermarket 
sells more stuff. And it's all thanks to 

Chinese: 
得到2%的折扣。
你提供姓名，地址，各种各样的资料，
和一些关于你的私人信息，等等。
你今天得到了实惠，
听起来很不错。
这得感谢商家上星期通过邮件发给你的优惠券。
你买了些通常不会买的东西， 
因为今天打折扣。
下星期，商家还会发给你更多的优惠券。
商家们收集这些
和你类似的成千上万的人的数据，
进行分析和小规模的实验。
如果他们对某件商品降点价，
试探你是否会多买点？
那些优惠券上的价格是专为你制定的。
看起来各方都受益。你享受了实惠，人人喜欢实惠。
商家卖出更多的东西。这一切得归功于数据挖掘。

English: 
data mining. This MOOC is called Data Mining with Weka. 
Let me tell you what a weka is. 
A weka, actually, is a little bird, 
like its better-known relative, the kiwi, 
found only in the islands of New Zealand. 
Flightless. About the size of a duck, actually. 
I don't know if you can see any ducks 
in the picture, but it's just about the size of those 
ducks out there on the lake. 
In our case, Weka is 
a toolkit, a data mining toolkit, a work bench. 
It's an acronym for Waikato Environment for Knowledge 
Analysis. It was produced here at the 
University of Waikato. We've had a machine-
learning project going on here for 
over 20 years now. We do research 
on machine learning, and one of the
outcomes of that research 
is this Weka workbench. A lot of 
people are starting to take data mining very 
seriously. You've heard about big data; 
you might have heard a lot about 
metadata recently, and what you might learn 
from metadata.
A lot of people find data mining mysterious. The real 

Chinese: 
这是门关于数据挖掘的开放式网络课程（MOOC）。
现在，我来说说什么是weka？
weka其实是一种鸟。
就象它那有名的kiwi鸟亲戚，
仅生活在新西兰的岛屿上。不会飞，象鸭子那么大。
不知你能否在图像上看到鸭子。
它就和我背后湖里的鸭子一样大。
但我们这里说的weka
是一个用于数据挖掘的工具包。
它是Waikato Environment for Knowledge Analysis的缩写。
是由Waikato大学研发的。
我们的机器学习课题已进行了20多年，
weka就是成果之一。
现在，越来越多的人开始重视数据挖掘。
你已知道海量数据，
你或许也知道些元数据，
和可从元数据中提取的信息。
许多人认为数据挖掘很神秘。

English: 
aim of this course is to take the mystery out of data mining, 
to get you some practical 
experience actually using the Weka toolkit
to do some mining on the data sets that we provide, 
to set you up so that later on, you can use Weka to 
work on your own data sets and do your own data mining. 
It doesn't involve any programming or 
anything like that. You're going to be using the 
tools that we provide, 
the Weka tools. It might 
help to know a little bit of 
elementary statistics, like means, 
variances, standard deviations, and so on. 
You might see a couple of mathematical 
formulas, but I'll explain those, so 
don't worry about that. You don't really 
need any specific mathematical background. 
The course is going to involve a number of 
short 5-10 minute videos, like this. 
Each one will be followed by a practical activity. 
You're going to be doing something 
on your computer using the Weka workbench.
Weka is free, open source software. 
It runs on anything: Windows, Mac, Linux. 
There will be some short videos followed by an activity 

Chinese: 
这门课向你揭开其神秘面纱。
你将用weka工具和我们提供的数据集
学到一些实用的数据挖掘方法。
这样，你将来可将weka运用到你自己的数据集
和数据挖掘问题中。
这门课不涉及编程。
你将学习如何使用weka。
知道些统计的基础知识，
象平均值，方差，标准差，等会有些帮助。
有可能你会遇到些数学公式。但我会解释他们，所以不用担心。
你不需要具备特别的数学知识。
这门课将提供些象这个一样的5-10分钟的视频。
每个视频后有个练习。
你将用安装在你计算机上的weka来完成。
weka是免费的开源软件。
它可在任何操作系统上运行--Windows，Mac，和Linux。
每个练习后可能还会有一段5-10分钟的视频。

English: 
that might take another 5-10 minutes. 
We call that a lesson, about 
15-20 minutes worth of work.
There are six lessons in each class; 
there are five classes altogether. 
About one class a week is the kind of 
rate at which we would expect you to take this. 
You're going to be doing about three hours of work a week 
for about five weeks. To take part in this course, 
you need a computer, of course, with an 
internet connection; all of the videos are on 
YouTube. You need a little bit of time. 
You need a Google account to access these things, 
and you need some motivation and 
interest in the subject. Associated with 
the course is a textbook, 
called Data Mining. 
It's a really good book on 
data mining. I know that, because I 
wrote it myself with a couple of friends. 
The publisher, Morgan Kaufmann, has kindly agreed to 
give people on the course free access 
to large chunks of this textbook online. 

Chinese: 
我们把这些叫做一节课。
一节课大概要15-20分钟
每段有6节课，总共有5段。
我们希望你每星期能完成一段。
这大约是每星期3小时的工作量。
一共持续5个星期。学习这门课，
你需要一台电脑，能上网（所有的视频在优酷上）。
你需要时间，
你需要个Google的帐户，
你需要动机和兴趣
这门课使用的教科书叫
“数据挖掘”
这是一本非常好的关于数据挖掘的书。
我知道它非常好，因为这是我和我的朋友们写的。
这本书的出版商Morgan Kaufmann
非常慷慨地允许在线阅读大部分章节。

English: 
In order to get your certificate of completion, 
there are a couple of assessments, one in 
the middle of the course and one at the end. 
If you do sufficiently well on those, you'll be 
getting an official 
certificate of completion from the University of Waikato.
That's it—Data Mining with Weka, coming soon to a computer 
near you. I'm looking forward 
to it, and I hope to see you there. 
Bye for now!

Chinese: 
为了拿到结业证书，
你必须完成一个期中和一个期末的测试。
如果你取得令人满意的成绩，
你将获得
由Waikato大学颁发的结业证书。
“Weka在数据挖掘中的运用”的开放式网络课程即将开学，
期待您的加入。
再见。
