
English: 
♪ (intro music) ♪
Hello, everybody.
(applause)
My name is Mustafa.
Today I'll talk about high level APIs.
But I'll keep practitioners
like you in mind.
So to keep the practitioners in mind,
we'll define an example
where you can improve user happiness
by the power of Machine Learning.
And after defining our example project,
we'll use Pre-made Estimators
to start our first experiment.
Then we'll experiment more
with every feature we have
using Feature Columns, and we'll introduce
a couple more Pre-made Estimators
that you can experiment with.
And we'll learn how you can experiment
with other modeling ideas too.
So there are a lot of topics
we will cover in this talk.
And we'll also talk about
how you can scale it up

and serving so that you can use it
in your products.
Let's talk about the Estimators.
So it's a library that lets you focus
on your experiment.
There are thousands of engineers,
this is not a small number,
and hundreds of projects at Google
that use Estimators.
So we learn a lot from their experiments.
And we created our APIs
so that the time from an idea
to an experiment will be
as short as possible.
And I'm really happy to share
all this experience with all of you.
Whatever we are using internally
at Google
is the same as the open-source one.
So you all have the same things.
Estimators keep the model function.
We'll talk about
what the model function is later.
But it defines your network,
how you train,
and what the behavior is
during evaluation
or at export time.

And it provides you some loops
such as training, evaluation
and it provides you some interface
to integrate with TF Serving.
Also Estimators keep sessions
so that you don't need to learn
what tf.Session is.
It handles it for you.
But you need to provide data
and as Derek mentioned,
you can return a tf.Dataset
from your input function.
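As a rough mental model of what that input function does, here is a plain-Python sketch that just yields (features, labels) batches; the hike rows and batch size are made up, and a real input function would build and return a tf.data.Dataset instead:

```python
def input_fn(rows, batch_size=2):
    """Toy stand-in for an Estimator input_fn: yields (features, labels) batches."""
    batch_features, batch_labels = [], []
    for features, label in rows:
        batch_features.append(features)
        batch_labels.append(label)
        if len(batch_features) == batch_size:
            yield batch_features, batch_labels
            batch_features, batch_labels = [], []
    if batch_features:  # emit the final, possibly partial, batch
        yield batch_features, batch_labels

# Invented hike examples: features dict plus a liked/not-liked label.
rows = [({"hike_id": "mission-peak"}, 1),
        ({"hike_id": "half-dome"}, 1),
        ({"hike_id": "parking-lot-loop"}, 0)]
batches = list(input_fn(rows))
```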
So let's define our project
and start with our experiment.
I love hiking; this is one of the pictures
I took on one of my hikes,
and let's imagine there is a website,
hiking website similar to IMDB
but it's for hiking
and that website has information
for each hike
and users are labeling those hikes
by saying,
"I like this hike."
"I don't like this hike."
"This is my rating."
And other stuff.
And we want to use this data,
let's imagine you have this data

from that website
to recommend hikes for users.
How can we do that?
There are many ways of doing it.
Let's define one way
that Machine Learning can help us.
In this case, we want to predict
probability of a like,
whether a given user will like
a given hike or not.
What you have, you have hike features
and user features.
And what can you learn from?
You learn from the labeled data;
in this case, that website
has whether users liked
a hike or not.
So what can we use to predict
probability of a like?
You can use one of the Pre-made Estimators,
in this case the DNNClassifier,
because you can think of this
as a binary classification problem.
We designed Pre-made Estimators
so that they fit well
for these kinds of problems,
which means you can use them
as a black-box solution.

Pre-made estimators are surprisingly
popular within Google
and in many projects.
Why are that many engineers using
pre-made solutions
instead of building their own models?
I think first of all, it works.
It handles many implementation details
so you can focus on your experiment.
It has reasonable defaults
for initialization, partitioning,
or optimization
so that you have a reasonable baseline
as quick as possible.
And it is easy to experiment
with many features.
We'll see that you can experiment
with all of your data
by using the same estimator
without changing it.
So let's jump into our first experiment
so that we can have a baseline
that we will improve.
I will talk about the embedding column
later, but in this case, let's say
you are using the hike ID.

It may be a hike name
or an identifier, as a single feature
to your model.
And let's say you instantiate
the DNNClassifier
with hidden units of one.
So what this will learn,
it will learn roughly the average label
for each hike ID,
so that may be a good baseline
for your overall progress.
You need to say
what your evaluation data is,
what your training data is,
then you can call train and evaluate.
Just by these couple of lines, of course,
you should be able to experiment
and you can see the results
on TensorBoard.
For example you can see training
and evaluation loss
or how the accuracy metric is moving.
So since this is a classification problem
you will see accuracy metric.
And since this is binary classification,
you will see
Area Under Curve for Precision-Recall.
So all of these things are free
and ready to be used.
Let's experiment more.
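To make the baseline concrete: a model that only sees the hike ID can essentially memorize a score per hike, roughly the average label. A minimal pure-Python sketch of that idea (the hike IDs and labels here are invented):

```python
from collections import defaultdict

def fit_baseline(examples):
    """Learn the mean label (like rate) per hike ID -- roughly what a
    tiny network over a single hike_id feature can capture."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for hike_id, label in examples:
        totals[hike_id] += label
        counts[hike_id] += 1
    return {h: totals[h] / counts[h] for h in totals}

# Made-up training data: (hike_id, liked?) pairs.
train = [("mission-peak", 1), ("mission-peak", 1),
         ("mission-peak", 0), ("flat-loop", 0)]
baseline = fit_baseline(train)
```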

Let's start with the data,
experimenting with the data itself.
We designed feature columns
with the same mindset:
we want to
make it easy to experiment
with your features, with your data.
And based on our experience,
internal experience,
it reduces the lines of code
and may improve the model quality.
There are a bunch of transformations
you can handle via feature columns.
These are bucketing, crossing,
hashing, and embedding.
Each of these needs a careful explanation.
Unfortunately, I don't have
enough time here
but you can check Magnus' tutorial
and [Jess' media].
They are really good.
Let's experiment
with all the hike features we have.
Each hike may have tags
such as kid-friendly,
dog-friendly, birding.
You may choose indicator column
instead of embedding column in this case

because you don't have
a huge number of tags
so you don't need
to reduce the dimensionality.
And for a numerical column,
such as the elevation gain
each hike may have,
you need to normalize
so that optimization
will be well-conditioned,
your problem will be well-conditioned.
And you can use a normalizer function here.
Or you may choose bucketizing,
in this case, the distance of a hike
we bucketize it
so that it will help the model
to learn different things
for different segments.
You can consider it
as a different kind of normalization too.
How can you use
all of these things together?
Just put them into a list.
That's it; then your system
should work as is.
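To illustrate what bucketizing and normalization do to a numeric feature, here is a small pure-Python sketch; the bucket boundaries, mean, and standard deviation are made-up examples, not values from the talk:

```python
import bisect

def bucketize(value, boundaries):
    """Index of the bucket `value` falls into, like a bucketized column.
    len(boundaries) boundaries give len(boundaries) + 1 buckets."""
    return bisect.bisect_right(boundaries, value)

def normalize(value, mean, std):
    """Z-score normalization, the kind of thing a normalizer function does."""
    return (value - mean) / std

# Hypothetical hike distances in km, split at 5 km and 15 km.
short_bucket = bucketize(3.0, [5.0, 15.0])   # falls in bucket 0
mid_bucket = bucketize(8.0, [5.0, 15.0])     # falls in bucket 1
z = normalize(600.0, mean=500.0, std=100.0)  # one std above the mean
```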
So let's experiment with personalization.
What we mean by personalization
is instead of recommending
the same hikes to all users,
let's recommend different hikes
for different users
based on their interests.

And one way to do that
is using user features.
In this case, we are using user embedding
by embedding column.
So this will let the model learn
a vector for each user
and put the users closer together
if their hike preferences are similar.
And how can you use that?
Again, you just append it to your list.
And you also need to play with
the number of hidden units,
because you have many more features now
and you need to let your model
learn different transformations.
And the rest of the pipeline should work.
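The intuition behind the user embedding: users with similar hike preferences end up with nearby vectors. A tiny pure-Python sketch with hand-made 3-dimensional embeddings (the users and values are invented, not learned):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical learned user embeddings.
user_vecs = {
    "alice": [0.9, 0.1, 0.0],   # likes steep hikes
    "bob":   [0.8, 0.2, 0.1],   # tastes similar to alice
    "carol": [0.0, 0.1, 0.9],   # prefers flat walks
}
sim_alice_bob = cosine(user_vecs["alice"], user_vecs["bob"])
sim_alice_carol = cosine(user_vecs["alice"], user_vecs["carol"])
```

Users with similar preferences (alice and bob) score higher similarity than dissimilar ones.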
You will hear this sentence a lot
during this talk,
because that is the whole idea.
Rest of the pipeline should work
and you should be able
to analyze your experiments.
Let's experiment more.
We have a couple of pre-made solutions;
as I mentioned, they're very popular.
And I picked two of them
to show here.

One of them is Wide-n-Deep.
It's joint training of a linear model
and a Neural Network.
It may help your product or it may not.
You need to experiment.
So let's start our experiment.
You need to define what are the features
you want to feed to Neural Network.
Again, we have feature column,
here's the list.
And you need to define
what are the features
you need to feed into linear part.
In this case, I intentionally put user ID
and tags crossing so that,
for example, if a user always picks
dog-friendly hikes,
the model will explicitly learn
via this feature, via this cross.
And you can instantiate the
DNNLinearCombinedClassifier
instead of the DNNClassifier.
And the rest of the pipeline should work.
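The user-ID-times-tags cross mentioned above maps each (user, tag) pair into its own bucket. A deterministic pure-Python sketch of that hashing idea (the bucket count and IDs are made up):

```python
import hashlib

def crossed_bucket(user_id, tag, num_buckets=100):
    """Deterministically hash the (user_id, tag) pair into one of
    num_buckets buckets -- roughly what a crossed feature column does."""
    key = f"{user_id}_X_{tag}".encode()
    return int(hashlib.md5(key).hexdigest(), 16) % num_buckets

# The same pair always lands in the same bucket, so the model can
# learn a weight specific to, e.g., this user picking dog-friendly hikes.
b1 = crossed_bucket("user42", "dog-friendly")
b2 = crossed_bucket("user42", "dog-friendly")
```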
Based on the Kaggle survey,
2017 Kaggle survey,
trees are extremely popular.

And we are introducing
Gradient Boosted Trees
as a pre-made estimator.
This means you can experiment
with Neural Network
and Gradient Boosted Trees
without changing your pipeline.
Let's start with our experimentation
with Gradient Boosted Trees.
In the current version,
we only support bucketized column
and we are working on supporting
numerical columns
and categorical columns too.
Here we use hike distance
and hike elevation gain
and we bucketize them
and then you can instantiate
BoostedTreesClassifier.
And the rest of the pipeline should work.
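As a caricature of what a tree learns from bucketized inputs, here is a pure-Python one-split "stump" that picks the bucket threshold that best separates the labels; the data is invented:

```python
def best_stump(examples):
    """Pick the single bucket threshold that best separates the labels --
    a one-node caricature of a tree over a bucketized feature."""
    def accuracy(threshold):
        correct = 0
        for bucket, label in examples:
            pred = 1 if bucket >= threshold else 0
            correct += (pred == label)
        return correct / len(examples)
    thresholds = sorted({b for b, _ in examples})
    return max(thresholds, key=accuracy)

# Made-up (bucketized distance, liked?) pairs: longer hikes are liked.
data = [(0, 0), (0, 0), (1, 0), (2, 1), (3, 1), (3, 1)]
split = best_stump(data)
```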
We know that training trees is not
as computationally expensive
as training Neural Networks.
And many datasets fit into memory.
So by leveraging that,
we provide you a utility

so that you can train your model
more than an Order of Magnitude faster
than the usual one.
And the rest of the pipeline should work.
So let's say these
pre-made solutions
are not enough for you
and you want to experiment with more ideas.
Let's talk about that.
Before diving into these
high-level solutions that you can use,
let's look at the anatomy of the
Feedforward network
in a supervised setting.
In this case you have a network
into which you'll feed the features,
and based on the output of network
and the labels, you need to decide
what is the loss, what is the objective
you want to minimize
and what are the metrics
that you will use as a success metric
for your evaluation.
And your predictions at serving time
may be different than at training time;
for example, if you have a large
multi-class setting,

you may want to use
just a ranking of the classes
instead of the probabilities
at serving time.
For that, you don't need
to calculate probabilities.
You can use the logits
out of the network to rank them.
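The reason ranking by logits is safe: the sigmoid is monotonic, so sorting by logits and sorting by probabilities give the same order. A quick pure-Python check with made-up logits:

```python
import math

def sigmoid(x):
    """Convert a logit into a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical per-hike logits from a network.
logits = {"hike-a": 2.1, "hike-b": -0.3, "hike-c": 0.7}
by_logit = sorted(logits, key=lambda h: logits[h], reverse=True)
by_prob = sorted(logits, key=lambda h: sigmoid(logits[h]), reverse=True)
```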
So for all of these decisions,
we abstracted them out under the Head API.
The Head API expects you to give it
the labels and the output of your network,
and it provides all of these things for you.
We'll see in action.
And a model function is an implementation
of this head and network together.
We'll talk about DNNClassifier.
DNNClassifier has a model function,
which is specifically an implementation
of a head and network.
Let's implement this with the Head API,
in this case instead of DNNClassifier
we use DNNEstimator.
And we can instantiate a head; in this case

it's a binary classification head
because we are trying to predict
whether it is liked or not liked.
And why are we introducing this head,
since these two lines do the same
as the DNNClassifier?
So that you can experiment many more ideas
by combining different
network architectures
and different heads.
For example, you can use Wide-n-Deep
with the Poisson regression head
or the DNNEstimator
with a multi-label head.
You can even combine
different heads together,
we introduce multi-head here.
So it's one way of experimenting
with multi-task learning:
with a couple of lines of code,
you can experiment
with multi-task learning.
Please check it out.
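The gist of multi-head training is optimizing a (possibly weighted) sum of per-head losses. A pure-Python sketch with two hypothetical heads ("liked" and "completed") and made-up predictions:

```python
import math

def log_loss(prob, label):
    """Binary cross-entropy for a single prediction."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))

def multi_head_loss(head_losses, weights):
    """Combine per-head losses into one training objective,
    the essence of multi-task learning with heads."""
    return sum(w * l for l, w in zip(head_losses, weights))

# Head 1: did the user like the hike?  Head 2: did they complete it?
loss_like = log_loss(0.8, 1)   # liked, model predicted 0.8
loss_done = log_loss(0.4, 0)   # not completed, model predicted 0.4
total = multi_head_loss([loss_like, loss_done], weights=[1.0, 0.5])
```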
Let's say these pre-made architectures
are not enough for you
and you want to experiment more,
you can write your own model function.
We strongly recommend using
TF Keras Layers
to build your network.

So you can do whatever you want,
you can be as creative as possible.
And after you have the output of network,
you need to pick one
of the optimizers available
in TensorFlow
and you can use one of the heads
that we mentioned,
which will convert your network output
and the labels
into training behavior
or evaluation metrics
or export behaviors.
Then you can feed this model function
to the estimator;
again, the rest of the pipeline
should work.
Keras Model is another way
of creating your model.
And it's very popular,
it's very intuitive to use.
For example, this is one
of the Keras Model you can build.
So how can you get the estimator
so that the rest
of the pipeline should work?
You can use model_to_estimator,
which gives you the estimator,

so you can run your experiments
without changing your pipeline.
Transfer Learning
is another popular technique
we do experiment with.
One way of using it is taking model A,
which is already trained,
to improve the predictions of model B.
How can you do that?
Surprisingly, just copying
and transferring the variables
from model A to model B works.
That's simple but it works.
And we provide a utility for you:
warm-starting.
For example, this one line
says to transfer all of model A
into model B,
or you can define a subset of model A
to transfer from model A to model B.
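Conceptually, warm-starting is just copying (a subset of) model A's variables into model B by name. A pure-Python sketch where the models are dictionaries and the variable names are invented:

```python
def warm_start(target, source, prefix=""):
    """Copy variables from model A (source) into model B (target) by name.
    `prefix` selects a subset, e.g. only the embedding variables."""
    for name, value in source.items():
        if name.startswith(prefix):
            target[name] = value
    return target

# Hypothetical variable maps for a trained model A and a fresh model B.
model_a = {"embed/user": [0.1, 0.2], "dense/w": [1.0], "dense/b": [0.0]}
model_b = {"embed/user": [0.0, 0.0], "dense/w": [9.9], "dense/b": [9.9]}
warm_start(model_b, model_a, prefix="embed/")  # transfer only the embeddings
```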
Let's talk about image features,
we talked about embedding,
categorical column and numerical features
but what if you have image features?

How can you use them in your pipeline?
You could implement one of the
state-of-the-art image classifiers,
which is not a couple of lines
of code, of course.
Or, thanks to TF-Hub,
which you will learn a lot about later,
you can use this one line,
just one line from the hub,
to instantiate the feature column
which is called
the image embedding column.
In this case, you may remember
Jeff mentioned OptiML.
NASNet is one of the OptiML models
and it's really good;
it's one of the state-of-the-art models
you can use.
We will use NASNet as a featurizer
which means it will use only 
the output layer of NASNet
as a feature in your pipeline.
How can you use that
in the DNNClassifier we talked about?
Same, you just append it
to your feature columns, done.

Then you can experiment with it.
So let's say you experimented
and you found some models,
but you need to scale it up.
Not all of you, but some of you
may need to scale your training.
So you can use multi-GPU,
which means replicating your model
on different GPUs.
Igor will talk about that after my talk,
The key point here is that
you don't need to change the estimator
or you don't need
to change the model code.
Everything should work
in just a single line
of configuration change.
Or you may want to distribute
your training to multiple machines
by saying these are workers,
these are parameter servers,
and there's one evaluator going on.
Same, you don't need to change
your estimator or model code.
Everything should work
based on the configuration.
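For illustration, a cluster of the kind described (workers, parameter servers, an evaluator) is usually expressed as JSON in the TF_CONFIG environment variable; the hosts below are placeholders, and the exact layout TensorFlow expects may differ slightly:

```python
import json

# Hypothetical TF_CONFIG for one worker in a small cluster:
# two workers, one parameter server, one evaluator.
tf_config = json.dumps({
    "cluster": {
        "worker": ["worker0.example:2222", "worker1.example:2222"],
        "ps": ["ps0.example:2222"],
        "evaluator": ["eval0.example:2222"],
    },
    "task": {"type": "worker", "index": 0},
})
parsed = json.loads(tf_config)
```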
Or you may want to use
TPUEstimator or TPU.
In this case, there's a minimal change
in the model function,
hopefully later we will fix that too
but now there's a minimal change

that you need to do in your model function.
To use this in production,
you need to export
and serve your model.
And we recommend using TF Serving.
At serving time, instead of reading
data from files, you have a request.
And you need to define
the receiver function,
which defines how you connect
that request to the model,
and, after that,
what the output of that model will be,
which is defined by the signature definition.
So here, again,
a couple of lines of code.
You will export your trained model
for TF Serving.
For example, if your request
is a tf.Example proto,
you can use this utility function
to get your receiver function.
And then you can use export_savedmodel
so that it can be used by TF Serving.

These are the modules I mentioned,
tf.estimator, feature column,
TF Keras, contrib.estimator,
contrib.feature column.
Please do not use tf.contrib.learn,
we are deprecating it.
And these are a couple of talks
and videos I picked.
You can check it out.
Thank you, I hope some of you
will improve your products
with the tools that we mentioned.
(greeting in foreign language)
(applause)
♪ (ending music) ♪
