
♪ (music) ♪
Hello, everyone. Welcome back.
I'm Jeremiah and this is Andrew.
We are here from the TensorFlow Hub team.
We are based in Zurich, Switzerland,
and we're excited 
to share TensorFlow Hub today.
So this first slide is actually one 
that I stole-- I took it from a colleague,
Noah Fiedel, who leads TensorFlow Serving,
and Noah uses this slide 
to tell a personal story,
and it kind of shows the growth of things,
the type of tools that we use
to do software engineering.
And it shows how they mature over time.
And he also connects this 
to a similar thing happening with
the tools we use to do machine learning.
And he draws these connections.

We're rediscovering things
as we grow our machine learning tools,
things like the machine learning 
equivalent of source control,
and the machine learning equivalent
of continuous integration.
And Noah also makes 
this really interesting observation
that this growth 
on the machine learning side
is lagging behind 
the software engineering side
by 15 or 20 years.
So this creates 
a really interesting opportunity, right?
We can look at software engineering.
We can look at some of the things 
that have happened there,
and think about what kind of impact 
they may have on machine learning.
And so looking at software engineering,
there's something that's so fundamental,
it's almost easy to skip over,
and that's this idea of sharing code,
shared repositories.
On the surface, this makes us 
immediately more productive.
We can search for code.
We can download it.
We can use it.
But it has these really powerful 
second-order effects.

This actually changes 
the way we write code.
We refactor our code.
We put it in libraries.
We share those libraries,
and this really makes people
even more productive,
and it's the same dynamic
that we want to create 
for machine learning
with TensorFlow Hub.
TensorFlow Hub lets you build, 
share and use
pieces of machine learning.
So why is this important?
Well, anyone who's done 
machine learning from scratch knows
you need a lot to do it well.
You need an algorithm; you need data.
You need compute power and expertise,
and if you're missing any of these,
you're out of luck.
So TensorFlow Hub lets you distill 
all these things down
into a reusable package we call a module.
Those modules go in TensorFlow Hub

where people can discover them
and then easily use them.
So you'll notice I'm saying module
instead of model.
It turns out that a model is 
a little bit too big
to really encourage sharing.
If you have a model, 
you can use that model
if you have the exact inputs it wants
and you expect 
the exact outputs it provides.
If there's any little differences,
you're kind of out of luck.
So modules are a smaller piece.
If you think of a model
like a binary,
think of a module like a library.
So on the inside, a module 
is actually a SavedModel,
so this lets us package up 
the algorithm in the form of a graph,
lets us package up the weights.
You can do things like initialization
and use assets.
And our libraries make it very easy 
to instantiate these
in your TensorFlow code.
So you can compose these 
in interesting ways.
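
As a rough sketch of what that instantiation looks like with the TF 1.x hub.Module API (the nnlm text-embedding URL is a published tfhub.dev module; the example sentences are made up):

```python
import tensorflow as tf
import tensorflow_hub as hub

# Instantiating a module downloads its SavedModel (graph, trained weights,
# assets) and splices it into the current TensorFlow graph.
embed = hub.Module("https://tfhub.dev/google/nnlm-en-dim128/1")

# A module is then used like an ordinary function: tensors in, tensors out.
embeddings = embed(["The quick brown fox", "TensorFlow Hub modules compose"])

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    print(sess.run(embeddings).shape)  # (2, 128)
```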

This makes things very reusable.
You can produce one 
of these and share it.
These are also retrainable.
Once you patch it into
your bigger program,
you can backpropagate
through it just like normal.
And this is really powerful
because if you do happen
to have enough data,
you can customize the TF Hub module
for your own application.
And to tell us a little bit more
about some of those applications,
I'll hand it over to Andrew.
Thanks, Jeremiah.
Let's look at a specific example
of using a TF Hub module
for image retraining.
Let's say that we're going to make an app
that can classify 
rabbit breeds from photos.
The problem is we only have
a couple hundred examples,
probably not enough to train
an entire image classification model
from scratch.
But what we could do is start 
from an existing general purpose
image classification model.
Most of the high performing ones
these days are trained
on millions of examples
and they can easily classify
thousands of categories.
So we want to reuse the architecture
and the trained weights of that model
without the classification layers.
And in that way, we can add
our own rabbit classifier on top,
and we can train it 
on our own rabbit examples
and keep the reused weights fixed.
So since we're using TensorFlow Hub,
our first stop is tensorflow.org/hub
where we can find a list
of all the newly released state-of-the-art
image modules, as well as the well-known ones.
Some of them include 
the classification layers,
and some of them remove them
just providing a feature vector as output.
So we'll choose one of the feature
vector ones for this case.
Let's use NASNet, which is
a state-of-the-art image module
that was created
by neural architecture search.
So you just paste the URL
of a module,
and TensorFlow Hub takes care 
of downloading the graph
and all of its weights
and importing it into your model.
And in that one line, you're ready 
to use the module like any function
so here we just provide 
a batch of inputs,
and we get back our feature vectors.
We add a classification layer on top,
and we output our predictions.
But in that one line, 
you get a huge amount of value.
In this particular case, 
more than 62,000 hours of GPU time
went into finding the best architecture
for NASNet and training the result.
And all of the expertise, research, 
testing, that the authors put into that--
that's all built into the module.
Plus that module can be fine-tuned 
with your model,
so if you have enough examples
you can potentially get better performance
if you use a low learning rate,
if you set the trainable parameter
to true and if you use 
the training version of the graph.
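
Here is a minimal sketch of that setup, assuming the published NASNet-large feature-vector module on tfhub.dev; the placeholder shapes and the rabbit-breed count are made up for illustration:

```python
import tensorflow as tf
import tensorflow_hub as hub

# Reuse NASNet as a fixed feature extractor; only the new classifier on top
# gets trained. To fine-tune the reused weights instead, pass trainable=True
# and tags={"train"}, and use a low learning rate so you don't ruin them.
module_url = "https://tfhub.dev/google/imagenet/nasnet_large/feature_vector/1"
module = hub.Module(module_url)  # add trainable=True, tags={"train"} to fine-tune

height, width = hub.get_expected_image_size(module)
images = tf.placeholder(tf.float32, [None, height, width, 3])  # batch of photos

features = module(images)                       # [batch_size, feature_dim]
NUM_BREEDS = 5                                  # hypothetical rabbit-breed count
logits = tf.layers.dense(features, NUM_BREEDS)  # our own classification layer
predictions = tf.nn.softmax(logits)
```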

So NASNet is available in a large size 
as well as a mobile-sized module,
and then there's also
the new progressive NASNet,
and then there are a number
of new MobileNet modules
for doing on-device image classification
as well as some industry standard ones
like Inception and ResNet.
That complete list is 
at tensorflow.org/hub.
And of course all those modules 
are pre-trained
using the TF-Slim checkpoints,
and they're ready 
to be used for classification
or as feature vector inputs 
to your own model.
Okay, let's look at another example--
in this case, doing 
a little bit of text classification.
So we'd like to know whether 
a restaurant review is
a positive or negative sentiment.
So as Jeremiah mentioned,
one of the great things 
about TensorFlow Hub
is that it packages 
the graph with the data.
For our text and embedding modules,
that means that all 
of the preprocessing is included,
doing things like normalizing 
and tokenizing operations.

So we can use 
a pre-trained sentence embedding module
to map a full sentence 
to an embedding vector.
So if we want to classify 
some restaurant reviews,
then we just take one 
of those sentence embedding modules,
we add our own classification layer on top
and then we train with our reviews
while we keep 
the sentence module's weights fixed.
And just like for the image modules,
tensorflow.org/hub lists a number
of different text modules.
We have neural network language models
that are trained on Google News
for English, Japanese, German and Spanish.
We also have Word2vec models 
trained on Wikipedia,
as well as a new module 
called ELMo that models
the characteristics of word use
and how those uses vary across contexts.
And then there's also something 
really new, as in today.

You may have seen a new paper 
this morning from Ray Kurzweil's team.
This is the Universal Sentence Encoder.
It's a sentence-level embedding module.
It's trained on a variety of tasks,
and it enables a variety of tasks,
in other words, universal.
So some of the things that it's good for
are semantic similarity,
custom text classification,
clustering and semantic search.
But the best part about it 
is how little training data
is actually required
to adapt the module to your own problem.
So that sounds great 
in our particular case.
Let's try it 
on the restaurant review task.
So we just paste that URL from the paper,
and like before TensorFlow Hub downloads
the module and inserts it into your graph.
But this time, we're using 
the text embedding column.
That way we can feed it into an estimator
to do the classification part.
And just like 
with the image retraining example,
this module can be fine-tuned
with your model 
by setting trainable to true.
Of course, you have 
to lower the learning rate
so that you don't ruin 
the existing weights that are in there,
but it's something that's worth exploring
if you have enough data.
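
Putting those pieces together, the classifier side might look roughly like this sketch; the Universal Sentence Encoder URL is the one from the talk, while the feature key, toy reviews, and estimator settings are illustrative assumptions:

```python
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

# The module handles the preprocessing (normalizing, tokenizing) and returns
# a sentence-level embedding; set trainable=True (with a low learning rate)
# to fine-tune it along with the classifier.
review_column = hub.text_embedding_column(
    key="review",
    module_spec="https://tfhub.dev/google/universal-sentence-encoder/1",
    trainable=False)

estimator = tf.estimator.DNNClassifier(
    hidden_units=[64],
    feature_columns=[review_column],
    n_classes=2)  # positive vs. negative sentiment

# Toy stand-ins for a real labeled review dataset.
reviews = np.array(["Great food and friendly service!", "Cold soup, rude staff."])
labels = np.array([1, 0])
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    {"review": reviews}, labels, shuffle=True, num_epochs=None)

estimator.train(input_fn=train_input_fn, steps=100)
```
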
Now let's take a closer look 
at that URL.
As Jeremiah mentioned,
a module is a program,
so make sure what you're executing
is from a location that you trust.
In this case, the module 
is from tfhub.dev
so that's our new source
for Google-provided modules
like NASNet 
and the Universal Sentence Encoder.
Over time, we would like to create 
a place where you can also publish
the modules that you create.
So in this case, Google is the publisher,
and Universal Sentence Encoder 
is the name of the module.
And finally, the version number is one.
So TensorFlow Hub considers modules
to be immutable,
that way you don't have to worry 
about the weights changing
between training sessions.
So that module URL 
and all of the module URLs
on tfhub.dev include a version number.
And you can take that URL
and paste it into your browser
and see the complete documentation
for any module that's hosted on tfhub.dev.
Here's the particular one 
for the Universal Sentence Encoder.
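
For reference, the parts of that URL break down like this (a sketch; the constant name is just for illustration):

```python
# Anatomy of a tfhub.dev module URL: https://tfhub.dev/<publisher>/<name>/<version>
#   publisher -> "google"
#   name      -> "universal-sentence-encoder"
#   version   -> "1"  (modules are immutable, so a version always pins the same weights)
# The same URL, pasted into a browser, shows the module's documentation page.
MODULE_URL = "https://tfhub.dev/google/universal-sentence-encoder/1"
```
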
Then we also have modules
for other domains
besides text classification
and image retraining,
like a generative image module
that contains a progressive GAN
that was trained on [inaudible].
And we added another module
that was based 
on the Deep Local Features (DELF) network,
which can identify the key points
of landmark images.
Both of those have great Colab notebooks
that are available on tensorflow.org/hub.
In fact, the images here 
were created from them.

And we're going to be adding
more modules
for tasks like audio and video
over the next few months.
But most importantly,
we're really excited to see
what you build with TensorFlow Hub.
Use the hashtag #tfhub
when you build notebooks
and share modules.
And visit tensorflow.org/hub
for links to our documentation,
including examples with tutorials,
interactive notebooks and code labs
and our new discussion mailing list.
So from Jeremiah, me and everyone
on our team in Zurich,
I want to thank you so much.
(applause)
♪ (music) ♪
