[MUSIC PLAYING]
DALE MARKOWITZ: Hey, I'm Dale.
I'm an engineer who works
on AI on Google Cloud.
TORRY YANG: And I'm Torry.
I'm also a developer
on Google Cloud AI.
DALE MARKOWITZ: Thanks
for coming into our talk.
So if you work at Google,
being in AI or machine learning
is a really good field
because sometimes it
seems like it's the
thing that everybody's
talking about everywhere.
When you say you
work on ML, people
think about the Google Assistant
or Waymo self-driving cars,
or AlphaGo, which was our
algorithm that beat the world
champion at the board game Go.
But, even besides these
very high profile projects,
machine learning is
in almost every app
within Google from translation
to photos to Gmail.
And if you just look at the
number of internal Google
repositories that
use machine learning
code over the past five years,
it's absolutely skyrocketed.
TORRY YANG: What about
AI beyond Google?
Well, for a long time
there wasn't as much
machine learning
outside of Google,
because machine learning
is a complicated field.
It often requires specialization
and possibly advanced degrees.
And frankly, there are few
people with such specialty.
But there are a lot
of developers like us.
Not to mention, there's
the hardware aspect
and the devops component of it.
Even if you find all
the right people,
you have to worry
about where to put
these compute
intensive workloads.
At Google, we want
to democratize machine learning.
We don't want to
gatekeep developers
from using machine learning
in their applications simply
because they don't have the
training or the machines
to do this.
That's why the topic
today is machine learning
without data science expertise.
DALE MARKOWITZ: But
first, let's talk
about what machine
learning actually is.
The way that I like
to think about it
is just finding
patterns in data, which
might sound pretty simple.
But you can actually build
lots of awesome applications
with that concept.
Like for example, the algorithm
that recommends you music
based on your past
listening history,
or maybe you want to forecast
trends or build an AlphaGo,
a board game playing algorithm.
Or maybe you want to do a task
called classification, which
I'll focus on because it's
sort of the canonical machine
learning application.
Classification is essentially
just labeling things.
Maybe you have pictures
of cats and dogs
and you want to
assign them labels.
And the way that you normally
do this is roughly the same.
You always build a model.
And that model usually starts
with a labeled training data set.
So you'll have lots of
pictures of all sorts
of different breeds
of cats and all sorts
of different breeds of dogs.
And you want to have
as diverse a data set
as possible because the
more your model knows
about the world the more
accurate it will be.
You feed this into some
machine learning algorithm
and out pops a model.
And, if you've done a good
job and if you have a lot of data,
hopefully when you show that
model a new picture of a dog
it's able to correctly
label it as such.
But if machine
learning truly is, as I
say, finding patterns in
data, that means you probably
kind of need data.
TORRY YANG: Well, what
if you don't have data?
Does that mean you
can't use the model?
Well, if you don't
have data, you can--
Google has got your back.
You can use one of
Google's models.
Here's a handful of APIs
that we have that you can use.
I'm going to quickly talk
about one or two of them.
And then we'll jump into
some examples and demos.
So there's Cloud Speech-to-Text,
which converts
speech audio into text.
And there's Text-to-Speech,
which does the inverse,
converting text to audio.
And then there's also
Cloud Translation,
which basically exposes
Google Translate as an API.
Now let's jump into
an example right away.
So Cloud Natural
Language, what does it do?
It allows you to analyze text.
So here we have a couple
of news article titles.
And we want to make
some sense out of them.
So let's hop into the
demo we have here.
Could we switch to the--
oh, there we go.
All right, so the first
demo I'm going to show you
is actually the landing page
of the Natural Language API.
So there's a bunch of
content you can read.
But what we're really
interested in here
is the demo that's
embedded in the web page.
We've copied and pasted
the first news article.
And let's click analyze
to see what we get.
And I'm not a robot.
All right, so it returns
this information organized by entities.
And you know, there are some,
like internet accounts,
that it considers "other."
Models it considers
consumer goods.
And there's like [INAUDIBLE]
considered numbers.
You can also take a
look at sentiment.
So the sentiment
in this sentence
is generally pretty positive.
You can also take a look
at the syntax if you want.
But what we're really interested
in today is the categories.
So if you look at
this, it kind of
corresponds with computers and
electronics, which makes sense.
Can we get back
to the slide doc?
All right, so if we look at the
first news article, computers
and electronics.
But if you pass the
next two articles
into the demo I
showed you, you'll
get these two other categories.
And all of this can be
done using an HTTP POST
request, which really
simplifies your life.
You don't have to go
back to this web page
every time you want
to get information
about your news article titles.
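The transcript doesn't include the actual request, but here's a rough sketch of the JSON body you would POST to the Natural Language API's documents:classifyText method. The article title below is a made-up example, not one from the demo.

```python
import json

def build_classify_request(text):
    """Build the JSON body for a classifyText POST request."""
    return {
        "document": {
            "type": "PLAIN_TEXT",
            "content": text,
        }
    }

payload = build_classify_request(
    "Google announces new machine learning tools for developers")
print(json.dumps(payload, indent=2))
```

You would send this body to the API endpoint with your credentials attached and get back a list of categories with confidence scores.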
And, in the same way,
you can use a similar REST
API to analyze images.
So here we have some images
about ice cream or dogs.
If you were to let Cloud
Vision analyze these images,
it would tell you well,
they're ice cream and dogs.
It's pretty cool.
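To make that concrete, here's an illustrative response shaped like what Cloud Vision's images:annotate method returns for label detection; the labels and scores are invented, not real API output.

```python
# A made-up response in the shape of a Vision API label-detection result.
sample_response = {
    "responses": [{
        "labelAnnotations": [
            {"description": "Dog", "score": 0.98},
            {"description": "Ice cream", "score": 0.95},
        ]
    }]
}

# Pull out (label, confidence) pairs from the first response.
labels = [(ann["description"], ann["score"])
          for ann in sample_response["responses"][0]["labelAnnotations"]]
print(labels)  # [('Dog', 0.98), ('Ice cream', 0.95)]
```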
DALE MARKOWITZ: The
Video Intelligence API
is like the other
APIs Torry mentioned,
but it analyzes video.
So for example, it
can track objects
and put them in bounding boxes.
It can transcribe anything
that was said in a video.
It can also detect
explicit content.
And to show you an example of
using this tool rather than
the curl request that
Torry showed you,
I'll show you how to
do this in JavaScript.
It's quite concise.
You actually just call the
Video Intelligence client,
and then you upload your
video to Google Cloud,
and you pass the Video
Intelligence client the link
to that video in the cloud.
Then you'll also pass it the
features that you want to use.
So, in this case, I'm like
OK, can you label this video.
So tell me all of the different
objects that you find.
And then when the API
returns, it will give me
data in the form
of, oh, there was
a dog that appeared
between seconds 3 and 5.
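The JavaScript itself isn't in the transcript, so here's a rough Python sketch of reading one label annotation out of a response shaped like the Video Intelligence API's; the entity and timestamps are made-up examples.

```python
def parse_offset(offset):
    """Convert a duration string like '3.500s' into float seconds."""
    return float(offset.rstrip("s"))

# One invented label annotation in roughly the API's response shape.
annotation = {
    "entity": {"description": "dog"},
    "segments": [
        {"segment": {"startTimeOffset": "3s", "endTimeOffset": "5s"}}
    ],
}

for seg in annotation["segments"]:
    start = parse_offset(seg["segment"]["startTimeOffset"])
    end = parse_offset(seg["segment"]["endTimeOffset"])
    print(f"{annotation['entity']['description']}: {start:g}s to {end:g}s")
```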
But rather than
talk about this, let
me just show you how you
can use it to build an app.
This one is a shout out
to my colleague Zach.
Thank you for building this.
Can we go to the laptop, please?
So here's our Video
Intelligence app.
And we use the Video
Intelligence API to analyze
all sorts of data about videos.
So I'll tell you what
it does and then I'll
show you because of the audio.
So first we transcribe
the entire video.
And we're also able to
track objects in real time.
So let me show you
a demo of this.
[VIDEO PLAYBACK]
- Today we're at Starbucks with
Google's Made W/ Code to teach
girls how to code a lovely
emoji onto our hot chocolate.
- I'm drawing in the emoji
into [INAUDIBLE] code.
- If you want to go
into computer science,
you be that person
to decide that.
Don't let anyone say no to you.
[END PLAYBACK]
DALE MARKOWITZ: And that's
the Video Intelligence API.
Let's head back to the slides.
TORRY YANG: All right, so we've
seen a bunch of off-the-shelf
APIs.
But what about custom tasks?
Let me explain what
I mean by a custom task.
So we've seen, given
a picture of a cat,
you give it to the
Cloud Vision API,
and you get back this response.
That's really great.
But what if, given the
same image of a cat,
you want to give it to a
mystery API and you
want to get back this
result: Maine Coon.
Well, this is specific
to your use case.
This is not something the vision
API anticipated you might need.
So, in this case,
you're going to need
to train your own
machine learning model.
Well, it turns out this
is a fairly long process,
with approximately this
number of steps, right?
I'll quickly walk through
some of these steps.
But don't worry if it seems
complicated, because it is.
All right, so first you have
to gather and preprocess
all this data and group it
into these different cat breeds.
And then you have to split the
data set up into a training set
and evaluation set.
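Here's a minimal sketch of that split step. The 80/20 ratio and file names are arbitrary choices for illustration, not anything the process requires.

```python
import random

def train_eval_split(examples, eval_fraction=0.2, seed=42):
    """Shuffle examples and split off an evaluation set."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_eval = int(len(shuffled) * eval_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]

# 100 hypothetical cat photos, split 80/20.
photos = [f"maine_coon_{i:03d}.jpg" for i in range(100)]
train_set, eval_set = train_eval_split(photos)
print(len(train_set), len(eval_set))  # 80 20
```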
And then you have to build
your TensorFlow or Keras model.
And then you have
to train the model,
and then you have to
evaluate the model.
And, finally, with
the model, you
have to deploy it somewhere,
maybe in the cloud,
maybe on your phone,
just so you can
get to the final step
of making a prediction,
just so you can upload
a photo into the API
and get the result you want.
Let's just say
Maine Coon, right?
So in reality, you just
want to get from step one
to step seven as
fast as possible
without all of the
intermediate steps.
Well, have you all seen this ad?
It's a chore, right?
So you want to
make Google do it.
Make Google build
your model for you.
That's where Cloud
AutoML comes in.
What is Cloud AutoML?
Cloud AutoML allows
you to bring your data
and give it to AutoML.
AutoML will take care
of training, deploying,
and serving your model.
And all you have to worry
about is querying the REST API.
And, if we go back to the seven
step process I showed you,
it really simplifies
the whole process.
You get your problem.
You get your data, and
you make a prediction.
All those intermediate
steps, Cloud AutoML
will solve for you.
DALE MARKOWITZ:
Today we're going
to build our own custom
AutoML model to do
a task called object detection.
That's when not only do we
want to know what's in a photo,
but we actually want to
know exactly where it is.
We want to find
the bounding box.
And today we're going to build
a model that identifies clothing
just like we've done on Sundar.
So how do we do it?
First, we need to find a labeled
data set like Torry mentioned.
And, to do that, we
went to Kaggle, which
is a website for data
science competitions
that also happens to host lots
of awesome free public data
sets.
And so we found almost exactly
the data set that we needed,
a clothing item detection data set.
So it had all
pictures like this.
And the human beings had
labeled pants, shirt, et cetera.
So this really helped
us get started.
But there was a problem because
these photos are a little bit
unnatural.
Everybody's just standing
on this white background.
And so, when we tried
it on ourselves in real settings,
the model didn't work as well.
So, to fix this problem,
we augmented our data
set by actually
uploading our own photos
and outlining boxes
around our clothing.
And we did that for a while.
But then, to make the
process even faster,
we sent our photos
to Google's data
labeling service, which is a
feature that has Google
annotate your data for you.
So, to see what
this looks like, you
write a set of
instructions in a PDF
of how you want your
data to be labeled.
So you'll say, for
example, I'm going
to give you all these photos,
put a box around my jacket.
But, if I'm wearing a shirt
underneath the jacket,
don't include that.
So you send your
instructions to raters.
And the results that you get
back look something like this.
I'm sorry, I just dumped
this huge spreadsheet on you.
But basically, it's what
we found in the photo
and where the box that
surrounds it is located.
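The rows in that spreadsheet look roughly like an object-detection import CSV: the data split, the image path, the label, and then normalized box corner coordinates. The bucket path, labels, and numbers below are invented for illustration, and the exact column layout is an assumption about the format.

```python
import csv
import io

# Two made-up annotation rows: split, image path, label,
# x_min, y_min, then x_max, y_max (normalized 0-1 coordinates),
# with blank columns in between.
rows = """\
TRAIN,gs://example-bucket/photo1.jpg,jacket,0.12,0.05,,,0.68,0.92,,
TRAIN,gs://example-bucket/photo1.jpg,jeans,0.20,0.55,,,0.60,1.00,,
"""

boxes = []
for row in csv.reader(io.StringIO(rows)):
    label = row[2]
    x_min, y_min = float(row[3]), float(row[4])
    x_max, y_max = float(row[7]), float(row[8])
    boxes.append((label, x_min, y_min, x_max, y_max))
print(boxes)
```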
Now let me show you
how we actually go
from that to building a model.
Let's hop to the laptop.
So here I am in the
AutoML Vision UI.
Now there are two different
types of models we could train.
One just labels
a photo cat or dog.
But object detection, like I
was telling you about, actually
puts the box around the items.
So our first step is to
upload our data set, which
I kind of showed you.
Just to give you
a closer peek, I'm
going to expedite
that a little bit.
So here are some of our data.
I'm going to hop into a
photo of our colleague Paige.
And if we wanted to label
this data set ourselves,
we actually just
click and drag around
the thing we want to label.
And then we say this is a top.
And then we save it.
Great.
Now, definitely
preparing your data set
is the hardest part of
building a model with AutoML.
Training is actually
pretty simple.
We just hop into the Train tab.
We click train new
model, and then we
can select different
features like do
we want a more accurate
model or a faster model.
And then we click
Start Training,
and then we'll
get a message that
says something to the tune of
we're training a model for you.
Check back in four hours.
And then in four hours,
you get an email that
says take a look at your model.
In this evaluation tab, we can
see how well the model did.
So most of the time,
it will be better
at detecting some clothing
items than others.
So let's just investigate
how it did on t-shirts.
There are lots of
statistics that
are useful to evaluate
your model here.
But I'll just show you
the most useful feature
for debugging, I
think, which is seeing
the examples your model labeled
correctly and incorrectly.
So these true positives are
correctly labeled as t-shirts.
But here are some
false positives.
So it thought these were
t-shirts when they're not.
To be honest, I think
it's kind of hard to tell
the difference between--
whatever.
But this will help
you debug your model
and see where you
need more data.
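Two of the statistics on that evaluation tab can be computed directly from those correct and incorrect examples. The counts below are made up for illustration.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Precision: of the boxes the model called t-shirts, how many were?
    Recall: of the real t-shirts, how many did the model find?"""
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# e.g. 18 boxes correctly labeled t-shirt, 4 wrongly labeled,
# 2 real t-shirts missed (invented counts).
p, r = precision_recall(true_pos=18, false_pos=4, false_neg=2)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.82 recall=0.90
```

High precision with low recall suggests the model is cautious but misses items, which usually means it needs more training examples of that class.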
And to actually
use that model, you
can test it out within the UI.
So I'll just show you uploading
a mirror selfie of myself.
That's me.
And there you go.
You can see it's never
seen this photo before.
But it can kind of
guess where my jacket is
and where my jeans are.
Now, in addition to just
trying this out in the tool,
you can call this
model the same way
we did the other APIs,
either through a curl request
or in a number of
client libraries.
So Google handles
all the complications
of actually hosting
your model in the cloud.
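Here's a sketch of what the JSON body for such a prediction call might look like: the image is base64-encoded and wrapped in a payload object. The model path is a placeholder and the "image bytes" are fake, so treat this as a shape illustration rather than a working request.

```python
import base64
import json

# Placeholder model path, not a real deployed model.
MODEL = "projects/my-project/locations/us-central1/models/MODEL_ID"

def build_predict_request(image_bytes):
    """Wrap base64-encoded image bytes in a predict request body."""
    return {
        "payload": {
            "image": {
                "imageBytes": base64.b64encode(image_bytes).decode("ascii")
            }
        }
    }

payload = build_predict_request(b"\x89PNG...fake image data")
print(json.dumps(payload)[:60], "...")
```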
So let's hop back
onto the slides.
So what is actually
going on under the hood
when AutoML builds
that model for you?
Now this talk is machine
learning without data science
expertise.
So this is unnecessary.
You can cover your ears.
But in case you're just
curious to know what's
going on under the
hood, well, our photos
are being classified
using something
called a neural network.
But the first step before we can
feed our data into an algorithm
is that we have to go from
a photograph to numbers.
So suppose we have
a photograph of a cat.
Well, that photograph
is actually
made up, if it's digital,
of pixel values--
red, green, blue, which
correspond to numbers.
So, using those pixel values,
we can convert our photograph
into data by making
it into three
arrays of red, green,
and blue pixel values.
Then that is the
input to this which is
a diagram of a neural network.
And, admittedly, I'm going
to sort of go over this
very quickly.
There are other great
talks that talk about it.
But, basically, what's
happening is those pixel values
are being added, multiplied,
combined, scaled, and then
passed to a higher layer, and
then to another higher layer.
And eventually, a number comes
out that says 80% dog, 20% cat.
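A toy version of that computation looks like this: a few pixel values, one layer of made-up weights per class, and a softmax to turn the scores into probabilities. Real networks have many layers and learn their weights from training examples rather than having them written by hand.

```python
import math

pixels = [0.1, 0.8, 0.7, 0.2]            # a flattened 2x2 "image"
weights = {
    "dog": [0.9, -0.4, 0.3, 0.1],        # invented weights
    "cat": [-0.5, 0.6, -0.2, 0.4],
}

# Multiply, add, and combine the pixel values per class.
scores = {label: sum(w * p for w, p in zip(ws, pixels))
          for label, ws in weights.items()}

# Softmax: exponentiate and normalize so the outputs sum to 1.
total = sum(math.exp(s) for s in scores.values())
probs = {label: math.exp(s) / total for label, s in scores.items()}
print({label: round(p, 2) for label, p in probs.items()})
# {'dog': 0.41, 'cat': 0.59}
```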
And, at first, the
model doesn't know
how to combine those numbers.
But as you show it more
and more training examples,
it becomes more accurate.
If you're interested
in this, I definitely
encourage you to explore more.
But, normally, training
a model to do this
from scratch, as
Torry said,
takes a lot of data, probably
more data than we had,
which was something
like under 200 examples.
But luckily, when
you use AutoML,
you're not using just your data.
You're actually
using Google's models
that have been trained
on lots more data.
So you can have a very
high quality model
even if you don't have that
much data to work with.
And these neural network
technologies power lots
of Google machine
learning products
like Speech-to-Text
transcription, translation,
and object recognition.
AutoML also supports lots
of different data types
besides just images.
So let's talk about that.
TORRY YANG: So, yeah, one of our
new products is AutoML Tables.
It was just freshly
announced last month
at Google Cloud Next.
And what does AutoML Tables do?
Well, it deals with
structured tabular data.
So let's just say
there's this table here
and you have strings,
timestamps, numbers, classes.
And what you really want is just
to predict the target column.
So this is actually a
data set of a marketplace.
And people are selling
things in the marketplace.
We want to predict how much
those items are sold for.
Well, it turns out
there's a data science
competition for just this very
specific task I described.
And before I tell you how
AutoML Tables performed
in that competition,
we actually graphed
the results of every contestant
on this visualization here.
On the x-axis, we
have the ranking.
So the more towards the right,
the better the team did.
And then on the y-axis,
we have the error.
So the closer to the bottom of
the chart, the lower the error.
So, if you think about
it, the better contestants
have lower errors,
which makes sense,
and the worst contestants
have higher errors.
And here's how AutoML did.
In just an hour,
AutoML Tables was
able to surpass the
median competitor.
And given 11 more hours,
or 23 more hours,
AutoML can do better.
And what's interesting, and
something I want to point out,
is that all three of these,
one hour, 12 hours, 24 hours,
are on the plateau, which I
think is where the data
science enthusiasts and data
science professionals hang out.
So that's really
cool for AutoML Tables
to be able to
achieve such results.
DALE MARKOWITZ: So Torry just
told you about tabular data.
But what about text, which is an
extremely common type of data?
For that, we can
build a custom model
using AutoML Natural Language.
And there are actually
lots of different types
of natural language
models that we can build.
So we could build a
classification model.
So maybe we have US
congressional bills
and maybe we want to categorize
them by what they're about.
We can build a custom
model to do that
with AutoML Natural Language.
And then, at Next last
month, we actually
released two new types of
models that we can build.
The first one is
custom sentiment.
So maybe you remember that Torry
showed you the Natural Language
API could take a sentence
and tell you whether it
was positive or negative.
But if you train your
own custom model,
then it can learn very
domain-specific things,
like that waiting
on the tarmac is bad
but having lots of
legroom is great.
You can also build a custom
entity extraction model.
So, again, out of the box,
the API can take a string
and tell you what the entities are.
For example, it knows
that I, Dale, am a person
and a cappuccino
is a consumer good.
And it doesn't really
know what a croissant is.
But if I wanted to make a model
that really understood foods,
I could do that, and maybe say
that cappuccino is a beverage
and identify that
croissant as a food.
And, to do that, you create
a data set much like the one
I showed you for
object detection.
But to do this, you actually
take strings of text
and you label them inline,
and you say this is a food,
this is a beverage, and so on.
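One way to picture those inline labels is as character offsets into the text plus a label per span. This schema is illustrative, an assumption for the sketch rather than the exact import format.

```python
# A hand-labeled training example: each annotation marks a span of
# the text (by character offsets) with a label.
example = {
    "text": "I ordered a cappuccino and a croissant.",
    "annotations": [
        {"start": 12, "end": 22, "label": "beverage"},
        {"start": 29, "end": 38, "label": "food"},
    ],
}

for ann in example["annotations"]:
    span = example["text"][ann["start"]:ann["end"]]
    print(f"{span!r} -> {ann['label']}")
# 'cappuccino' -> beverage
# 'croissant' -> food
```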
TORRY YANG: So let's review.
Today, we showed you that,
if you don't have data,
you can use one of
Google Cloud AI's APIs.
But if you have data and you
have your own custom use case,
you can use Cloud AutoML.
And we showed you how
to build custom models
with different data types.
Here are some links
for you to visit.
DALE MARKOWITZ: We
hope what you take away
from this is that you can
play with machine learning
while focusing on
building your app
and not on building your
model because you can get
started with these
tools really quickly,
probably in under an hour.
And you can build
awesome things with them.
So thanks for coming to listen.
We'll hang out here to answer
questions right now I guess.
TORRY YANG: All
right, thanks, y'all.
[MUSIC PLAYING]
