>> Welcome to episode 3 of
the four-part series where I gave
a talk at the Toronto AI User Group.
There's a ton of amazing content
in there, so make sure you tune in.
[MUSIC]
>> Let's talk about data science
and the loop of sadness.
This is our data science
loop of sadness.
So when they ask me how do
I know which one to use,
I say, "Try them."
"What if it doesn't work?",
"Try a different one."
"How long does it take?"
"Go ask your data scientists
whose door is locked right now."
It depends on the usage.
They were nice to me at Microsoft,
they let me have this machine,
it has an RTX 2080 in there,
I know, I can run MNIST
at a second per epoch.
All of a sudden,
I see all these hungry eyes and
not for the pizza. Holy cow.
So this is the bit that people
don't understand about development,
and if you look at data science code
from professional data scientists,
you can almost see where they
changed their mind, in the code.
That's why people complain about
data scientists' code,
I say, "Slow down.
Data scientists don't write
code to put into production,
they write code to prove that a
model will work, so step off."
So let's talk about another thing.
What I did though, in this case,
is I forced the digits to
all be on a single line,
because I forced it to be a
linear model, did you see that?
But it turns out that
there's some relationships
here that are actually quite useful,
like this way that it's losing.
So how do we do that?
Well, let's talk about
filters and pooling.
So a filter is actually
a very interesting operation
that you would think
doesn't do anything.
So for example, here is
what this filter does.
Basically, we're going to
do this box right here,
we're going to dot-product
it with this box,
and then we're going to
put the answer here.
Then we're going to do it again here,
and then we're going
to do it again here,
and then we're going to do it again.
I'm just explaining
what this is doing,
and I'm not telling you why.
I'm just telling you [inaudible]
I'll show you why in a second.
So we get four numbers,
but this is not good because
it destroys the image.
We're losing pixels around the edges.
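That sliding dot product can be sketched in plain NumPy. This is a toy "valid" convolution on a made-up 4-by-4 image (strictly speaking a cross-correlation, since the filter isn't flipped), not code from the talk:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image, dot-product each patch."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i:i+kh, j:j+kw]
            out[i, j] = np.sum(patch * kernel)  # "dot-product it with this box"
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((3, 3))
result = conv2d_valid(image, kernel)
print(result.shape)  # (2, 2) -- a 4x4 image and a 3x3 filter give only four numbers
```

Notice the output is smaller than the input, which is exactly the edge-pixel loss described above.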
So you can do this
thing called padding.
It's called same padding,
by the way, if you're using it.
Now when we do it,
notice that we get
the exact same amount
of pixels back that we want.
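A sketch of what same padding does, assuming zero padding, which is the usual default:

```python
import numpy as np

def conv2d_same(image, kernel):
    """Zero-pad the image so the output is the same size as the input."""
    kh, kw = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)))  # zeros around the border
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i+kh, j:j+kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
out = conv2d_same(image, np.ones((3, 3)))
print(out.shape)  # (4, 4) -- the same number of pixels back
```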
Cool. Let's talk about
pooling and stride.
What if we want to
make the thing smaller
after we've done the
little filterings?
So basically, this
is called pooling.
We're basically taking this box
and we're putting the
biggest number in here.
We'll take this box and we'll put
the biggest number in here,
and then we'll do that again.
Stride has to do with how
much do you move over.
So are you going to move over
just one pixel at a time?
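Max pooling and stride together, as a toy NumPy sketch (the values here are made up, not from the slides):

```python
import numpy as np

def max_pool(image, size=2, stride=2):
    """Take the biggest number in each size-by-size box, moving over by `stride`."""
    oh = (image.shape[0] - size) // stride + 1
    ow = (image.shape[1] - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            r, c = i * stride, j * stride
            out[i, j] = image[r:r+size, c:c+size].max()  # biggest number in the box
    return out

image = np.array([[1., 2., 5., 6.],
                  [3., 4., 7., 8.],
                  [0., 1., 2., 3.],
                  [9., 1., 0., 4.]])
print(max_pool(image))  # [[4. 8.] [9. 4.]] -- the image is now half the size
```

With stride 1 instead of 2 the boxes would overlap and the output would shrink by only one pixel per side.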
Now you might be wondering
what does this actually
do? I will show you.
So basically, I wrote
a crop and resize,
I wrote a convolution in NumPy,
because everyone has all
sorts of spare time.
Then I also wrote a
pooling that takes
the max out of it, max pooling.
So by the way, that
filter is also called a
convolution, I said that too fast.
Now here is the same one,
remember how we did 1, 1, 1,
0, 0, 0, -1, -1, -1?
I decided to try that one
out to see what happened,
and what I did is I took
some important pictures
that mattered to me,
and I'm going to go
ahead and try to run,
these are important pictures, you
saw the wedding photo and thought, mine.
Look what happens when I do this
convolution over this image.
How? What did it do?
It found the edges without
me doing anything.
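That 1, 1, 1 / 0, 0, 0 / -1, -1, -1 filter responds wherever a bright region sits above a dark one, which is why it finds horizontal edges. Here's a toy sketch on a synthetic image (bright top, dark bottom) rather than the photos from the talk:

```python
import numpy as np

# The filter from the talk: 1s on top, 0s in the middle, -1s on the bottom.
edge_filter = np.array([[ 1.,  1.,  1.],
                        [ 0.,  0.,  0.],
                        [-1., -1., -1.]])

# A tiny synthetic "photo": bright top half, dark bottom half.
image = np.zeros((6, 6))
image[:3, :] = 1.0

def conv2d_valid(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

response = conv2d_valid(image, edge_filter)
print(response)  # nonzero only in the rows straddling the bright/dark boundary
```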
What happens when I pool it?
Boom, it enhances them.
So you know how back
in the day when you'd
watch CSI Miami and they're like,
zoom, enhance, and you're
like, that's stupid.
It turns out you can kind of
do that now, right?
But that's what
we're doing here.
What this is doing is
enhancing an aspect of the image.
Let's do it with another picture.
Again these are important
pictures to me of the kids,
and let me do one that
actually makes sense, a fence.
That's crazy, that that
convolution is doing that,
it's basically creating
a new picture for you.
It turns out that when you
look at this operation,
it's a different form
of the dot product,
it's just rolled out differently.
So when you make these things,
you can actually create
these convolutions,
which are matrices, that go over
the image and are making
a new image for you.
They're basically making
new images for you.
Now, what images is it making?
I don't know, it's making the images
that make the
optimization work better.
So it will choose these for
you and so if you do
this style of thing,
you get 99 percent on
MNIST, or something higher.
You're basically doing
two convolutions,
a pooling and a ReLU over them,
and then a softmax at the end.
But you always unroll it,
the computer will always unroll
the image so that it does
something like this.
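The unrolling being described is often called im2col: every patch becomes a row of a matrix, and the whole convolution collapses into a single matrix product, which is the "different form of the dot product" from a moment ago. A minimal sketch (im2col is the common name for this trick, not a term from the talk):

```python
import numpy as np

def im2col(image, kh, kw):
    """Unroll every kh-by-kw patch of the image into a row of a matrix."""
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    cols = np.zeros((oh * ow, kh * kw))
    for i in range(oh):
        for j in range(ow):
            cols[i * ow + j] = image[i:i+kh, j:j+kw].ravel()
    return cols

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[ 1.,  1.,  1.],
                   [ 0.,  0.,  0.],
                   [-1., -1., -1.]])

# One matrix product now does every sliding dot product at once.
unrolled = im2col(image, 3, 3) @ kernel.ravel()
print(unrolled.reshape(2, 2))  # every patch of this ramp image gives -24
```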
So these are called
convolutional neural networks,
because what these convolutions do,
especially with images,
is generate
very complex images.
You've seen them,
where at the bottom,
it's like it found this
edge and then up in
a layer it's like it
found a nose, what?
Because it's basically generating
new images as it goes along.
Now the problem I have with this
is it seems awfully magical,
like the computer is learning about noses,
but it isn't, these things are not intelligent.
I had a guy argue with me
this is what he told me,
he was trying to argue for
AI rights and I was like,
"It's a computer."
"Seth you're talking like we
used to talk about animals."
It's a bunch of numbers
in a freaking network,
it has no feelings and no thoughts,
like you're thinking about,
"What about the ones
that generate art, Seth?"
They're just transposes of
these things, because to
generate images forward
you use what's called a transpose convolution,
which generates a bigger image.
I didn't even know how to respond,
I was like, "Oh,
do your computers have names? Mine do."
I name mine after Tolkien
characters, incidentally.
The Cloud though, how
am I doing on time?
Oh, I'm doing pretty good.
The Cloud though, how do
we do this in the Cloud?
Well, here's the thing,
this is the way we do it
on Azure Machine Learning,
but look, the thing about
Azure Machine Learning
that I like is that,
you don't have to use all or nothing.
You can use it just to register
your models, and that's it.
Do you version the
models that the machine
learning algorithms
produce? You should.
Usually as data
scientists we're like,
"Hey, I made a new
model," "Where is it?"
"It's on the Share,"
and then you have to go
home and take a long shower
because you know you
put that on the Share,
people are looking at
that file you put there,
and that's a problem.
So you can use it to
register and manage models,
you can use it to have
images of your things,
you can use it to deploy,
you can use it to prepare data,
you can use any of these,
whatever you want, just use what
you want, you use what you like.
If this stuff benefits you,
great, if it doesn't,
great, tell me why and I'll go
tell them at the home office,
"Hey, this stinks because Bill at
the Toronto AI User Group said this
doesn't work for me because of
this," and we want to fix it.
Oh, I went too far. So if
you take a look at this,
this is our Azure
Machine Learning Studio
where we do a bunch of stuff.
Now if you're a data scientist,
I'll give you example,
MSR once gave me some code,
this is Microsoft Research,
that generated novel pictures of
birds from texts that you write,
this is a bluebird and it would
create an image of a bluebird,
it did some cool GAN stuff I
didn't know about at the time.
But as I looked at it,
I realized he was using like
a really weird version of
PyTorch on Python 2.6.
Any of you that have
had any experience
playing with Python on your machine,
it's a dumpster fire.
Maybe I just suck at it.
If any of you are like, "No Seth,
come afterwards and
educate me," seriously,
because I've had a lot of problems.
On top of that, I've had a lot of
problems because I use Windows,
just leave it there,
just let it hang out
there, it's getting better.
So he gives me this
thing and I'm like,
"How am I ever going to run this?"
I thought I'll just do the container,
do you use containers when you
do your machine learning stuff?
It's nice. It sucks right now and
I keep fighting the OS people,
you need to give me GPU
passthrough on my container.
>> [inaudible].
>> It's partly our fault too.
I keep telling them that,
they're actually working on WSL,
Windows Subsystem for
Linux, GPU passthrough.
So they are going to work on that,
so that would be super nice.
But I just wanted straight
Docker to let me hit the GPU,
because if you've ever
run stuff without a GPU,
you have nothing to warm
your hands by in the dead
of winter, it's running.
So basically here's the thing,
and the reason why I bring
this up is because there are
certain things that we should
just share as data scientists,
like what datasets are we using,
in Azure Machine Learning you can
create datasets and version them.
So that means that you're like,
"Wow, how come your models better?"
"Well, I'm using
version 4 of the dataset."
You can keep track
of your experiments,
what is an experiment?
It's a training run.
Not only that, you can rerun
it exactly the same way
anybody else ran it,
which is really cool.
You can also create pipelines,
for those of you who are
familiar with Kubeflow,
it's exactly the same, but
we add a layer of SDK over it,
where you don't have to do container
ops and you don't have to
know about Docker,
you're just, "Hey, here's my code and
here's my environment, do it."
We have this notion
of models where you
can store and version your models,
and we have these
notions of endpoints.
There's two kinds of
endpoints for us,
there is the endpoints for
inference and there's the endpoints
for rerunning Machine
Learning Pipelines,
why would you do that?
Well, because of data drift.
What if you wanted to
retrain your models every
week and rerun a pipeline?
You'd basically just call the thing.
The cool thing about these pipelines
is they will retain information
from what they've done before.
So if the parameters don't change,
it won't rerun that
section of the code.
You know that this
can be very costly,
if you're rerunning
a step and it's
converting two terabytes of images
into one terabyte of TFRecords,
you don't want to
run that every time.
So it's able to know that if
the parameters don't change then
these things stay the same,
it can do that, and that's
what the endpoints are for.
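That caching behavior is the same idea as memoizing a pipeline step on its parameters. Here's a toy sketch of the idea in plain Python; this is just an illustration, not the Azure Machine Learning SDK:

```python
import hashlib
import json

_cache = {}  # maps (step name, parameter hash) -> cached result
runs = []    # records which steps actually executed

def run_step(name, params, fn):
    """Re-run a pipeline step only if its parameters changed."""
    digest = hashlib.sha256(json.dumps(params, sort_keys=True).encode()).hexdigest()
    key = (name, digest)
    if key not in _cache:
        runs.append(name)        # the expensive work happens here
        _cache[key] = fn(params)
    return _cache[key]

def convert(params):
    return f"converted at scale {params['scale']}"

run_step("convert-images", {"scale": 0.5}, convert)
run_step("convert-images", {"scale": 0.5}, convert)  # same params: cache hit, no re-run
run_step("convert-images", {"scale": 0.9}, convert)  # params changed: runs again
print(runs)  # the expensive step only ran twice
```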
Now here are the other
things for managing.
We have shared compute
and shared datasets.
You're probably wondering what
is the difference between
a datastore and a dataset?
Great question.
It's like the difference
between a hard drive and
a folder for data science.
In this analogy, the dataset is the folder.
The datastore for us is
basically Azure Storage,
you can just dump everything
there and you can run
these algorithms and it's
all happening in the Cloud.
Here's the cool thing
that I didn't tell you,
when you run the experiments,
it will mount Azure Storage as if
it was part of the hard drive,
it's crazy, which is really cool.
So now I'm to the part
where I'm going to answer
any questions you have
about what I've told
you so far on Azure Machine
Learning or anything.
So are there any questions?
We good? Thanks everybody.
[MUSIC]
