- Coming up, we'll look at the new
Azure Machine Learning service
that gives data scientists
the capabilities
they need across the
machine learning lifecycle,
within a familiar notebook experience.
Now we're gonna show you
how Azure ML makes it easier
for data scientists to build
reproducible experiments
with machine learning pipelines
and communicate operational dependencies
to their engineering counterparts
as part of a new MLOps
approach as you deploy
to the Cloud and the edge at scale.
And how you can use the newly introduced
Automated Machine Learning
capabilities in Azure ML
to build machine learning models
in a fraction of the time.
(upbeat music)
So, I'm joined today by Chris Lauren
from the Azure Machine Learning team.
Welcome.
- Thanks, I'm excited to be on the show.
- So, Azure Machine
Learning is a new service,
but at Microsoft we've been
in this space for a while.
In the past on Mechanics for example,
we've looked at things like
environments for data prep
and experimentation and
analysis using things
like Azure Databricks as
well as various frameworks.
But how do we think about
Azure Machine Learning,
this service and really what's
new and what's different?
- In the past if you wanted
to use machine learning
in production settings, you
needed to bring together
a bunch of different services to support
the full machine learning lifecycle.
For example bringing together storage
like Azure Blob Storage or
Azure Data Lake Storage,
because you can't train
machine learning models without data.
Bringing in some compute like
individual virtual machines
or a Spark cluster using HDInsight
or Azure Databricks to run your code.
And then to protect your
data for enterprise readiness
you'd bring in your virtual networks
or configure your compute
and data inside the same
virtual network and bring
in some Azure Key Vault
to manage and secure your
credentials for example.
And then if you wanted to
repeat your experiments
using a consistent set of
machine learning libraries
and the different versions
thereof, then you'd create
Docker containers and use
Azure Container Registry
to store those Docker
containers and then put
that inside your VNET
and then to run all that
at scale you might use
Azure Kubernetes Service
and include that inside your VNET.
- Right and this all sounds
like a whole lot of stuff
to piece together to
get the machine learning
and all the models and everything to work.
- Yeah, it is a lot to manage
and you spend more time
managing all that stuff
than actually doing the data
science work, so with Azure
Machine Learning Service,
we remove that complexity for you.
As a managed service it
comes with its own compute,
hosted notebooks and capabilities
for model management,
version control and model
reproducibility built right in.
You can layer that on top of
your existing Azure services.
For example you can plug
in the compute and storage
that you already have as well
as your other infrastructure services.
Azure Machine Learning
connects and orchestrates them
within a single environment so
that you have one end-to-end
modular platform for your entire
machine learning lifecycle
as you prepare data, build, train, package
and deploy your machine learning models.
- And as you mentioned
then, for the data scientist
there's nothing to worry
about there in that case.
You're basically getting
the services and everything
configured for them and
not having to stitch
all these individual services together.
- Right, but the best
part is you don't need
to learn how to use new tools.
You can use Jupyter Notebooks,
your favorite Python editor
like VS Code or PyCharm,
and Azure Machine Learning
works with any Python
machine learning framework library
like TensorFlow, PyTorch,
scikit-learn, and more.
- This is awesome in terms
of making that lifecycle
a lot shorter, in terms
of getting everything
up and running but for
all the Cloud operators
and engineers running that
or maybe tenant admins,
how do we ensure that our data scientists
aren't consuming too much resource?
- Yeah, because it is an Azure service
we have integrated
role-based access control
built right in that you
can fully customize
in any way you like.
So for example, you can
add a new data science role
so you can grant access
to a data scientist
and you can simply allow
them to run experiments
but not create or edit
compute for example,
so they can do their job
without running up your bills
in a way that's not approved.
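One way to express the role Chris describes is an Azure custom-role definition. This is an illustrative sketch only: the role name and scope are made up, and the exact Actions/NotActions operation strings should be verified against the current Microsoft.MachineLearningServices operations list before use.

```json
{
  "Name": "Data Scientist (experiments only)",
  "Description": "Can run experiments but cannot create or edit compute.",
  "Actions": [
    "Microsoft.MachineLearningServices/workspaces/*/read",
    "Microsoft.MachineLearningServices/workspaces/experiments/*"
  ],
  "NotActions": [
    "Microsoft.MachineLearningServices/workspaces/computes/write",
    "Microsoft.MachineLearningServices/workspaces/computes/delete"
  ],
  "AssignableScopes": [ "/subscriptions/<subscription-id>" ]
}
```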
- Right, and the cool part of this service
is something brand new, MLOps.
Now this is a new concept
for a lot of people.
Can you explain what
MLOps is and how it works?
- So what really differentiates
Azure Machine Learning
is that we have an
integrated DevOps approach
for what we think of as MLOps.
It makes it easier for data
scientists and engineers
to work together,
because engineers already
have a good understanding of
how this continuous integration
and continuous deployment process works
and the data scientists know
how to train great models.
And so by enabling them to work together
we can ensure that we have
high-quality models at
scale in production.
With MLOps incorporated as part
of the Azure Machine Learning service,
we add experimentation.
This means for data scientists
that reproducibility
is incorporated throughout
the Machine Learning lifecycle
in your training, test and
production environments
and they can create discrete
pipelines for each model
making them easier to track and reproduce.
For example, let's say you have
a wind farm that you manage
and you want to ensure that you can both
optimize the energy output
and have a predictive
maintenance scenario
so that you can ensure
that you can avoid any downtime.
And so each one of these pipelines
can help you consistently
build, train, package
and deploy machine learning
models to different
windmills on the edge and
iterate as new data comes in,
new telemetry signals, so
you can train and deploy
new machine learning model
versions to make sure
that you have the best
quality models in production.
- Right, so this ML Ops-centric approach
then makes a lot of
sense but what do you do
in terms of getting the
service up and running
for the first time?
- So this is an Azure service
so you would simply
go to your list of services
in the Azure portal,
log in just like in any other service.
And then you can search
for Azure Machine Learning,
you can select the workspaces here
and if you're just getting started
you can easily click add and create a new
Machine Learning Workspace.
You can see I have a list
of available workspaces here already.
So I can simply select
one and then I can see
that I have a list of
machine learning experiments
that my team has been
working on, I have a list
of compute targets, whether
it's Azure Kubernetes Service
to deploy my models or Azure
Machine Learning Compute
to be able to train my models.
And I've got a list of
the models themselves
as well so I can keep track of
what's deployed where and
how well it's performing.
Now, to get started on an
experiment or to continue
working on the one I already
have, I even have lists
of Notebook Virtual Machines
integrated directly here
so those are like Cloud
workstations if you will.
So I can log directly
into the Jupyter notebook
from the portal here or
access the URL directly
and open a notebook that
I've already been working on.
You can see I've got a
bunch of open source Python
libraries like pandas and numpy etc.,
and the Azure Machine Learning Python SDK
which I can install and
use on my local laptop
or use this Cloud workstation as well.
- Right, and the nice thing
is the notebook environment's
really perfect for
collaboration and it's good
to see that Azure Machine
Learning effectively
plugs right into that and it's a tool
that I think most data
scientists are used to.
But can you walk me through
then a real scenario
and get some experiments and
some models up and running?
- Sure, so we've got
Contoso Auto as a fictional
car manufacturer, right?
And they are collecting IoT
data from all their vehicles
that they have deployed
all around the world.
Customers have bought them and they're driving them,
but Contoso is getting a lot of calls,
a higher than normal volume of complaints
that they're having issues.
Now we can drill into
this Power BI dashboard
that I have here and we can
see that for the different
vehicles that I have out in the field,
I've got an issue with the starter motor,
there's already a recall in progress
and I've got an issue with batteries
so I'm gonna drill in
here to see some issues
and I can see that I've
got a higher than expected
rate of failure for these batteries.
So I'll show you how to
use Azure Machine Learning
to predict what kind of
failures are likely to occur
so we can notify customers
proactively to bring
their vehicles in for
maintenance before issues happen.
- Okay, so now then you can
use these data points
effectively and like
machine learning in general
it's really doing some
predictions here in terms
of customers in this case
that might have a likely
failure so that we can call
them before they end up
on the side of the road
needing to call a tow truck
which'll be even more expensive for us.
- That's absolutely right.
So to get started with
Azure Machine Learning
for this scenario,
first we need to connect
to the Azure Machine Learning workspace
and again that's the thing
that's really bringing
together my compute and data
and everything all together.
I'll create an experiment
so I can iterate over
and train multiple models and keep track
of their quality score,
and I'll get access
to some of the data that I have available,
that telemetry data that I mentioned.
And I can register those data sets
in Azure Machine Learning to make them
really easy for my data
scientists to find.
Then using standard approaches
in the Jupyter Notebook
I can interactively explore
the data and get some
intuition about what data might be useful
to train a high quality model.
- Cool, so this is now where
you might need to experiment
with different models but
in classic machine learning
with the experimentation
that's gonna take a lot of time, right?
- It can, yeah, and typically
this could take days
or maybe even weeks of experimentation.
You'd have to take the data set
(the age of the battery,
the temperatures, run time, etc.)
and experiment with different algorithms
and hyperparameters to train
the model and then repeat
the process a bunch of times
because the scientific method
is just kind of guess and check.
- Right, right, right.
- But now there's a better way
with Automated Machine Learning.
- So can you explain
how automated ML works?
- [Chris] Automated ML generates
different experiment runs
using a combination of
different algorithms
and hyperparameters and
trains models in parallel,
returning a quality score for
each model after each run.
Then based on what it learns,
it'll keep generating
different experiment runs
with different combinations of algorithms
and hyperparameters to try to
train an even better model.
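The search loop Chris describes can be sketched as a tiny, self-contained toy. This is a conceptual sketch, not the Azure ML AutoML implementation: a couple of stand-in "algorithms" and hyperparameter values are scored against the same synthetic data, and the best-scoring run wins the leaderboard.

```python
import random

# Toy stand-in for Automated ML: try several (algorithm, hyperparameter)
# combinations against the same data and keep the best-scoring model.
random.seed(0)

# Synthetic "battery" data: y = 3*x plus a little noise.
data = [(x, 3.0 * x + random.uniform(-1, 1)) for x in range(20)]

def mse(predict, data):
    """Mean squared error of a prediction function over (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

def mean_model(data):
    """'Algorithm' 1: always predict the mean of y (a baseline)."""
    m = sum(y for _, y in data) / len(data)
    return lambda x: m

def linear_model(data, ridge):
    """'Algorithm' 2: least-squares slope through the origin,
    with a ridge penalty as its one hyperparameter."""
    num = sum(x * y for x, y in data)
    den = sum(x * x for x, _ in data) + ridge
    w = num / den
    return lambda x: w * x

# The "search space": combinations of algorithms and hyperparameters.
candidates = [("mean", mean_model(data))]
for ridge in (0.0, 1.0, 10.0, 100.0):
    candidates.append((f"linear(ridge={ridge})", linear_model(data, ridge)))

# Score every run and keep the best, like the AutoML leaderboard.
runs = [(name, mse(model, data)) for name, model in candidates]
best_name, best_score = min(runs, key=lambda r: r[1])
print(best_name, round(best_score, 3))
```

The real service also learns from earlier runs to pick the next combinations to try, which this exhaustive toy loop does not attempt.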
- So it's actually then
using machine learning
to select different experiments
and really automatically,
without manual trial and error,
find the right model for you.
- Exactly, so now as a data
scientist you can focus more
on fine tuning the models,
versus a lot of the manual effort
around trial and error.
- But I can see where
this might get expensive
from a compute perspective,
how does that work?
- Well it depends.
Some models use a lot of
compute cycles to train.
However, this approach is more efficient
because instead of your
data scientist manually
trying different
combinations of algorithms
and hyperparameters over and over,
with automated machine
learning we can do this process
in parallel in a fraction
of the time to save
you even more money.
So for most data scientists,
we're coming from a world
where everything's running all the time
and so I might run a job
before I go home at night
for example and then the job
might complete five
minutes after I leave--
- Right and you're leaving
that computer on all night.
- Exactly, and so AML
Compute will automatically
shut down the machines that
aren't being used here.
- So how's this all working
then under the covers?
- [Chris] When I run an
experiment, Azure Machine Learning
will take these different experiment jobs
and insert them into a queue, right?
And then as the queue forms,
this kicks off automatic
VM provisioning in the cluster.
So the cluster starts off with
zero VMs in it, then it grows
kind of on demand as
the workload increases.
- [Jeremy] Makes sense.
- [Chris] And so then we run as many jobs
in parallel as we allow from
that compute quota limit
perspective and then as
those VMs get saturated
the remaining jobs stay in the queue,
it'll drain the queue
as jobs complete,
it'll pick new jobs,
it'll train the models
all in parallel, and then
as we drain the queue
completely then we'll
automatically deprovision
the virtual machines so
you don't pay anymore.
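The queue-and-autoscale behavior Chris walks through can be sketched as a small simulation. This is a conceptual toy, not how AML Compute is actually implemented: jobs wait in a queue, VMs are provisioned on demand up to a quota, and the cluster drains back to zero nodes when the work is done.

```python
from collections import deque

def run_elastic_cluster(num_jobs, max_nodes, job_duration):
    """Simulate the elastic cluster: start at zero VMs, grow on demand
    up to the quota, drain the queue in parallel, deprovision to zero."""
    queue = deque(range(num_jobs))
    nodes = 0        # VMs currently provisioned
    running = []     # (job_id, ticks_remaining) on busy nodes
    history = []     # node count per time step, for inspection
    while queue or running:
        # Scale out: provision VMs on demand, up to the quota limit.
        while queue and nodes < max_nodes:
            nodes += 1
            running.append((queue.popleft(), job_duration))
        # One time step of work; finished jobs free their nodes.
        running = [(j, t - 1) for j, t in running if t - 1 > 0]
        # Backfill freed nodes from the queue before scaling in.
        while queue and len(running) < nodes:
            running.append((queue.popleft(), job_duration))
        # Scale in: deprovision idle VMs so you stop paying for them.
        nodes = len(running)
        history.append(nodes)
    return history

# Ten queued jobs, a quota of four nodes, two ticks of work per job.
history = run_elastic_cluster(num_jobs=10, max_nodes=4, job_duration=2)
print(history)
```

The printed history shows the shape Jeremy summarizes next: the cluster grows to its quota, contracts as the queue drains, and ends back at zero.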
- So it's truly elastic.
That cluster had been paused effectively,
it provisioned everything
out and then it contracted
and paused again when you were done.
- That's right, and you still
have the full flexibility
of configuring or choosing
which VM sizes you use,
so you might use some
with GPUs for example
to facilitate deep learning
jobs or pay for CPU VMs
which are less expensive
if they're more appropriate
for your workload.
- All right, so the
concepts make a lot of sense
but can you show us this in action?
- Yeah, so to run those
machine learning jobs
in the AML compute cluster,
first I'll grab a handle
to this compute target which
has already been created.
But I can create it from the
Python SDK here in Jupyter
if appropriate, and then
I'll create an automated ML
config file that specifies
that this is a regression
task, because I'm using
that data to predict
how far in the future the
battery is likely to fail.
And I can set some constraints
to control my costs,
like how many models or
iterations do I wanna run
and how many of those
should run in parallel,
how long each of them should run for
so I can control my cost.
- [Jeremy] It's like your
config parameters file, yeah.
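Those cost-control knobs can be sketched as a plain settings dictionary. The keys below mirror Azure ML SDK v1 AutoMLConfig parameters as best as can be recalled (verify against the SDK reference for your version), and the label column name is hypothetical.

```python
# Sketch of the cost-control knobs described above. The keys mirror
# Azure ML SDK v1 AutoMLConfig parameters (check the SDK reference for
# your version); "Survival_In_Days" is a hypothetical label column.
automl_settings = {
    "task": "regression",              # predicting days until failure
    "label_column_name": "Survival_In_Days",
    "iterations": 20,                  # how many models to try in total
    "max_concurrent_iterations": 4,    # how many run in parallel
    "iteration_timeout_minutes": 10,   # cap each run's duration
    "experiment_timeout_hours": 1,     # cap the whole experiment
}

# With the real SDK this would be passed along with data and compute:
# config = AutoMLConfig(training_data=dataset, compute_target=cluster,
#                       **automl_settings)
print(sorted(automl_settings))
```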
- [Chris] Yeah, exactly, and
then I'll run this by calling
experiment.submit and
then I'll grab a handle
to the run object and then
I'll visualize it directly here
in the Jupyter Notebook and I can see that
Azure Machine Learning ran a
bunch of these in parallel.
I can see the quality scores
and details about each one of them
with a nice visualization.
I can see that there's a clear
winner here based on the
metrics that I've chosen.
- Right, right and the darker
green meant that was the best
model run so now how do I go
about taking it to the next
step where I wanna package
them up and get them deployed?
- So this is one of the best parts
of the Azure Machine Learning service.
It's super easy to
operationalize your models
and deploy them in production.
Here I can take this machine learning run,
I can grab the best one of
those iterations automatically
and then I can publish that
to the Azure Machine Learning
model registry just by
calling register_model
and I can give it some
tags and descriptions
and then when I do that
I'll actually see that
in the workspace here, I
can see the list of models.
And so this is the model
registry and so this is how
I know what models my
team has and what models
I might want to deploy, and even see
some characteristics
about each one of those and
see where they're deployed.
So in this case this one is deployed
to an Azure Kubernetes service.
I can even drill into the
details of that Kubernetes
service deployed model
where I have a REST API.
I can get the scoring URI,
I can recycle the keys
for security purposes if I want
to, and that sort of thing.
But let me show you how easy it is
for a data scientist to
be able to deploy this.
You simply create a scoring
file which runs the model
and then I can package that
up with my dependencies
that I need and this will
generate a docker container
for me that I can then deploy
to my Kubernetes cluster
by using the inference config.
And just specify my scoring script
and my dependencies yml file
and then call model.deploy.
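The scoring file Chris mentions follows a simple two-function contract: init() runs once when the container starts, and run() handles each request. Below is a self-contained sketch of that pattern; the "model" is a stand-in slope coefficient rather than one loaded from the model registry, and the JSON request shape is an assumption.

```python
import json

# Sketch of the Azure ML scoring-script contract: init() is called once
# at container start, run() once per scoring request. The "model" here
# is a hypothetical stand-in instead of a registered model.
model = None

def init():
    # A real scoring script would load the registered model here, e.g.
    # via Model.get_model_path(...) and joblib.load(...).
    global model
    model = {"slope": 3.0}  # stand-in coefficient

def run(raw_data):
    """Score a JSON request shaped like {"data": [[age], [age], ...]}."""
    inputs = json.loads(raw_data)["data"]
    predictions = [model["slope"] * row[0] for row in inputs]
    return json.dumps({"predictions": predictions})

init()
print(run('{"data": [[1], [2]]}'))
```

In the real service, this file plus a dependencies yml goes into the inference config, and model.deploy packages both into the Docker container behind the REST endpoint.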
Once it's deployed,
then I can test this out
and I can get the
response from the REST API
end-point that we automatically generate.
And then I can integrate that
with my Power BI dashboard.
So instead of just a list of
potentially affected vehicles,
then I actually can get
the predictions by using
that REST API endpoint.
Not only can I get the
predictions but I can even
get the list of the predicted features
that explain why that model
chose those particular
vehicles so when I call
the customer to bring
their vehicles in, then
I can explain to them
what characteristics
picked them if you will.
- So this looks really
powerful but what if I'm not
familiar with things like Jupyter
or can't write Python?
Are there things that
I can do that are a bit
easier to get up and running?
- Yeah, we hear that a lot and that's why
we've created the new user interface
for the Automated Machine Learning
experiments to get started.
So here you can see I've got
a history of the recently
run experiments but I
can also simply click
create experiment and this will
create a new experiment run.
Either in the same experiment
or a new experiment.
I'll select the same one
I was using before here
and the same compute cluster
that I was using as well.
I'll click next and then I'll
be able to choose the data
that I was using and then
I'll be able to get a quick
preview of it, I can
select the type of task,
again this is a regression
task, and I can select
the column that I'm trying to predict
because I want to find out how long
it's likely to survive in days, right?
- [Jeremy] Right.
- [Chris] And so I can
click start and through
that kind of wizard-like
approach then we can train
new machine learning models.
However, while this is
running I can even go
into some of the details
of recently run experiments
where we can see some
of the characteristics
and we can even easily
click deploy the best model
just like I did in the
Jupyter Notebook earlier,
which will register the model
in the machine learning model registry.
- Okay, so you can track
then everything in one place
but what helps then with
things like reproducibility.
You mentioned earlier
that we're using things
like ML Pipelines, how do those
actually come into play here?
- Yeah, let me show you
how we can use the same
sort of approach that I
just mentioned and automate
the process of training
unique machine learning
model for each different type of
car year, make, and model that I have.
So we can see I just created
a list of the different
makes and models, then I'll
create my automl config file
specifying the type of task again.
And then I can create
these different steps
in my pipeline to run
the automl experiment,
collect the best model and the key metrics
out of that, and then
assemble them together
in a pipeline where I
append a bunch of steps.
You can run these in
serial or in parallel.
And this is super
important because sometimes
in a machine learning
experiment there can be
tens, hundreds, or even thousands of
unique steps that I need to run.
And so by creating a
pipeline that I publish
and then I can just run it on a schedule
or by calling a REST API then it abstracts
all of the complexity inside the pipeline
so that I can just run that for each loop
for all the different makes
and models if you will.
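The pipeline idea above can be sketched conceptually. This is not the azureml.pipeline API, just a toy illustrating the point: steps are appended once, and the published pipeline is then re-run per car make, hiding the step-level complexity behind a single call.

```python
# Conceptual sketch of an ML pipeline (not the azureml.pipeline API):
# steps are appended once, then the pipeline is re-run per make/model.

class Pipeline:
    def __init__(self):
        self.steps = []

    def append(self, name, fn):
        self.steps.append((name, fn))

    def run(self, context):
        # Run steps in order; each step reads and extends the context.
        for name, fn in self.steps:
            context = fn(context)
        return context

def train_automl(ctx):
    # Stand-in for an AutoML training step.
    ctx["best_model"] = f"model-for-{ctx['make']}"
    return ctx

def collect_metrics(ctx):
    # Stand-in for collecting the best run's key metrics.
    ctx["metric"] = 0.9  # placeholder quality score
    return ctx

pipeline = Pipeline()
pipeline.append("train", train_automl)
pipeline.append("metrics", collect_metrics)

# The "for each" loop over the different makes and models:
results = [pipeline.run({"make": m}) for m in ["sedan", "truck"]]
print([r["best_model"] for r in results])
```

In the real service the published pipeline is invoked on a schedule or via its REST endpoint, and steps can run in serial or in parallel as Chris notes.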
- So this all totally
makes sense in the context
of the Notebook.
- Yep, and from the
Azure portal I can also
see the inventory, the list
of pipelines that I have.
I can schedule them to
run, I can see the details
about them as well like who created them,
their REST API end-point
so that I could call them
and see all the steps
that are in each pipeline.
- So a nice visible and visual
view here that you've got.
Now you mentioned earlier
that you can also integrate
with things like Azure
DevOps, how does that work?
- It works in two different ways.
Build pipelines and release pipelines.
The build pipelines integrate
with Azure Machine Learning Pipelines
to facilitate training
and building the model.
You can think of that model
as like a build artifact.
So here I have a build
pipeline that then runs
the ML pipeline that I mentioned earlier.
And then once that machine learning model
is built and registered in
the Azure Machine Learning
model registry, then that will kick off
the release pipeline.
So we can see this can either run when
you check in maybe a new scoring script
or a new version of the
machine learning model.
And then we can have multiple tasks
in this stage, so it can run, say,
an Azure Container Instance
and then it can run
a bunch of different integration
tests with my application
as well as quality tests to make sure
that the model is good.
And then once that's done
then it can roll this out
to my Azure Kubernetes
service, which will again
deploy that but not until, for example,
we've incorporated a human in the loop for
deployment approvals to make
sure, for audit compliance and
regulatory types of reasons,
that we know exactly
what we're deploying and
have proper approvals.
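A build pipeline of the kind Chris describes might look roughly like the YAML below. This is an illustrative shape only: the trigger paths, script names, and pool image are assumptions, not a tested template.

```yaml
# Illustrative Azure Pipelines build definition: retrain when the
# scoring script or training code changes. Paths and script names
# are assumptions for this sketch.
trigger:
  branches:
    include: [ main ]
  paths:
    include:
      - score.py        # a new scoring script checked in
      - training/**     # or changes to the training code

pool:
  vmImage: ubuntu-latest

steps:
  - script: pip install azureml-sdk
    displayName: Install the Azure ML SDK
  - script: python run_ml_pipeline.py
    displayName: Run the ML pipeline (train and register the model)
```

Registering the model then triggers the release pipeline, which runs the integration and quality tests and waits for deployment approval.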
- Right, so this is a
release pipeline you can
kind of do anything with it
and basically you can even
have things get deployed out to the edge
if you wanted to with the
same release pipeline.
Now one thing that's great
for our data scientists
that are watching, and we
mentioned earlier the integration
of things like, for example,
deep learning models
with TensorFlow, and you can also use
machine learning then to go
beyond the predictive models
that we just showed here.
What would we do then in terms of
taking it to the next
level and maybe using
some different data input
types to do deep learning?
- So for example if you had
a camera installed in the car
you might use that video
telemetry stream to then
train a model to identify whether drivers
are distracted for example.
And so we can take those images,
train a model using PyTorch or TensorFlow
in the Cloud, make sure that it's good,
then use IoT Hub to roll it out
to the individual cars
and then you can maybe
raise an alert in the
vehicle if the driver
is doing something that's unsafe.
- Maybe that they shouldn't be doing,
so really good stuff.
So this is an awesome intro
in terms of all the different
things that you can do with
DevOps as well as MLOps
and all the Azure
Machine Learning services
that we've just rolled out.
Where would people go to
learn more on how to start
using all of this stuff
and really consuming it?
- Try it yourself, that's
the best way to get started.
You can learn more at the link shown.
- Really great tips and this is a space
that continually evolves and changes
and of course you're
gonna wanna keep watching
Microsoft Mechanics for the latest
tech updates across Microsoft.
Hit subscribe if you haven't already
and we'll see you next time.
(upbeat music)
