Sometimes it's easy to
forget that building a model
is not actually the end of
the story.
It's really where the story starts,
because now you need to put this into
production and you need to integrate
the predictions, for example,
with other pieces of your infrastructure.
And then you need to make sure that the
model stays current and isn't
getting worse over time.
And this is really what the whole model
ops solution of RapidMiner is for.
So let's get started with the basics,
first with the deployments.
I will show you how to deploy a model.
What different model types are there?
We call them Champions, or Active models, and Challengers,
how you can see how this
model was created. Again,
this is really important for
compliance reasons and well,
and how you really work with
model ops overall.
Ok, so like you've seen before
with Turbo Prep and Auto Model.
here's another view here at the top,
which is called Deployments. And I
already have one deployment in here.
This is a deployment location, and you can
have multiple deployment locations.
Let's ignore this for now. I'll come
back a little bit later to this one here.
Let's add a new one and let's
call this one Arrival Delays. Okay.
So we will generate this new deployment.
And if you do this we can see that we
still only have one active deployment
here.
This is not active yet, nor can it be
activated, because there are actually no
models in this deployment. But I can
click on it and you can see, yep,
having no model is not good.
So let's go back to Auto Model then and
let's take those two models we have
created before, you remember, the linear
model and the deep learning model.
Those have been pretty
good. And let's deploy it.
And all you need to do is
click this button here.
You can change the name if you like.
You can pick the location and then, yup,
here's our new deployment. Let's
add our model to this. This will
take a couple of seconds.
And what's happening in the background
is that we're not just deploying the
model, but we're also generating all the
necessary workflows and processes for
integrating the model into other IT
systems, for checking if the model is still
working, and for collecting all the
information and data about the models.
And this is all happening
while we deploy this model.
That's why it takes a few
moments. But after it's done,
we will see that we have our first
model in this new
deployment. And since it's the first one,
it automatically became
the active model. So,
let's add another one and
see what happens. Again,
it doesn't matter where you click,
so let's deploy the deep learning model
here. And if you do this,
if you add it to the same
deployment, you will see that
the second model
becomes the Challenger model.
So what is the idea behind Active
models and Challenger models? Well,
the active model is the one
which produces the predictions.
But whenever you use the model,
or the deployment, for scoring,
no matter if you upload some data and do
the scoring that way or if you actually
use the automatically
created web services,
the Active model is the one
that produces those predictions.
But all the Challengers are
producing predictions as well.
They are just not delivered,
but they can still be used to calculate
how well those Challengers perform.
And you'll see later in this
demonstration that this can be very useful,
because if this Challenger
becomes better over time,
all you need to do is click here and
change this one to the Active model to
replace it.
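Just to make the Champion/Challenger idea concrete, here is a minimal sketch in Python. This is purely illustrative and not RapidMiner's implementation; the class, the `predict` interface, and the log structure are all assumptions. The point is that only the Active model's answer is delivered, while the Challengers score the same rows silently so they can be compared later.

```python
# Minimal sketch of Champion/Challenger scoring (hypothetical, not product code).
# Assumes each model object offers .name and .predict(row).

class Deployment:
    def __init__(self, active_model, challengers):
        self.active_model = active_model        # the Champion / Active model
        self.challengers = list(challengers)    # scored silently for comparison
        self.log = []                           # predictions kept for later evaluation

    def score(self, row):
        delivered = self.active_model.predict(row)
        shadow = {m.name: m.predict(row) for m in self.challengers}
        self.log.append({"row": row, "active": delivered, "challengers": shadow})
        return delivered                        # only the Champion's prediction is returned

    def promote(self, challenger_name):
        """Swap a Challenger in as the new Active model (the 'click' in the UI)."""
        new_active = next(m for m in self.challengers if m.name == challenger_name)
        self.challengers.remove(new_active)
        self.challengers.append(self.active_model)
        self.active_model = new_active
```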
Finally you can also see the details of
each model. So I can click on this model,
I can see the model
which has been generated.
I see a snapshot of the full input data
and that actually is really important
because this input data here is not
just a reference, it's a full copy.
And why is that important?
Because otherwise the data could
have been changed in the meantime.
But you may need to prove how
this model has been built.
Think about GDPR in Europe for
example, and the right to explanation,
you better know all the details without
any possibility that anybody could
have changed anything in the meantime.
All the other results have been produced
as well and you can explore them here.
And that even includes the
generated process here.
You can load this back
into the design view. Well,
that's not very exciting because it's
pretty much the same process as the one we
have seen before. But again,
you can really prove how this model
has been created. When we're talking about
scoring,
we also need to talk about how to
explain the predictions a machine
learning model is making. I mean,
for some models it's really easy,
like for a decision tree, you can
pretty much follow along.
But for most models it's just not that easy,
especially for the more powerful ones.
And let's be honest,
I sometimes ask people
if they understand how
exactly a linear regression
model is working, and most
people wouldn't know that either.
So the whole topic of explainable AI
became really important. And yeah,
RapidMiner totally believes in a
complete no-black-boxes policy.
And that applies to both sides: to
how the models have been generated,
which we saw before, since we can
always open up the processes,
but it's also about the predictions
which are created by the models.
So let's have a look into how
those predictions are created.
And then, for the scoring,
how we can explain those predictions.
I have been showing those model-specific
and also model-agnostic weights
before. And then there's another
thing we call the model simulator,
which I personally like a lot actually,
and many of our users do too. Okay.
So here's our two
deployments again. For now,
let's actually switch over to this one
here because we just have a little bit
more data already in this one
here. It's a bit more interesting.
So let's go into the flight
delays deployment here.
Ignore the dashboard for now.
We will come back to this later,
but I will go into the scoring
here. So there are two ways.
One is that we can create scores
by integrating this deployment,
this model, into other systems by way of web services.
We will discuss this later, or you
can just upload some data like this.
So for example, today is October 3rd,
the day I'm producing this video. So
I thought, let's take the 2008 data.
So remember we trained the model on 2007.
Now we take the 2008 data, also
from October 3rd, and we load it in.
So here's the data. I mean, we're
not using the arrival time or
arrival delay as inputs obviously,
but we detected that we have this target
column already, and if you have
this, it's kind of useful of course,
because you can calculate error rates right away.
Okay, so let's feed this data
in here. Let's do the scoring.
And after a couple of moments we will
get all the predictions together with the
confidence values and also
the explanations like the
ones we have seen before.
We are using a faster variant of the
so-called LIME algorithm for local
explanations. It's faster,
so it even works in real
time. You'll see this
in the simulator a little bit later.
And it delivers you those supporting
and contradicting colors like you've seen
before. So for every prediction,
so most flights are on time,
but here's one that is delayed, you see exactly
what was driving this prediction.
And as I mentioned before, columns which
in general support most of the predictions
or contradict most of those predictions
are shown a little bit bolder.
Those are also the columns which often
have a higher model-specific weight.
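RapidMiner uses its own faster variant of LIME, so the following is just to illustrate the general idea of local explanations. It is a small sketch using the open-source `lime` package with scikit-learn; the training data and feature names are made up for the example.

```python
# Illustrative only: local explanations with the open-source `lime` package,
# not RapidMiner's faster LIME variant. Data and feature names are made up.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

feature_names = ["departure_delay", "taxi_out", "taxi_in", "distance"]
X_train = np.random.rand(500, 4) * 60                        # stand-in training data
y_train = (X_train[:, 0] + X_train[:, 1] > 70).astype(int)   # 1 = delayed

model = RandomForestClassifier(n_estimators=50).fit(X_train, y_train)

explainer = LimeTabularExplainer(
    X_train, feature_names=feature_names,
    class_names=["on time", "delayed"], mode="classification")

# Explain one scored row: positive weights support the explained class,
# negative weights contradict it (the green/red colors in the UI).
row = X_train[0]
explanation = explainer.explain_instance(row, model.predict_proba, num_features=4)
for feature, weight in explanation.as_list():
    print(f"{feature:30s} {weight:+.3f}")
```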
In this case we knew the actual value
already. If that is not the case,
you can also define the actual outcomes
for given IDs, which is really
helpful to calculate those
error rates. We'll see this a little
bit later here as well.
So let's move to the
simulator. Before we do this:
we see in general that those delay
columns are a bit more important.
We don't need those scores any
longer, so let's just throw them away.
So the simulator,
what this is doing is we see the input
factors on the left side and then we see
how the model behaves on the right side.
And just by playing around, it really can
be helpful to understand how the model
works. So we saw that the delays are
important, and we see this here as well.
If I just increase the
delay here for example,
in real time you see how the
model behavior is changing
and also what is driving
those predictions. Again, those local
explanations are seen here as well.
You can even turn this into
prescriptive analytics. For example,
if the flight already started late,
but I want to do my best to actually
bring it back to an on-time flight.
What can I do?
I can run this optimization here now,
which is then telling me how I would need
to change certain input factors to bring
this back to an on-time flight as
well as possible. So after
this optimization is done,
I can press the finish button here
and then the optimal values are
taken over here,
and as we can see, yeah,
you still have a chance to bring this
back to an on-time flight. How do we do this?
Pretty much the only thing, if I
compare this to the average,
is bringing the taxi-out and taxi-in
times here down to pretty much zero.
So we'd better
make sure that we get a gate close to
the place where we are landing, because
otherwise it's getting even worse for us.
That's the only thing we really
can do at that point in time. I mean, bad news.
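RapidMiner runs its own optimization behind that button; as a rough sketch of the idea only, here is a hypothetical random-search version in Python. The `model.predict_proba` dict interface, the column names, and the ranges are all assumptions: vary the factors you can still control and keep the values that maximize the confidence of "on time".

```python
# Rough sketch of the prescriptive idea (hypothetical, not the product's optimizer):
# search over the controllable inputs and keep whatever maximizes P(on time).
import random

def optimize_for_on_time(model, flight, controllable_ranges, n_trials=500):
    """flight: dict of input factors; controllable_ranges: {column: (low, high)}."""
    best = dict(flight)
    best_p = model.predict_proba(best)["on time"]   # assumed model interface
    for _ in range(n_trials):
        candidate = dict(flight)
        for column, (low, high) in controllable_ranges.items():
            candidate[column] = random.uniform(low, high)
        p = model.predict_proba(candidate)["on time"]
        if p > best_p:
            best, best_p = candidate, p
    return best, best_p

# Example: once the flight has already departed late, only the taxi times
# are still under our control.
# recommended, p = optimize_for_on_time(model, flight,
#                                       {"taxi_out": (0, 30), "taxi_in": (0, 30)})
```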
Now let's turn our attention over to the
model management aspect of it.
So sometimes you actually have been
building models on a certain data set and
then after some time the
performance gets worse and worse.
This is often a result of changes in
the world and you can detect this by
looking into the drift of the inputs.
Meaning, how different are the
inputs now compared to what they used to be?
And then there's another
element to this:
sometimes you're using information
you should not have been using,
or not using information you should have been using.
And that leads to some bias in
the selection of your input data.
And these are really two sides
of the same coin, actually.
If you want to do
ethical data science,
you often need to avoid this form of bias,
or at least you should be aware of it.
So this section will show you how to
identify those input drifts, how to do this
bias detection, and also how to explore the
differences in the different tools.
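There are several ways to quantify "how different are the inputs now"; the exact metric the product uses isn't shown here, so as one common, assumed choice, here is a population-stability-style comparison of the training and scoring distributions per column.

```python
# One possible drift score per numeric column (an assumption, not necessarily the
# metric used in the product): Population Stability Index between the training
# distribution and the recent scoring distribution.
import numpy as np

def psi(train_values, score_values, bins=10):
    edges = np.histogram_bin_edges(train_values, bins=bins)
    expected, _ = np.histogram(train_values, bins=edges)
    actual, _ = np.histogram(score_values, bins=edges)
    expected = np.clip(expected / expected.sum(), 1e-6, None)
    actual = np.clip(actual / actual.sum(), 1e-6, None)
    return float(np.sum((actual - expected) * np.log(actual / expected)))

# drift = {col: psi(train[col], recent_scores[col]) for col in numeric_columns}
# Columns with a high score are the ones that have shifted the most since training.
```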
All right, so let's start
with our deployment again.
The first indicator is often
already in this Models tab.
So if you just compare the recent
error with what you expected when you
trained the model, you can see
that for the fast large margin,
which is a linear model, the error has risen a little
bit, but it's not that much worse,
while the deep learning
model, while it started very well,
is no longer that good. So there's
probably some drift happening.
And that's very typical for deep learning
models as well, because they're
just not very robust against
changes in the world.
So this is one way to see this, but
then there's an extra tab up here.
We call this Drifts. It shows you,
for each input,
where there is
the biggest difference. So for example,
remember this total flights column
we have created.
Between the training data, the
dark greenish here, and the scoring data,
we can see that overall there seem
to be fewer flights, now in 2008,
going into Boston. So there
was a bit of a change.
That's actually the biggest change.
This column is not super important,
but still.
And you can see the same for nominal
columns like this one here. Again,
like we talked about bias before, well,
the differences are not that big.
But here, for example, are a
couple of new flights coming
in from a state which
didn't exist in the data before.
And that can again also be an indicator
for bias, because maybe you didn't train
the model on certain cases, because you left out
maybe a complete geography or something
like that. Input drift can be a problem.
It doesn't have to be. So for example,
this is the one we have seen before,
the total flights here,
going back to this one.
But it luckily is not a very important
column, so it's here at the
bottom left. But this
really important one, the NAS delay,
pretty much behaves the
same. So it's not a problem.
You really want to avoid any columns
in the upper right corner here.
So it's good that we can
highlight those drift problems,
or potential bias problems here.
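The "upper right corner" in that chart is simply high drift combined with high column importance. Here is a tiny sketch of that ranking logic; the thresholds and the structure of the inputs are made up for illustration.

```python
# Sketch of the quadrant logic: flag columns that have both drifted a lot
# and matter a lot to the model (thresholds are made up for illustration).
def columns_to_worry_about(drift_scores, importances,
                           drift_threshold=0.2, importance_threshold=0.1):
    """drift_scores and importances: {column_name: float}."""
    return sorted(
        (col for col in drift_scores
         if drift_scores[col] > drift_threshold
         and importances.get(col, 0.0) > importance_threshold),
        key=lambda col: drift_scores[col] * importances[col],
        reverse=True)
```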
Obviously, in combination with
being able to see the details
of how the models have been created, which you
remember from before, this really
can help you to figure out
if there was a bias problem
or if it's just a drift problem, and
maybe retrain the model or take other
measures if needed.
Let's now dive in deeper
into model management.
I mentioned already before
that with RapidMiner
you can actually close this feedback loop,
so we can upload those actual outcomes
or you can use web services for that. You
will see this in a couple of minutes.
Closing this feedback loop is really
essential, because if you know the actual
outcome at a later time, after you
created the prediction and everything,
you can then do backtesting and
you can do proper model comparisons.
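Closing the loop conceptually just means joining the stored predictions with the actual outcomes by ID, so error rates per model can be recomputed over any time window. A minimal sketch, where the dict-of-IDs shape is an assumption:

```python
# Minimal sketch of closing the feedback loop (data shapes are assumptions):
# join stored predictions with reported actuals by ID and recompute error rates.
def error_rate(predictions, actuals):
    """predictions: {record_id: predicted_label}, actuals: {record_id: true_label}."""
    scored = [rid for rid in predictions if rid in actuals]
    if not scored:
        return None                      # no actuals reported yet
    wrong = sum(predictions[rid] != actuals[rid] for rid in scored)
    return wrong / len(scored)

# Comparing the Active model against a Challenger on the same actuals
# is then just two calls to error_rate().
```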
So I will show this to you, and
then the deployment dashboards,
the leaderboards among the models
(we have seen some of this already before),
and we'll focus a little bit more on
this performance dashboard here,
which can also help you to understand
the business impact of the models.
Okay, so let's start with the dashboards.
Here we see a couple of main KPIs, like
how many scores did we generate so far,
what is the total gain, which is
the orange line down here,
so how much did we
generate here so far,
what's the error rate, are there
any alerts going on right now, and so on.
Each bar shows the number of
scores we have created,
and the purple line
shows you the average error rate.
It looks pretty stable over time.
We already saw the Models tab before,
so let's focus on the performance then.
The performance tab holds even
more details here. So again,
you see the scores for each model,
but you can also see the error
rates for the different models.
So here we can see, okay, we
generated a couple of scores every day,
and overall, while our deep learning
model is doing a half-decent job,
our fast large margin model
has actually become better now.
So it's probably about time now to
turn this over and make this the active
model, but it comes at a price:
it is in fact a little bit slower for scoring
than the deep learning model.
You can also see the different
distributions here for the predicted classes,
which again could be another
indicator for a potential drift here,
if the class distribution has changed. In this case,
it's more likely to be an outcome
of the different cost matrices,
the costs we have defined before.
And talking about costs:
the purple area here is the
best option without any model,
so always going with on time. You
saw this already in Auto Model.
So here we can see what the
impact really is over time.
If we wouldn't use any model,
we would make
roughly 20 million here
over the course of the last
30 days (we can always
change the timeframe here),
and the cumulated area then is what
the model delivers on top of it.
So as we can see,
it's almost 30 million in this case,
and the difference between those two
areas is how we calculate the gain,
which is the orange curve here, and which
is then almost 9 million for this
timeframe.
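The gain curve is essentially the cumulative value of the model's decisions minus the cumulative value of the best "no model" policy (always predicting on time), both computed from the cost matrix defined earlier. A sketch of that arithmetic; the value matrix and variable names are placeholders, not the actual figures from the demo.

```python
# Sketch of the gain calculation (the value matrix is a placeholder; the real
# values come from the cost matrix defined in Auto Model).
def cumulative_value(decisions, actuals, value_matrix):
    """value_matrix[(decision, actual)] -> gain or cost of that combination."""
    total, curve = 0.0, []
    for decision, actual in zip(decisions, actuals):
        total += value_matrix[(decision, actual)]
        curve.append(total)
    return curve

# baseline   = cumulative_value(["on time"] * len(actuals), actuals, value_matrix)
# with_model = cumulative_value(model_decisions, actuals, value_matrix)
# gain = [m - b for m, b in zip(with_model, baseline)]   # the orange curve
```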
And that's important for making the case
for your model, and for making sure the model
is still delivering value. It's not
about accuracy, it's not about error rates.
It's really about value, and
pretty much nothing else.
All right, moving onto the last part,
which is about model operations.
Model operations is really
about automating all the
maintenance around the
deployed models: creating alerts,
being aware that something is
no longer working as it should,
and integrating the models into other
pieces of your IT infrastructure.
This is really what this next
section is going to be about.
So let's just jump right into this.
Yep, we have seen this before.
Let's move over to the alerts tab here
and you can see that I've already created
three different alerts here. The
first one is the drift alert.
If this drift is greater than let's say
8% for the last week and I check this
once per day, then that sends an
email to data scientist number seven.
And the same is true for the error alert.
Or if we have fewer than a hundred
scores on any given day, then again,
we send out an email. Whenever
an alert is triggered,
we can see what has been triggered
down here until we acknowledge it.
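Conceptually, each alert is just a scheduled check of a KPI against a threshold that sends an email when it fires. A hedged sketch of that pattern in Python; the SMTP server, the addresses, and the threshold are placeholders, not RapidMiner internals.

```python
# Conceptual sketch of an alert rule (placeholders only, not product internals).
import smtplib
from email.message import EmailMessage

def check_error_alert(recent_error_rate, threshold=0.04,
                      recipient="data.scientist.7@example.com"):
    """Runs on a schedule, e.g. once per day; returns True if the alert fired."""
    if recent_error_rate <= threshold:
        return False
    msg = EmailMessage()
    msg["Subject"] = f"Alert: error rate {recent_error_rate:.1%} above {threshold:.0%}"
    msg["From"] = "modelops@example.com"
    msg["To"] = recipient
    msg.set_content("The deployed model's error rate exceeded its threshold.")
    with smtplib.SMTP("localhost") as smtp:     # placeholder mail server
        smtp.send_message(msg)
    return True
```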
So at this moment this explains
why we have the seven alerts here.
If I, for example,
acknowledge all of them then this KPI
would actually go back to zero here and
everything is green. So let's go back to
our alerts and let's trigger a new one.
Check now: this error alert
will be checked at this moment.
Is the error rate higher
than, what was it, 4%? Yes,
it was almost 7%. Okay,
so this has triggered, and I can
just check my email inbox here,
or actually, it's in fact data scientist
number seven's email inbox, and I will get
notified. So this is about alerts;
integrations are equally simple.
You can see the URL
for the web service here.
I run this on a local server right now.
And you can test this for some of
those values; you just test the
web service here. What is nice:
I get all the results here in JSON,
but we actually support more
than 30 different formats here,
so we can change this if you like.
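Integration therefore boils down to an HTTP call against that URL. A hedged example with the Python `requests` library; the URL, port, path, and field names below are placeholders, so use the actual URL and input schema shown for your own deployment.

```python
# Hedged example of calling the scoring web service (URL and field names are
# placeholders; use the URL and schema shown for your own deployment).
import requests

payload = [{"Departure Delay": 25, "Taxi Out": 12, "Distance": 450}]
response = requests.post(
    "http://localhost:8080/deployments/arrival-delays/score",  # placeholder URL
    json=payload, timeout=30)
response.raise_for_status()

for row in response.json():                     # results come back as JSON
    print(row)                                  # e.g. prediction plus confidences
```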
And as you get all this information here,
we can even deactivate the deployment,
and that means the web service would
still respond,
but it gives you a proper error
message telling either you or the system
integrating this web service that
the deployment is currently deactivated.
And then there's the second web
service here for defining the actuals.
And with that I would like to
wrap up this platform demonstration.
We really managed to do a full project
on flight delays. It's a full AI project.
We learned a lot about our data
and we created a predictive model.
We put it into production, we
managed this model. You saw,
it's all very simple, but at
the same time it stays very transparent.
And I said it before,
RapidMiner is the only platform
supporting everything from data prep,
automated machine learning,
down to model ops,
in this augmented and automated way.
But there is this whole other piece to
the platform: you can actually see
what exactly has happened
with those annotated processes,
which comes in very useful and is
often very important also for compliance
reasons.
Thanks for staying with me the whole hour
here, and I hope you enjoyed it as much as
I did. Thanks.
