>> It's a real
challenge to create
global-scale, real-time
collaborative applications.
Each piece of architecture
you add increases latency.
Chris Anderson is here
to show us how we can
easily solve this challenge
using Azure Cosmos DB,
Azure Functions, and SignalR,
today on Azure Friday.
Hey, everyone. Donovan Brown here
with another episode
of Azure Friday.
I'm here with Chris Anderson.
Chris, how did you
solve this challenge?
>> Yeah.
So, we had
this interesting challenge
where we wanted to
build a real-time
collaborative dashboard,
or canvas, that lets you
draw with other
people in real-time,
and we wanted to see
whether or not we
could use Cosmos DB to
solve this problem.
>> All right. Cool. So,
what is this app here?
This is just a demo app, right?
>> Exactly. So, the idea here is,
I want to be able to
go ahead and draw on
this T-Shirt here that we've got,
Scott's lovely red T-shirt.
Then, I'll go
ahead and draw, and you will see
it show up in near real-time
on my other screen.
So you can see here,
as I'm drawing,
these updates are showing
up on my other screen.
I can even use some
extra colors in here.
What's cool is I can do this
on either one of these.
I can also go through
here and draw
on this one straight across,
and we'll see this
show up like that.
So, these are
two different screens.
It's going up there,
and it's got to go to
the cloud and come
back down again.
How are we able to get
this to happen so quickly?
>> Great. How many people could
be drawing at the same time?
>> We designed
this thing for Build.
I think we planned for Build times two —
so, like 20,000
people.
>> Wow.
>> I think we quoted
20,000 people as
our original spec.
>> Awesome.
>> We wanted to do it without
provisioning any VMs, basically.
So, we're trying to go for
the managed services approach.
>> Got you.
So, this is all serverless when
you're talking about Functions.
>> The whole serverless thing, yeah.
>> Perfect.
>> Yeah.
>> So, show me how this is
actually architected then?
>> Yeah. So, in terms
of architecture,
it's also simple
in a lot of ways.
We've got the client, of course,
which is just the browser
that you're seeing there.
The client is talking
to the API for
doing login and
for doing inserts.
>> Okay.
>> That insert is using
the user's identity
to also throttle
the people who are
using it via Twitter.
Until you log in through Twitter,
we don't know who you are.
>> Got you.
>> So, we
wanted to be able
to throttle you,
and we're able to do
that by using App Service's
authentication and
authorization feature
to actually go through
and figure out,
okay, they're from Twitter,
they're from Azure AD — who are they?
>> Got you.
>> Then, we're able to
track that using Cosmos DB.
>> Very cool.
>> It's taking all that
data down into Cosmos DB,
and then we're actually having
a SignalR app go through
there and pull all that
from the Cosmos DB change feed,
which allows you to see
all the different updates that
happen to the Cosmos DB,
and then pipe those
directly to the client.
That's how you get
the real-time effect.
>> I see. So, instead
of me having to go and
run a query against
Cosmos saying,
okay, show me everything so I
can figure out where I was,
you're just getting this feed of
constant updates saying
this is where we are,
this is what you
need to know about,
and offloading the querying part —
that's no longer my concern.
>> Exactly.
>> Got it.
>> If you had
to write that query,
it could take a long time,
since you'd go through there and
search every document
for the right one.
With this model, it's literally
just an ordered log of
things that changed.
>> All right. Perfect.
Yeah. Because, I mean,
the more data you get,
the longer that query
takes to execute.
Everyone starts to slow down,
but because we're getting
this change feed,
everyone stays, like
you said, near real-time.
>> Exactly.
>> Got it. Okay.
>> Then, the other challenge that
you might have if you're building
an application like this is, in
terms of collaborative drawing,
you need to basically
sync to a source of truth.
You don't want someone to have
to go through there and replay
the whole log every time they
connect to their client.
So, we have this other function
here which is basically
creating a snapshot
for us.
>> Okay.
>> So, every time that
you add a pixel,
it's not only being streamed
to everyone else immediately,
but it's also being saved into
a blob which represents
the current state of the board.
>> Okay.
>> That allows us to load
the client very quickly
without having to go
through there and
then replay the whole history
and see the thing animate
with every single drawing.
>> Got you.
>> But the cool thing of
this is since Cosmos DB
is storing every single insert,
I'm actually able to
replay the whole thing and
see how it was drawn
pixel by pixel.
>> Interesting. But the Snapshot
allows me to join
the conversation
late and then pick up right where
everyone else is because
I get that Snapshot,
and it's fast because
I'm not replaying it.
I just load it: this
is where they are.
Join the fun and
start drawing, and
then everyone sees
my real-time drawing.
>> Exactly.
>> Got it. Okay. So, whenever
people say Cosmos DB,
it's like this magical
black box, right?
But Cosmos has several
different functions,
or ways to interact with it.
What portions of
Cosmos DB did we
use that made
Cosmos the answer,
and not some random SQL Server
or something like that?
>> Yes. So, in this case,
the key piece I
wanted was to be
able to store documents,
because I could have
modeled it a couple of
different ways, storing batches
of pixels in a lot of cases.
When I was doing the sketch mode
that we allowed
for the people
who are presenting on
stage and whatnot,
we wanted to go
and insert large
batches of those things.
That was really better
represented as
one document that
represented that whole line,
rather than as a bunch of
individual records in a table.
>> Got you.
>> Otherwise, I would have
had to have a
stroke ID or something like that
to be able to tie
them together.
Documents really
just made sense for
the kind of data we
were trying to store.
>> I see.
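The per-stroke document model Chris describes can be sketched like this. The field names below are assumptions for illustration, not the app's actual schema:

```typescript
// Hypothetical shape of a per-stroke document: one document
// captures a whole batch of pixels from a single brush stroke,
// rather than one row per pixel tied together by a stroke ID.
interface Pixel {
  x: number;
  y: number;
  color: number; // palette index
}

interface StrokeDocument {
  id: string;      // document id, e.g. a GUID
  userId: string;  // who drew it (from the identity provider)
  pixels: Pixel[]; // the whole stroke as one batch
}

function makeStrokeDocument(
  id: string,
  userId: string,
  pixels: Pixel[],
): StrokeDocument {
  return { id, userId, pixels };
}

const stroke = makeStrokeDocument("doc-1", "twitter:123", [
  { x: 10, y: 20, color: 3 },
  { x: 11, y: 20, color: 3 },
]);
```

One document per stroke means a single insert per gesture, and the batch stays naturally grouped without a separate join key.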
>> The other thing that I
needed to have was
real-time updates.
Interestingly enough, this is
actually not an original idea.
This is an idea that
we were able to
learn from the people who
built Reddit Place.
So, they have this awesome
blog post where they
went through
all the trial and error.
It saved me a lot
of initial issues.
One of the cool things
here is you can see
what an architecture looks like,
and you'll see it looks very
similar in a lot of ways.
They're using Cassandra
where we use Cosmos.
They're also using
Redis, and we're using
Cosmos for that too,
because Cosmos gives us
latency comparable to Redis.
>> Okay.
>> So, we didn't end
up having to go and
use some lower-latency database.
We were able to just keep
using Cosmos for that.
Then, they're using
WebSocket service.
We're using SignalR.
>> Okay.
>> They probably had a bunch
of servers there;
we just use Functions
because that
way it automatically scales
up to what we need it to be.
>> Perfect. So, what we're
able to do is simplify
this architecture because where
they had to use
multiple services,
we were able to get all
that power right from Cosmos DB.
>> Exactly.
>> Got it. Okay. Perfect,
perfect. Very cool.
>> So, yeah. So, from there,
we can talk about each of
these individual pieces.
>> Okay.
>> So, the quickest way
is to go in and
look to see how we insert
those records from
the client, right?
>> Perfect.
>> What would that
function look like?
Let's take a look at the API
and see how we're actually
inserting the pixels.
>> Okay.
>> So, from here,
you can see that we've
actually got UpdatePixel.
This is just a function
written inside of
the C# functions project type
that Azure Functions has now,
which generates
the function.json
for us. We don't have
to worry about it.
>> Cool.
>> Inside of this, it's
an HttpTrigger function —
you can see that we've
got the attribute here —
and we're not doing
anything else inside of
this, no other bindings.
One of the things I'm
doing here is I'm
actually using
the Cosmos client directly,
so I can change
configuration on it
that wouldn't otherwise be
exposed through the binding.
>> Okay.
>> So, one interesting thing here
is that I'm just
ensuring that we've actually
got the CORS rules applied.
In order to do the kinds
of interesting things I wanted to
do with CORS and passing
credentials back and forth,
you actually have to turn off
the CORS rules that App Service
has out of the box,
because they're simple,
and you have to take
over CORS fully.
>> Amazing.
>> So, I've just got
this nice helper method
here which goes through
there and fixes up
my CORS settings for the rest.
>> Okay.
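Taking over CORS manually, as described above, roughly means emitting the headers yourself on every response. Here is a minimal sketch, assuming a fixed allow-list (the origins and helper name are hypothetical). The key constraint is that credentialed requests can't use a wildcard origin, which is why the simple built-in rules have to be turned off:

```typescript
// Known-good origins for this app (illustrative values only).
const allowedOrigins = new Set([
  "https://example-canvas.azurewebsites.net",
  "http://localhost:3000",
]);

// Build the CORS headers for a response. With credentials in play
// you must echo back a specific allowed origin — "*" is rejected
// by browsers when Access-Control-Allow-Credentials is "true".
function corsHeaders(requestOrigin: string): Record<string, string> {
  if (!allowedOrigins.has(requestOrigin)) {
    return {}; // unknown origin: send no CORS headers at all
  }
  return {
    "Access-Control-Allow-Origin": requestOrigin, // echo, never "*"
    "Access-Control-Allow-Credentials": "true",
    "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type",
  };
}
```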
>> From there, I'm actually
going through there and
checking to see whether the
user is authenticated or not.
>> All right.
>> I'm using App Service's
authentication and authorization
feature that they've got,
which will basically prompt
people to log in with
Twitter if they click
on the login button.
You can visit the site
unauthenticated,
but if you want to insert pixels,
I prompt you: hey,
you should log in.
>> Sure.
>> Then, it redirects
you to the login
page and then comes back.
>> Got it. Okay.
>> With that, I'm just going
through here and
reading some headers,
and that's allowing me to
go ahead and get the ID —
basically, what
the identity provider is.
I handle
Azure AD logins a bit
differently, which is what I'm
using for the demo presenters;
they're logging in with
their Azure AD credentials.
Then, Twitter users
can log in, and
I see that they have
the identity provider Twitter here.
>> Got it.
>> From here, I'm
actually going through
and reading all of my pixels.
Like I said, I had
this interesting challenge
that the Reddit Place
folks didn't have
where I wanted to actually
handle canvas brush strokes
as well as individual
pixel updates.
>> Okay.
>> So, that means I had to handle
basically an array of pixels
being uploaded to me
from a
single batch stroke.
>> Got it.
>> So, I'm just going through
there and unpacking this.
You can see if there's a problem,
I just send back
a bad request message.
>> Okay.
>> Then, from there, I'm
going through
and doing user throttling.
So, I've got stuff where I
can actually block users
if they're trying to make
bad little pictures
in there.
I was able to add you
to a naughty list-
>> Nice.
-and you got removed.
If you are an admin —
which right now is just
set to: if you're using Azure AD,
you are an admin —
that way, folks from Microsoft
could just draw
whatever they wanted to draw.
>> Exactly.
>> So I check that, and then,
if they're not an admin,
I go through there
and check whether or not
they can actually do the insert.
So, I go ahead and grab
their last insert time,
and then I update the user if
they're still allowed to go.
>> Okay.
>> So, basically,
I update the user
before I actually
commit the pixel,
that way there's not any weird
trick where somehow they
are able to insert by breaking
my user permission logic.
>> Got it, so now we
know who they are,
and we know if they should
or should not be throttled.
This information is
all being pumped
right now into Cosmos DB.
>> Exactly.
>> Right.
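The update-user-before-commit ordering described above can be sketched like this. The names, the in-memory record, and the five-second limit are all assumptions for the sketch; the real app tracks users in Cosmos DB:

```typescript
interface UserRecord {
  lastInsertMs: number; // last time this user inserted pixels
  isAdmin: boolean;     // admins bypass the throttle
  blocked: boolean;     // the "naughty list"
}

const THROTTLE_MS = 5_000; // assumed limit: one batch per 5 seconds

// The user's last-insert time is updated *before* the pixels are
// committed, so breaking the insert path can't bypass rate limiting.
function tryInsert(
  user: UserRecord,
  nowMs: number,
  commit: () => void,
): boolean {
  if (user.blocked) return false;
  if (!user.isAdmin && nowMs - user.lastInsertMs < THROTTLE_MS) {
    return false; // still throttled
  }
  user.lastInsertMs = nowMs; // record the attempt first...
  commit();                  // ...then commit the pixel batch
  return true;
}
```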
>> Then, are you going to
show us really quickly how
the viewers, the people who
are watching you draw,
get the updates — where we're
reading from? What did you call
that real-time feed of data? The change feed?
>> Yeah, yes. So, after
I insert those things,
basically I've got
this pixel service,
which is acting as my singleton,
which is managing
the Cosmos connection for me.
>> Okay.
>> You can use singletons
inside of Azure Functions
to make sure you have
a single client, and
that prevents you
from opening lots of
different connections
and having socket exhaustion.
>> Sure.
>> Things along those lines.
It also increases
the speed — if you
create a new client
each time,
it's going to slow things down.
So, in this case, I've
got a warm client
that I can use for most requests.
>> Okay.
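The warm-client singleton pattern can be sketched like this. The fake client below stands in for the real Cosmos DocumentClient; a module-level instance is created once and reused across invocations instead of opening a new connection, and new sockets, per request:

```typescript
class FakeCosmosClient {
  static created = 0;
  constructor() {
    FakeCosmosClient.created++; // count how many clients we open
  }
}

// Module-level cache: survives across function invocations while
// the host process is warm.
let cached: FakeCosmosClient | undefined;

function getClient(): FakeCosmosClient {
  if (!cached) {
    cached = new FakeCosmosClient(); // created only on first use
  }
  return cached;
}
```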
>> Then, I insert
that batch, then it comes
back and we should
hopefully be fine.
We should hopefully never
hit this 500.
Then, the final thing
here is that, at the end,
once again, I have to fix up
the CORS rules.
As soon as you take
some manual control
over the CORS rules,
we know that can get
complicated, and with Functions,
because there's
no middleware or plugins
you can use to take
care of this for you,
you end up having to
write it all yourself.
>> Got it.
>> But, that's
a whole other episode.
So, in terms of how we
then read those updates,
we've actually got
two different projects.
So, if you remember,
from this screen here,
we've got the SignalR project,
which is sending the updates
to the client directly.
>> Right.
>> Then, you've actually got
the update processor function,
which is then storing
it into a blob.
>> Great. Are they both
reading the same change feed?
>> Exactly.
>> Okay.
>> Exact same change feed.
The interesting difference
between the two of them
is that the function app is
using the change feed trigger
that we have as part of Azure Functions.
>> Okay.
>> So, you don't have
to write a line of
code related to Cosmos.
You can get these updates using
Azure Functions
change feed trigger.
>> Okay, cool.
>> For the app using
SignalR, you have
to use the change feed library,
and that involves
a little bit more specific logic
about understanding how
Cosmos is going to work
and how its libraries
work together.
>> Right.
>> Functions is going
to obviously try to go
the extra mile to
abstract you away
from what service
you're talking to.
>> Okay.
>> So, we can start
with the SignalR
project and see
what's interesting.
So, you can see here we've
got the change feed reader.
This is the one that's
actually going through there
and pulling our change feed
by using our document client.
>> This is the SignalR app,
that we're talking about here.
>> Exactly.
>> Okay, I got it.
>> So, you can see here we've
got the document client,
and I'm actually passing
in my IHubContext.
So, my SignalR hub context is
getting passed in as well.
>> Okay.
>> As part of the constructor for
creating my change feed reader.
There's a library
that we have called
the change feed processor,
which helps you go through there
and scale up to many
machines to do this.
That's used if you want
to process a single message once.
But, SignalR is interesting.
In the case of SignalR,
we actually want
all clients to get all events
that possibly happened, right?
That's at least
the design we chose here.
We could have had different
rooms and stuff like that.
>> Sure.
>> In this case, we wanted
all events from Cosmos DB to
be going out to all
clients of SignalR.
>> Okay.
>> So, in this case,
we actually have
the opposite problem:
instead of
wanting an
event to be processed once,
we wanted every event to be
processed by all clients.
>> Got it.
>> So, that's why
we wrote this thing
which inverts the process:
we're effectively reading
the change feed from
all partitions,
and then sending it out to
all listeners for
this particular SignalR hub.
>> Perfect.
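The fan-out described here, every change feed event going to every listener, can be sketched with a toy in-process hub. This is not the SignalR API, just the shape of the behavior:

```typescript
type Listener = (change: string) => void;

class BroadcastHub {
  private listeners: Listener[] = [];

  // A client connecting to the hub registers a callback.
  connect(listener: Listener): void {
    this.listeners.push(listener);
  }

  // "Clients.All"-style send: one event, delivered to all listeners.
  broadcast(change: string): void {
    for (const listener of this.listeners) {
      listener(change);
    }
  }
}
```

This is the inverse of a work queue: rather than one consumer per message, every consumer sees every message.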
>> So, you can see here we're
initializing that, and then
we've got this fairly
long while loop that's running.
We're going to go through there
and we're going to create
a document change feed query.
This is basically
going to go and grab
all the documents that changed,
and the main things we're
passing on this options object
are a max item count,
and a continuation token,
which is keeping track
of our cursor —
you can imagine it as where
we are in the change feed.
>> Sure.
>> We want to keep track of that.
>> Sure.
>> As we're going along
to make sure that
we don't repeat ourselves.
>> Sure.
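The cursor tracking can be sketched with a toy in-memory feed. The real code passes a continuation token on the Cosmos change feed options; the numeric cursor and function names here are stand-ins for the sketch:

```typescript
interface FeedPage<T> {
  items: T[];
  continuation: number; // an opaque token in the real service
}

// Read up to maxItemCount changes starting from the cursor, and
// return the advanced cursor so repeated polls never replay the
// same changes.
function readChangeFeed<T>(
  log: T[],
  continuation: number,
  maxItemCount: number,
): FeedPage<T> {
  const items = log.slice(continuation, continuation + maxItemCount);
  return { items, continuation: continuation + items.length };
}
```

Because the feed is an ordered log, "where am I?" is just a position, and resuming is as cheap as passing the last token back in.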
>> So, we go down here, and then
the fun piece ends up being where
we then submit this to SignalR.
As we find
changes that we
do want to send across,
we can then go ahead and just do
Clients.All, sending the changes —
we're just serializing
the objects and passing
them along, and that's all.
>> Then, this is where everyone
listening gets the notification,
and then draws the pixels
on their screen as well.
>> Exactly.
>> Got it.
>> This is another interesting
case where we had to
play with different batch sizes.
When we were sending
results to Cosmos DB,
we were sending a batch for
basically each stroke, or
each single upsert.
But, in this case, you're
going to be having lots of
different updates
happening all at
once from lots of
different clients,
or you could be having
multiple people
trying to do strokes at once.
>> Sure.
>> So, we have lots
of different updates
trying to go to a couple
of different clients.
>> Okay.
>> SignalR is optimized for
a certain size of message.
>> Okay.
>> So, a lot of the work that
you can see here is
us optimizing
the size of each
of those payloads
that we then send across
the wire, to make sure that
we're keeping the pipe full
but optimal, and
trying to not have
too much latency
once
things start to
get really busy.
>> Got it.
>> So, you do have to do
some massaging of data.
You can't just send
large messages across and
expect that they're going
to happen at low latency.
>> Got it.
>> You want to make sure
that you still keep things
at the optimal size for SignalR.
>> Perfect.
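The payload-size massaging can be sketched as simple bounded chunking. The chunk size would be tuned against real traffic; nothing here is the app's actual constant:

```typescript
// Split pending updates into chunks of at most maxPerMessage
// items, so no single SignalR message grows past the size the
// transport handles well.
function chunkUpdates<T>(updates: T[], maxPerMessage: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < updates.length; i += maxPerMessage) {
    chunks.push(updates.slice(i, i + maxPerMessage));
  }
  return chunks;
}
```

Each chunk then goes out as its own message, which keeps the pipe full without letting any one payload balloon under load.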
>> But, yeah, that's pretty much
all that happens
is just this thing
runs on a web app
that we have running,
and it listens for events and
then sends them to clients,
and it just works.
>> It's very cool.
>> Then, the other one
was where we actually
process them to go
into the snapshot.
>> Yeah, so, the trick
here is that we
wanted to have
the snapshot
stored into blob storage,
and when we're running
PxDraw — the
production one
that's deployed into
multiple regions all
around the world —
we wanted that to
be served from a CDN.
That way, if we were getting
hit by 20,000 requests a second,
it didn't kill
our blob storage account.
That wouldn't be ideal, right?
>> Absolutely, yes, and
the CDN is perfect,
that's literally what it is for.
>> Exactly.
>> Right, so being able
to utilize that to cache
the snapshot so that
everyone can join in,
and it feels like it's
still near real-time.
>> Yeah.
>> So, we used Azure CDN, and
we were able to set its
refresh setting to 30 seconds.
So, every 30 seconds
it'll go through there
and refresh its cache of the blob.
That means that blob is
at most 30 seconds behind
the state of the world.
>> Got it.
>> So, if someone's
coming in there,
there might be
a little bit of a jump.
You can always reduce
that latency down to
like five seconds or
something like that.
It's really up to you.
>> Sure.
>> But, then the change
feed processor ends up being
relatively simple for what
you have to do, because
it's just a matter
of reading from
that change feed, and we get
a lot of that code for
free from Functions.
>> Right.
>> Then, we just have
to figure out how to
pack it back into the blob,
and then update the blob.
>> Perfect.
>> So, for that, we actually
have this update processor,
and you can see here
we've got this function.
This one is going to
go through there.
The interesting case here
is, when you're using
the Cosmos DB trigger,
what you want to
use is an IReadOnlyList
of Document.
If you go ahead and
just use a single Document doc,
you're going to get
a single document at a time.
Once again, batching ends
up helping you a lot.
>> Sure.
>> There's overhead for
each individual execution.
If you can process a batch,
you're going to get
a lot more throughput.
>> Perfect.
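What a change-feed-triggered batch handler does with its list of documents can be sketched like this, translated to TypeScript with illustrative names (the real function binds an IReadOnlyList of Document in C#):

```typescript
interface PixelUpdate {
  x: number;
  y: number;
  color: number;
}

interface ChangeDoc {
  pixels: PixelUpdate[]; // one stroke's batch of pixels
}

// Handle a whole batch of changed documents in one execution:
// skip empty batches, then flatten every stroke's pixel array
// into one list to apply to the snapshot. One invocation per
// batch amortizes the per-execution overhead.
function collectPixels(docs: ChangeDoc[]): PixelUpdate[] {
  if (docs.length === 0) {
    return []; // nothing to do for an empty batch
  }
  return docs.reduce<PixelUpdate[]>(
    (all, doc) => all.concat(doc.pixels),
    [],
  );
}
```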
>> So, from here we
go through there,
make sure that the docs
list isn't empty for
some particular reason,
and then we go ahead and
unload all those things.
Then we have
this pixel class that we wrote
that can go ahead and
take all those pixels,
unpack all
their different x-y coordinates
and what the color was,
and then repack them
into a binary format.
>> Okay.
>> So that binary format
is basically
a giant fixed-length
binary string,
where each pixel is
represented by four bits.
>> Okay.
>> That allows you to choose
from 15 different colors.
>> Got you.
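The four-bits-per-pixel snapshot can be sketched like this. The nibble order and helper names are assumptions for illustration; the point is that two palette indices pack into each byte, so a 1000x1000 canvas fits in 500,000 bytes:

```typescript
// Write one 4-bit palette index into the packed board.
function setPixel(board: Uint8Array, index: number, color: number): void {
  const byte = index >> 1; // two pixels per byte
  if (index % 2 === 0) {
    board[byte] = (board[byte] & 0x0f) | (color << 4); // high nibble
  } else {
    board[byte] = (board[byte] & 0xf0) | (color & 0x0f); // low nibble
  }
}

// Read one 4-bit palette index back out.
function getPixel(board: Uint8Array, index: number): number {
  const byte = index >> 1;
  return index % 2 === 0 ? board[byte] >> 4 : board[byte] & 0x0f;
}

// A 1000x1000 board: one million pixels in half a million bytes.
const board = new Uint8Array((1000 * 1000) / 2);
```

Four bits gives 16 possible values per pixel, which is how a fixed palette of colors (with a value left over for blank, say) fits so compactly.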
>> That's how we're able to have
all the different colors there.
>> Yes, that's much
like your own little
bitmap format that you created.
>> That's exactly what it is.
>> Got it.
>> That allows us to basically
send something which is
also very dense over the wire.
>> Sure, absolutely.
>> We didn't want to send
this as a million pixels.
That's a thousand
by a thousand pixels.
We didn't want to send a million —
I mean, it would be like
three million properties
instead of a [inaudible]
>> Exactly, no one
wants to see that.
>> So, this ends up being
a relatively small document,
which means that it loads fast
when you refresh the client.
We can go ahead and
just demo that here live.
When you refresh this, it
refreshes really quickly,
because it's actually loading
a very small document and
then it's just a matter of how
quickly can the
JavaScript render,
that binary object
into the canvas,
and that's mostly latency
that you saw there.
>> Got you. Especially
since it can cache.
>> Yeah, exactly.
>> This is incredible.
Being able to take
the architecture that
you saw earlier
and reduce it down to
the features that we
have inside of Azure
just makes your life
a lot easier —
and being able to take
several of them and
combine them all into Cosmos DB,
so that you have, again,
just one less thing
you have to deal with,
one less hop over the wire,
allowing you to get
the performance that you want.
This is really, really cool stuff.
>> For all these pieces, it
took us a week to build this.
We had a front-end engineer.
We had someone work on
just the SignalR piece,
because he really wanted
to play with this.
Then, I managed to do
all the Functions pieces,
and it all came together.
The client ended up
being one of the
hardest pieces, because
you have to get
the JavaScript canvas to
update properly, and
things along those lines.
>> We played interesting games
where we would
actually cache the events
that we got from the WebSockets.
So, when we got
a blob update coming in,
we actually would look
at the LSN on the blob,
and then only replay events
from later LSNs.
You never saw any double
replays on the client.
That took some
hard learning.
But, that was specific
to our client.
>> Exactly.
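The client-side LSN dedup Chris describes can be sketched like this (names are illustrative): events from the WebSocket are cached, and when a snapshot blob arrives, only events with an LSN greater than the blob's are replayed, so nothing is applied twice:

```typescript
interface DrawEvent {
  lsn: number;     // log sequence number from the change feed
  payload: string; // the drawing update itself
}

// Given cached WebSocket events and the LSN the snapshot blob was
// built at, return only the events the snapshot doesn't already
// contain, in order, ready to replay on top of it.
function eventsToReplay(cached: DrawEvent[], blobLsn: number): DrawEvent[] {
  return cached
    .filter((event) => event.lsn > blobLsn)
    .sort((a, b) => a.lsn - b.lsn); // replay in log order
}
```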
>> Getting the
actual notifications
ended up being the easiest part.
It was all the tweaking
in the client code.
>> Right, the parts that
Azure provided were a snap.
That was the easy part. It was
the stuff you had to
do on top of that.
>> Yeah.
>> It's tricky.
>> Maybe we'll do a video
on the client code.
>> I believe it.
>> Well, thanks so much for
coming and showing this.
We're learning all about
creating near real-time
applications using Azure
here on Azure Friday.
