>> Hey friends. Azure Cosmos DB is
a globally distributed
database service
for building highly available,
scalable, and fast applications.
Developers who use it obviously
care a lot about maximizing
its performance.
Deborah Chen is here to share
some strategies to help you
get the best performance
from Azure Cosmos DB for reads
and writes today on Azure Friday.
>> [MUSIC].
>> Hey friends, I'm Scott Hanselman
and it's Azure Friday.
I'm here with my friend Deborah who's
going to teach me how to make
my Cosmos DB databases better.
>> Hey Scott, so happy to be here.
>> Absolutely. So people bump
into some different problems
when they're using Cosmos.
Maybe these are problems
that they can optimize
and fix and you've got
some demos for me.
>> Yeah, exactly. So
what I've got in
this demo are the most common
issues people run into.
We'll demo what it looks
like when you run into them,
and how you actually
fix and address them.
>> Cool.
>> All right.
>> So the first issue people
run into when they're doing
operations against Cosmos DB
especially queries,
is they run a query and it
takes longer than they expect.
So like what's going
on? How do we fix it?
So for example, here I have
some sample data that
shows a bunch of users
who have reviewed a bunch
of things on a retail website.
So a ton of JSON data here,
and let's say I want
to write a query that
says get me all the reviews
for a certain user.
I have a query but when I actually
run it using our SDK
and Visual Studio,
I see it's taking
longer than expected.
So we'll run this one.
It's supposed to
print out the result
and say how long it took, because
I set a stopwatch on it,
but you can see it's
taking quite a long time.
>> It hasn't finished yet?
>> Hasn't finished yet.
>> That's not good at all.
>> Yeah. So what's going on here?
So it finally just finished,
and it took around
14 seconds to return
around 800 results.
That's pretty slow for a query.
So the first thing you want to
check when you run into this case is,
did you set your SDK to
read and write from
the right Cosmos DB region?
So here I have Cosmos DB set
off in three different regions.
West US 2, Southeast Asia,
and East US 2.
>> Cool.
>> So let's make sure that I have
configured my app correctly.
I'll tell you right now this VM
is running in West US 2.
So I go back to Visual Studio,
I check out the initialization for
my Cosmos client. It's up here.
You set this application region,
and we see you've actually
set it to East US 2.
So now that explains the latency,
because we're basically
going back and
forth across the side of the country.
So if I comment this out.
>> That's something
you should let Cosmos
worry about not your app, right?
>> Yes. Exactly. As
long as you tell us
where your app is, we'll
take care of the rest.
>> Cool.
>> All right. So now let's rerun
this using the correct region.
Forgot to close the old one.
Now that has finished
pretty quickly.
>> It was like two seconds.
>> Two seconds. So
that's issue number one:
make sure you're reading and
writing to the correct region.
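As a sketch of what that configuration looks like outside the demo: the demo used the .NET SDK's CosmosClientOptions.ApplicationRegion setting, and the rough Python analogue assumed here is the azure-cosmos v4 package's preferred_locations keyword; the endpoint and key below are placeholders.

```python
# Hedged sketch, not the demo's exact .NET code: point the SDK at the
# region the app runs in (here, West US 2, matching the demo's VM).
from azure.cosmos import CosmosClient

client = CosmosClient(
    "https://<your-account>.documents.azure.com:443/",  # placeholder endpoint
    credential="<your-key>",                            # placeholder key
    # Tell the SDK which region the app runs in, so reads and writes go
    # to the nearest Cosmos DB region instead of across the country.
    preferred_locations=["West US 2"],
)
```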
The other important thing to
note here is pretty basic,
but if you're doing any type
of perf testing with Cosmos DB,
make sure you are doing it in
an Azure environment.
If you do it on your local dev box,
all your testing includes
the distance between
you and your closest Azure datacenter.
>> You're limited
only by the speed of
light. Sometimes not even that.
>> Yeah. All right. So
that's issue number one.
The next issue we run
into is when people see
their queries consume
more RUs than they expect,
and they run into
rate-limiting or throttling.
So you're familiar
with how the Cosmos DB
provision throughput model works?
>> I know that there's a unit,
like there's this RU and
it's an abstract thing.
I'm sure it's concrete for
most people but it's
abstract in my mind,
and there's a slider bar attached to
my credit card and
the more RUs I have,
the more I have.
>> Exactly. So it is
an abstract unit.
The way to think about
throughput is you tell us
how many operations
per second you need,
expressed in Request Units
(RUs) per second,
and that's what Cosmos DB gives you.
So each operation (read,
write, query) takes
a certain amount of RUs.
This query in particular
took 123 RUs.
That's actually pretty high.
So if we provisioned let's
say 1000 RUs per second
for this collection,
I can run about eight of
these queries a second.
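As a quick back-of-the-envelope check on that budget math, using only the numbers from the demo:

```python
# RU budget arithmetic: 1,000 RUs/second provisioned for the collection,
# and a query that was measured at 123 RUs per execution.
provisioned_rus_per_sec = 1_000
query_cost_rus = 123

# Whole queries you can run each second before hitting rate-limiting.
queries_per_sec = provisioned_rus_per_sec // query_cost_rus
print(queries_per_sec)  # 8
```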
>> Yeah. I want more,
way more than that.
If you're going over that limit,
what you've provisioned,
you're going to see
rate-limiting or throttling,
which you'll be able
to see in the portal.
So we obviously want to avoid that.
>> Throttling means it'll happen.
It's just not going
to happen right now.
>> Yeah.
It'll just be a bit slower which
is of course not what you want.
So now let's debug why this query
is consuming this many RUs.
The first thing you want
to check is what is
your partitioning strategy you've
chosen for your collection.
With Cosmos DB, you just
tell us how you want us to distribute
your data among all of
the machines that we have
that store the data.
What you want to do for
read-heavy scenarios is
to tell us exactly
which machine your data is on,
so we don't have to
check all of them.
The way you do that is
via your partition key.
So in this case
let us first check what we've
partitioned this collection on.
All right. So if I go to my scale
and settings for that collection,
I see I've actually
partitioned on ID.
But if you look at my query,
select star where username
equals something.
So because we are partitioning by ID,
Cosmos DB has no idea which machine
has the documents with this username,
so it has to check all of them.
If I told Cosmos DB I'm
partitioning by username,
it would just hit the one
machine with all that
user data and skip all the other ones.
>> Is a partition key a kind
of index or is it an index?
>> So a partition key
is not quite an index.
It's how Cosmos DB chooses
to distribute your data.
So there's a ton of
underlying machines.
You say if I partition by username,
all the data with the same username
or a partition key value
is on the same machine.
So that's how you think about that.
>> Can I partition based on multiple
things or do I only get one?
>> You get one partition key,
but the content of that key
is really up to you.
So we'll see more about
that in a later demo.
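To make that distribution idea concrete, here's a small illustrative simulation; the hash function and the four-partition layout are invented for the sketch, since Cosmos DB's real placement is internal:

```python
import hashlib

def partition_for(key_value, num_partitions=4):
    """Illustrative stand-in for hash partitioning: map a partition key
    value onto one of a fixed set of physical partitions."""
    digest = hashlib.md5(str(key_value).encode()).hexdigest()
    return int(digest, 16) % num_partitions

# 100 review documents spread across 10 usernames.
docs = [{"id": str(i), "username": f"user{i % 10}"} for i in range(100)]

# Container partitioned by id: user3's documents scatter across partitions,
# so "WHERE username = 'user3'" has to fan out and check every partition.
partitions_by_id = {partition_for(d["id"]) for d in docs
                    if d["username"] == "user3"}

# Container partitioned by username: all of user3's documents share one
# partition key value, so the same query goes straight to one machine.
partitions_by_username = {partition_for(d["username"]) for d in docs
                          if d["username"] == "user3"}

print(len(partitions_by_id), len(partitions_by_username))
```

Partitioning by the property you actually filter on collapses the query to a single partition, which is the "go straight to the person who has my phone" case.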
>> Great.
>> All right. So let's
check what happens if we
change partitioning
key strategy here.
Instead of partitioning by ID,
which gives us what we call
a cross-partition query.
By the way one analogy I like to
use a lot which I think helps,
it's like imagine you go to
a party or event and you
leave your phone there.
It's a difference between knowing,
this person has my phone,
I'll just go straight to that person,
get it from them, versus
if I don't know who it is,
I have to check every person,
do you have my phone?
That's the cross-partition case.
So that's what we want
to avoid for
read-heavy scenarios if
possible. All right.
So I have one container
partitioned by ID,
but I have a second container with
the exact same data,
only this time it's
partitioned by username,
which is actually what
we're querying on.
So now let's run the same query.
But we're actually now
running it against
both of these collections.
So you can see the first one,
that was the one we just ran
against V1 consumed 123 RUs.
But the second one against
this V2 partitioned differently,
consumed only 62 RUs.
>> Twice as efficient.
>> Yes. Exactly. So with
the same throughput,
we've now doubled the amount
of operations we can run,
just because each one is cheaper.
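The doubling works out directly from the two measured costs, assuming the same 1,000 RUs/second budget used earlier:

```python
# Same provisioned budget, two measured query costs from the demo.
provisioned = 1_000          # RUs per second
cross_partition_cost = 123   # RUs: container partitioned by id
single_partition_cost = 62   # RUs: container partitioned by username

print(provisioned // cross_partition_cost)   # 8 queries/sec
print(provisioned // single_partition_cost)  # 16 queries/sec
```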
>> That's really significant.
This "small change" is
a massive change, because
you've effectively doubled
the throughput if I
was doing a lot of this query.
>> Exactly. So definitely
partition key strategy
is always one of the first
things you want to check.
Make sure you've got
a good one there.
We have a lot more content
on choosing a good one,
but that's just
the quick one-minute version.
All right. So the next issue
we see users run into,
is that they use queries when
they really should use point reads.
So I'll explain what those mean.
Queries like these are good when
you know you're going to
get tons of results back,
or at least more than one.
In contrast, sometimes with a database
you know: I have one user profile,
I just want that one piece of data.
It's unique; one document back.
So you could do a query that says,
get me all the user profile
where username is X.
But you can also just
directly tell us,
here's the ID of the document.
Here's the partition key value.
We skip the query engine entirely,
and just give you the document back.
So let's see the impact of this.
It's my third demo here.
So here, first I ran the query,
which consumed around three RUs.
By the way, this is
returning just one document.
>> Nothing at all.
>> Yeah. But the thing is
if you look at this one,
the point read and let me
show you what it actually
looks like in the code here.
So here we do the query and
then we run the point read,
and the point read is just
doing a ReadItemAsync,
looking up by the
partition key and the ID.
So the point read took one RU
to get the same result,
versus three RUs for the query.
So this may not seem like a lot,
but if you're running at
scale, you'd have thousands
or millions of
users on your site at once.
With even 10,000 RUs,
you could run 10,000 of these
point reads per second, or only
around 3,300 of the queries.
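That comparison is just RU arithmetic again, using the measured costs from the demo:

```python
# RU budget comparison: 10,000 RUs/second provisioned, a point read
# measured at 1 RU, and the equivalent query measured at 3 RUs.
provisioned_rus_per_sec = 10_000
point_read_cost = 1
query_cost = 3

point_reads_per_sec = provisioned_rus_per_sec // point_read_cost
queries_per_sec = provisioned_rus_per_sec // query_cost

print(point_reads_per_sec)  # 10000
print(queries_per_sec)      # 3333
```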
>> One RU, like there's
not fractional RUs.
It's atomic, isn't it?
>> Yeah. So this one
happened to be one.
We do have fractional charges,
like 0.5 or 0.8,
but this one is very clean.
>> But still that's a nice
clean. It did the least.
>> Yes.
>> If you do nothing, you
can scale infinitely.
>> That's one way to put it.
>> Thanks.
>> All right. So that's
another nice optimization you
can do. So I have a couple more.
I will mostly cover
read-heavy scenarios,
where we're doing a lot of
reads and queries,
but of course with a database
you also write data
to it as well. Otherwise,
what is a database for?
So the next most common issue
we see is people say,
I've provisioned a lot of throughput
because I know I'm going
to write a ton of data.
So here I have a collection
where I've actually provisioned
50,000 RUs per second,
which is pretty hefty.
The reason I'm doing
that is because I
know I have a website, and I'm going
to see what everyone is doing
on each of my product pages,
and every time a user clicks
or looks at one,
it's going to generate
a large volume of data.
So the issue they run into is
they provision this throughput,
but then when they
actually do the ingest,
they don't get the whole amount.
So it's like, what's going on there?
So let me actually
show this in action.
So this is just some sample data
I'm generating that
mimics what you might be
looking at if you're really
tracking activity on a site.
I'm using our Bulk
Executor library to
simulate high-volume ingestion
into Cosmos DB.
So I'm running it,
and it's supposed to print out
exactly how many documents
it was able to write,
how long it took, and then
how many effective RUs
per second it was getting.
As you can see this one
is still going,
and you'd think for 50,000 RUs you
would get that pretty quickly.
So we were able to
get only around 6,900 or
7,000 RUs per second
effectively for the ingest,
even when I provisioned 50,000.
All right. So what's going on here?
>> I don't know. I'm
thinking about locking and
simultaneity, and my brain
is running through all the things
that could be wrong.
>> Yeah. So Cosmos DB
actually is lock-free.
But what you want to
investigate again is
your partitioning strategy.
What is it here?
So when you have
a write-heavy workload,
the most important thing
you care about is having
high cardinality, or lots of unique
values, for your partition key.
So in this one collection here,
I've partitioned by date.
So if you think about
it, today's date:
everyone on the site
has the same value for
the date, and they're all going to be
bottlenecked on the same partition.
So there's a ton of
machines waiting to
serve results, but they
all hit the same one.
The other ones can't really help you.
So instead what you want to
do is pick a partition key
with high cardinality,
lots of different values.
If you go back to my data,
we see I have this
product page's product ID,
as well as the timestamp.
So what I can do is I can actually
create a new property called
partitionKey and set it
so that I concatenate
the page ID with the timestamp.
>> So you're generating a
partition key of your own.
You create one synthetically?
>> Yes. A synthetic partition key,
that's exactly what we call it.
It's nice because there's a ton
of different unique values
for the product page,
and now timestamp everyday
will be different.
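A rough sketch of that synthetic key and its effect on cardinality; the event shape and the helper name are invented for the example, but the idea (page ID concatenated with timestamp) is straight from the demo:

```python
def synthetic_partition_key(event):
    """Hypothetical helper mirroring the demo: concatenate the product
    page ID with the event timestamp to get many unique key values."""
    return f"{event['product_id']}_{event['timestamp']}"

# Fake click events: 50 product pages, timestamps cycling over a minute,
# all on the same calendar date.
events = [
    {
        "product_id": f"p{i % 50}",
        "timestamp": f"2020-01-01T00:00:{i % 60:02d}",
        "date": "2020-01-01",
    }
    for i in range(1000)
]

# Partition key = date: every event today shares ONE value, so all
# writes pile onto a single partition while the rest sit idle.
date_cardinality = len({e["date"] for e in events})

# Synthetic key: product ID x timestamp yields many distinct values,
# spreading the writes across partitions.
synthetic_cardinality = len({synthetic_partition_key(e) for e in events})

print(date_cardinality, synthetic_cardinality)  # 1 300
```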
So now let's rerun the same demo,
but let's run it against
the collection partitioned
correctly and see
what throughput we get.
Well, it's actually getting more.
I think the rate-limiting
hasn't kicked in yet.
If I were to run more operations,
it would slowly [inaudible].
>> Did you only pay for 50,000?
>> Yeah.
>> So for a moment there
you were living the dream,
and it was almost 90,000
in like a second?
>> Yeah.
>> That's huge, 10X improvements.
>> So same data, all you did was
change the partitioning strategy.
>> Wow, you are making
massive changes with
a very small change here,
a little change there, and you
weren't getting 10 percent more,
you were getting double,
triple, 10X, orders of magnitude.
>> Exactly. Those are just some
of the top common issues we
see customers who
are new to Cosmos DB
run into, especially in
the area of performance.
Hopefully people now have
some more insight
into when they run into them,
and what they can do to debug
and fix those issues.
>> This is super useful.
Thank you so much.
>> Thanks so much.
>> All right. I'm learning all
about how to debug and optimize
some performance issues
with Azure Cosmos DB,
be sure to check out
all the other great
Azure Friday videos that
we have on Cosmos DB.
I'm having such a great time
today, on Azure Friday.
>> [MUSIC]
