Thank you very much. You have helped us become one of the fastest growing database offerings within Microsoft. I'm assuming you're here because you have issues, right? Without further ado, let's dive in.
First thing: whenever you get time, definitely take a look, because we're hiring people across the world. I think it is showing there; let me see if I can just project this guy. There, that should be okay. We're definitely looking for people across the world: software engineers, PMs, pretty much anything, so please do apply. This is a once in a lifetime opportunity at this point in time. We're waiting for all of you to join us on Cosmos.

With that, let me get to my presentation. Most of you are already using Cosmos DB, so most of this you already know. We're built for global distribution, and we take care of a lot of things for you: elastic storage and throughput, basically no thinking required. We provide all of this by leveraging the Azure infrastructure. The most important thing here is the resource governance model that's built in across all of these resources, so that you can get the throughput guarantees, the latency guarantees, and the availability guarantees.
With that, let's tackle the most important things. This is the kind of discussion I usually have. People tell me: it looks like Cosmos DB does everything; you are handling availability, you are handling the capacity for the whole storage and for throughput, so what do we do? Yes, there are a few things to be done. If you have questions, let me know; use the mics. As you saw, I was trying to get ready for this session and there is a lot of stuff. I may not be able to cover everything, but I'll try to put enough information in the slides so that you can follow up later on. If there are questions, feel free to reach out to one email alias you should remember: askcosmosdb at Microsoft. With that said, what do we have to monitor or manage in Cosmos DB itself? There are very few things.
We provide the backup, and I'll get into the details of the backup; we'll improve it a lot, let me put it that way. We provide availability. That's the most important piece: because the data is distributed, we leverage the replication engine, and we can provide you high availability within a region and across multiple regions. With the multi-master announcements, you should see an update on the writes too. Security: we also provide a lot of things out of the box. Yes, there are places for us to improve. Thank you very much for all of the input you provide; expect a lot of things by the year end or a little later.
What about capacity? If you're working as a database administrator, there are a few things we always think of. One is capacity, and we plan for it. Capacity is basically the storage; capacity is also the throughput in terms of the machine, the CPU, the amount of memory, all of those things. The other place we spend time on is throughput: ensuring that x amount of operations can be done without any challenges. Right. Then finally, as part of the throughput itself, we deal with latency. This is the same across pretty much all databases, and we also use indexes. We provide latency guarantees for the operations, for point reads and point writes. We provide good query latency too, and we will delve into a few things to get better latency, or lower usage of the throughput itself.
Let me try to make a joke here. I came to Cosmos DB after working on databases for 25 years, and this is a database where you can literally say: if you asked for x number of operations of a certain kind, you'll get that. That's it. If you try anything beyond that, you'll definitely get an error message. That is a guarantee from Cosmos DB. It means the onus is now more on developers and operations folks together to say what the real requirement is. That's the key part, and I would hope you take that away from this.
So, first part: capacity. I'm jumping directly in. What are the few issues I see daily, and I'm sure you folks see? Two. One is skew in the data. Why is the skew there? Because we have not chosen a good partition key, or maybe you expected it because that's how the data is shaped. How do you notice it? You use the metrics. Generally the distribution on the right side is very difficult to achieve unless the workload is very uniform. If you see the distribution on the left side, there is a problem: there is a particular key that's dominating. The place you need to come to is the metrics blade. Okay, sorry, apologies. You plan and you plan, and then you know how demos go. I think they're born to humiliate me. That's all.
So what happens, and I don't know how many folks can see in the back: the key is to go to the Storage tab. Okay. Notice a few things. One is that we already provide you the data plus index consumed by the top partition keys. If you see something like this, there is definitely an issue. You should not have something like this unless you expect this kind of distribution of data.
Okay, so what's happening? This can happen. Let's take an example. Suppose you have the data of a country inside Cosmos DB. What should the partition key be? Say we pick gender: what is the issue? There are at most a few genders, so only a few keys will exist. That's not a great partition key. What should it be? Go back to the developer and ask: how will you look at this data? How will you access this data? Pretty much all across the world there is some concept everyone shares, and you say: I think I should organize the data by this. Similarly you will have other kinds of options too as you go along.
There is one important fact here about the amount of data you see. Right now I have stored a fair amount of data here, 1.3 GB, just to show a few things. The data size within a partition key, and within a partition key range, is equal to the amount of data plus the index. The index would be around 20 percent; it varies a little bit depending on the structure that you have. This is what you should see. Right. Then we also provide another important piece of information there, which is data plus index by partition key range. Remember, that first one was by partition key; I'll come to the partition key range and explain it a little bit later on.
Here you see a particular partition key range that has more data. It looks like one of the partition keys has a lot more information compared to everybody else. Normally this would not even show up. This is the place you should spend time on first, to find out if there is an issue with the partition key. A small thing here: you can't automatically change the key. You have to redo the whole partition key. How can you make your life easier? You can create another collection and use an Azure Function, or the bulk executor tool announced recently, to move the data. That's the easier way to do it. With that, there are a few things I wanted to show you, if I can get to them.
One is that you can get a little bit more information from Cosmos DB. There are a lot of options within the SDK, and I'm sure many of you have come, like me, from earlier .NET, and this is a very different model. There is FeedOptions and there is RequestOptions, and so on and so forth. The key part to understand is that there are things that get enumerated, the results, and things that you operate against. Anything you operate against will generally take a RequestOptions, and anything that returns an enumeration will take a FeedOptions.
That's a simple way to understand it. Here we're basically asking the collection: give me information about yourself. There are a few things: we'll get the partition key ranges. Now, what are partition key ranges? We'll get to that in a little bit. You can get all of the partition key ranges, but you don't have to take a dependency on them; you shouldn't take a dependency on them. In case you want to understand how this whole thing is working, you can get that information. You want to get a few more things too. Once the information is returned, you should be able to print it out. Let us see if I can run this somewhere. Let me get it.
You can try this later on. All of these examples, I have cheated and taken from GitHub; there is nothing dramatically different from what we already publish. You should know these things. There is one small tip though. You see the document count. Many folks come to us and say: can I immediately get the number of documents within a collection or a container? Yes, there is a way. You can get the document count. This is not with any filter, you cannot filter it by any attribute; it is just the raw document count, or entity count, that's available. You should also get the partition key ranges, because you can get that information through the API. Then there is the collection metadata. Why is this interesting? If you are developing a tool which is doing deployment of stored procedures and those things, you can utilize this.
Look at the partition samples in case you're interested: search for Cosmos DB GitHub samples from the search engines, go for the .NET ones, and pretty much everything we have is stored there. There is nothing that's not online. We can get all of the information about how each partition key range is populated, and so on and so forth. Stop me if you think this is boring or something; nod your heads, although you may be sleeping, just in case it is interesting.
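For reference, here is a minimal sketch of those metadata reads with the .NET SDK of that era (DocumentClient); the database and collection ids are placeholders, not from the talk:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

static async Task DumpCollectionInfoAsync(DocumentClient client, string dbId, string collId)
{
    Uri collectionUri = UriFactory.CreateDocumentCollectionUri(dbId, collId);

    // Reading the collection with quota info populated surfaces the raw
    // document (entity) count and storage usage; no filtering is possible.
    ResourceResponse<DocumentCollection> coll = await client.ReadDocumentCollectionAsync(
        collectionUri, new RequestOptions { PopulateQuotaInfo = true });
    Console.WriteLine($"Documents: {coll.DocumentUsage}, size used (KB): {coll.CollectionSizeUsage}");

    // Enumerate the partition key ranges: useful to understand the layout,
    // but don't take a hard dependency on them.
    string continuation = null;
    do
    {
        FeedResponse<PartitionKeyRange> ranges = await client.ReadPartitionKeyRangeFeedAsync(
            collectionUri, new FeedOptions { RequestContinuation = continuation });
        foreach (PartitionKeyRange range in ranges)
            Console.WriteLine($"Partition key range: {range.Id}");
        continuation = ranges.ResponseContinuation;
    } while (!string.IsNullOrEmpty(continuation));
}
```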
What is this partitioning thing? This is what allows us to scale out, so we have to get it right. What happens is: you allocate the throughput, and we create the partition key ranges underneath to provide you that throughput. That is the key concept. As the data grows, the partition key ranges keep growing too; that's what the elasticity is all about. When data comes in with a particular partition key, we find out which partition key range it should go to, and we just push it in there. As more and more data for that particular partition key comes in, all of it goes into the same partition key range. A different partition key may sit on the same partition key range or on another one; we hash it and push it down. The question will be: what happens once a partition key range reaches its size limit? Interesting thing: we automatically split that particular key range and move the data around, and each key has to move. Availability and performance are maintained; all of the operations keep on happening. Now, the key part to remember here: there is nothing you need to do for storage management, as long as you get the partition key right. This is very, very dramatic, right. Of course, now customers are coming in telling us: you should actually have the ability to fence the amount of data we store; we don't want to store an infinite amount. We're adding those capabilities as we go along. Practically we don't have limits: as long as we have Azure behind us, we can store as much data as you want. That's why, when we say we can scale out, that's the key part.
There is one thing to leave you with on the partition key piece. You see many customers use a collection or a container without a partition key. Please don't do that, unless you are sure you will never need to grow its size later on. What's the way out? Create a partitioned collection and push all of the data inside. Okay, and what about deleting data? There are a few options. We have the bulk executor; use that to delete all of the data. Or, in case your application allows it, drop and recreate the collection; the nice thing is that this does not cost you anything, not a single RU. I prefer you use the bulk executor, but you can use these too.
Sometimes what happens is that the way the system stores the data, the particular partition key and the other attributes you chose, is not enough for you. You want to organize the data in a slightly different way, store it in a slightly different way, which is completely possible. What you can do is create another collection with another partition key, and have an Azure Function which is triggered whenever data is pushed into the first collection and writes it into the second. Completely possible. Okay.
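A minimal sketch of that pattern, assuming the Azure Functions Cosmos DB bindings; the database, collection, and connection-setting names here are hypothetical:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.WebJobs;

public static class Repartition
{
    // Fires on the change feed of the source collection and copies every
    // changed document into a second collection that was created with a
    // different partition key definition.
    [FunctionName("Repartition")]
    public static async Task Run(
        [CosmosDBTrigger("mydb", "source",
            ConnectionStringSetting = "CosmosConnection",
            LeaseCollectionName = "leases",
            CreateLeaseCollectionIfNotExists = true)] IReadOnlyList<Document> changes,
        [CosmosDB("mydb", "target-by-other-key",
            ConnectionStringSetting = "CosmosConnection")] IAsyncCollector<Document> target)
    {
        foreach (Document doc in changes)
            await target.AddAsync(doc);
    }
}
```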
What about large documents? This is something we keep getting from customers from time to time. We have a limit today of two megabytes, and that's for good reason. If you have anything more, you should definitely store it in a blob first of all. And if you do store a 2 megabyte document, remember that all of the attributes will be indexed by default; exclude them from indexing. Those are a few things you can do. There are customers that basically store chunks and stitch them together. That is completely possible, but I wouldn't advise it unless you really require that kind of distribution; that's the key part. When you have a large document, what's the impact for you? It impacts you when you read it and when you write it, right? You don't want to do that unless it is really required.
What about partition key sizes which are more than 10 GB? This is very, very real and possible. We work with Azure AD, and with applications similar to it, and it stores data within Cosmos DB for profiles and for the log-in events. Any time anybody logs in, anywhere across the world, all 10 million plus of them, the data is stored in Cosmos DB. That's the kind of scale we provide at this point in time, and there are similar workloads that push data to that scale. Now, what's that got to do with this discussion?
If we think of this: there's a tenant like me at home, with just two identities, three identities at most. And then you have tenants with 100,000 plus people, so a 10 GB partition key is not going to suffice. What do you do? You add a suffix to the key. That's how they scale up. So you can store more than 10 GB of data per logical key, as long as you know you can add a suffix to it. This works very well for the majority of the customers that have this issue.
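A minimal sketch of that suffixing idea; the key shape and the fan-out count are invented for illustration:

```csharp
static class SyntheticKey
{
    // Spreads one large tenant across several logical partition keys by
    // appending a bounded suffix; readers fan out across the same suffixes.
    public static string For(string tenantId, string documentId, int fanOut = 10)
    {
        // Small stable hash (string.GetHashCode is not stable across processes),
        // so the same document always lands on the same synthetic key.
        int h = 0;
        foreach (char c in documentId) h = (h * 31 + c) & 0x7fffffff;
        return $"{tenantId}-{h % fanOut}";
    }
}
```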
Then there is, what one would say, isolation for very large customers. In terms of latency, if you think one tenant really cannot coexist with the other folks, it is okay to have a dedicated collection for them. Why do you do that? Suppose they're very heavy in terms of throughput; that is a good reason for them to move out. Now, let's go to the performance part. Keep me honest on the timing.
You can see I'm not an artist, and I definitely did not take advantage of the amazing folks at Microsoft who could help me draw this a better way. The key part of Cosmos DB: all these folks here are the clients which are making requests, and whenever you make a request to Cosmos DB, any kind of request, this is where the database discipline comes in. We always tell you how much throughput you have used for your operation, which is the request charge. If it is a query, we also give you: how was this query executed? Where did we spend time? Which partition? Were the indexes used or not, and so on and so forth. What is the alerting part of it? You can create alerts to monitor this. You can use Azure Monitor; we have started to expose a few things in Azure Monitor. Ideally you store all of this in App Insights, because you get it when you invoke the command; you always get this information unless you choose not to.
Okay, what about Log Analytics? How many people have used it? Everybody loves this, yeah? App Insights too? Yeah. We provide the ability to push all this data, in this format, to Log Analytics. Why do we do that? Because in Log Analytics, as I'll try to show later, you can actually write interesting queries to find out about your system. How many people are actually storing the request charge for chargeback? I think people are getting tired of me talking, I guess. I will walk you through this. It is extremely simple; hopefully after this, you will try it.
What are the issues? On to the throughput issues. If you remember, you provision throughput at the collection level, and the throughput gets equally distributed across all of the partition key ranges. What can happen? Basically, a set of partition keys can get hot, resulting in a particular partition key range, or a set of partition key ranges, becoming hot. That indicates someone like me was not really honest with you folks; he needs to understand what the plan was and why he ended up like this. What can happen? There are two major reasons. One is that you've provisioned throughput and you're executing an operation which is taking all of those RUs. Otherwise, a particular partition key range, or a partition key itself, is getting bombarded. This happens many times when people do load testing: there is not enough variation in the load test, so inevitably only a few keys are hit, either by the query or by the workload. Because of that, the partition key is hot and you start getting rate limited. The other thing, which is very valid: the volume of operations just increases. That's a valid way for it to happen.
How can you find it out? One is, of course, if you set up alerts, and hopefully you do set up alerts, you'll find out immediately. Second, you can get all the metrics in a programmatic way if you want; there is a way to do that too. There is Log Analytics, and I'll tell you about one caveat with Log Analytics later, plus you can push all of the data to App Insights. This is a way to do that. Now let's get to this side of the deck. Let's get rid of this and come here.
This is pretty much always this particular tab here. Right. So what happens here: you see the throughput across all of the databases, across all of the collections, and across all of the regions. Normally what I try to do is get a big picture view by clicking on 7 days. Here you can find out: okay, pretty much this is data from May 6 or 7. Then there are, it looks like, 2K requests coming in, most are successful, and you can click on these. Thanks, by the way. Any MongoDB API users here? How many? None? Actually we do a little better job in the MongoDB API: we tell you the successful and unsuccessful requests, and we tell you which operation it is, whether it is a read, a query, anything else. Hopefully soon we'll start exposing that here too.
There are a few things you should try; let's see if I can show some things to look at. One, let me pick this up; this is one thing which has changed recently, that's why it is behaving a little differently. You want to choose the database and the collection, and you want to focus your attention on a few things. Let me test my laser pointing skills. You have the partitions, and the throughput distributed equally across them. What you want to focus on: is there any rate limiting happening? After that, you want to look at the provisioned throughput: you have what is provisioned, and what I was actually trying to use. Then, of course, you get a little bit more information for a particular day and time: was there any spike in the throughput? One, two, three... I think I'm not really good at this; there is no trick to it. Okay.
So, the throughput: you should use this too. We have seven days of data here by default; if you also have the data in Log Analytics, you can do interesting things beyond that. The idea here is to find out any kind of trend. It looks like it is up and down in this case. You can get more details by clicking down to an hourly basis and going in here. If you see 216, 217, that is rolled-up data; coming from the 7 days or the 24 hours view, this is rolled up a little bit. I would suggest coming down to the finest granularity to find out the issue.
Why are we doing this? We're trying to find out, what does this say, the max consumed RU/s for a given partition key range. Remember the whole idea of Cosmos DB: partitions are created, the partition keys are hashed onto them, and when you get throttled, it is a partition key range which is getting hot at that point in time. What we're trying to find out is, for a particular key range, how much is provisioned and how much it is using. In this case, you see there is an issue: there is a provision of less than 400 RUs per range, and you're trying to use 4K. What's this? Let's try to see what the time is here. It is about 2:14 or something like that. 2:13. Let's see if it is almost there. Yeah, looks like 2:13.
Why would we do that? We take that time and we enter it. Just about that; he noticed me coming in and out, he just came in. Thank you very much, Frank. What you want to do here: 2:15, 2:11, and let's see if we can apply it. What we're really trying to see is whether we can land on almost the exact moment; this is where you're trying to make improvements. I definitely know that yesterday, at just about this point in time, I did do some inserts, and I think that will definitely show. 105; that means there is definitely something wrong. Let's do that. It is right. Okay, 101. Let me try one more time. This is about 2:16, 2:17 p.m. It was 9:00. Let's try this. 2:10. That's not it. Sorry? That's yesterday's. Right. Let me pick it up; that was just about 9:00, right? May 8th, 2, 3, yeah. Let me just put 8 and try to get about the right time. This is 8:17. Let's see, it is just around that time. What I'm trying to show is basically this, right: you can see that there are some partition key ranges which are getting hit at that point in time. That's what happens. If you see a spike, a one, two, three kind of thing, there is an issue there, and you really need to see why that's happening. That's the key part of this particular issue. I'll try one last time. Something happened here at 1:00 p.m., that should be around... oops. Okay, I'll give up on getting the exact timing for now.
What you will see is that, just like the storage skew, you will see a throughput spike. In this case, I'll try to show you why it happens; there are two, three causes. One is... let's go here. I have an insert loop running here. I'm trying to simulate this; this is what I did earlier, and the sample is available. We are just pushing data for one particular partition key, that's all. In this case, we'll continuously see this happening, and you can go back and verify it on the screen itself; it takes a minute or so for that data to appear back in the metrics. This is typical when you're doing bulk loading or similar processes: you're doing bulk operations and you're not provisioned enough for the spike that creates. What can you do? When you bulk load, shuffle the data and then push it. A lot of people use Spark with Cosmos DB: shuffle the data and push it in. A lot of people use this connector because it has been there for some time, and we released an update the day before yesterday. You shuffle the data and push it in; that way you're not hammering one particular partition.
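A minimal sketch of the shuffle idea with the plain SDK (the bulk executor or the Spark connector would be the faster way to do the actual load); `docs` stands for whatever you are importing:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;

static async Task ShuffledLoadAsync(DocumentClient client, Uri collectionUri, IEnumerable<object> docs)
{
    // Randomizing the insert order spreads consecutive writes across the
    // partition key ranges instead of hammering one range at a time.
    var rng = new Random();
    foreach (object doc in docs.OrderBy(_ => rng.Next()))
        await client.CreateDocumentAsync(collectionUri, doc);
}
```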
There is something else. In real life, you may end up with a particular hot partition key. What do you do? In that case you have to see whether it is just a blip or something seasonal. How do you fix it? If you think it is a spike for some limited time, that's perfectly okay: you can increase the throughput and come back down later. If it is seasonal, what you should do is take that hot data out, put it in a separate collection, and let it have its own throughput; otherwise you would be increasing the throughput of the whole collection too much. There are a lot of questions about the ability to scale up and scale down; I'll comment as we go along. That's the first part of it.
Then there is the key part, the way to remember it: in Cosmos DB, a point read of 1 KB of data will take one RU. Okay? Regardless of anything else. Writes are fully, automatically indexed; a 1 KB write takes 5 to 6 RUs, always. What I'm going to show you here is basically a pretty large document being pushed in, and I'll show you a bit more later on. There is a possibility of something like this, right: I'm pushing 1 MB, maybe 2 MB, of data inside. You will end up using a lot of RUs. You should notice these things. Of course, you can do a few more things; let's look at just that piece of code here.
What you do for reading is straightforward. Whenever you read a document by its id, or do any operation, the response always carries this information. You should always, always get that request charge and store it; it gives you the baseline for one read, one write, or anything like that. The same applies if you do a write, too. Let's see; this is the same thing here, you always do it for the write as well.
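A minimal sketch of capturing the charge; the database, collection, ids, and partition key are placeholders:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

static async Task BaselineChargesAsync(DocumentClient client)
{
    // Point read: the response carries the RU charge for that one operation.
    Uri docUri = UriFactory.CreateDocumentUri("mydb", "mycoll", "doc-1");
    ResourceResponse<Document> read = await client.ReadDocumentAsync(
        docUri, new RequestOptions { PartitionKey = new PartitionKey("tenant-42") });
    Console.WriteLine($"Point read: {read.RequestCharge} RUs");   // ~1 RU per KB

    // Same for a write: ~5-6 RUs per fully indexed KB.
    Uri collUri = UriFactory.CreateDocumentCollectionUri("mydb", "mycoll");
    ResourceResponse<Document> write = await client.CreateDocumentAsync(
        collUri, new { id = "doc-2", pk = "tenant-42", payload = "hello" });
    Console.WriteLine($"Write: {write.RequestCharge} RUs");
}
```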
You get the charge of that. Always. This goes back to the original discussion: if you know the system well, there is no more art; it is pretty much nearly science that we're doing here. It does require a little more rigor on your part, to push people to provide exact requirements. There is none of what we used to do with other kinds of systems: what kind of hardware should we buy, where is a benchmark? That benchmark is not my application. Okay, so what do I do? I look at some number, and that number is based on the experience we all have. If you're a developer on a SQL system, you basically estimate and allocate x amount of hardware for your operations. Here you don't have to do any of that: you tell us the number of operations and provision that. Looks like I have to rush through things. Hopefully that part is clear.
There is storage and there is throughput; if you get these two pieces right, pretty much everything is straightforward. What do we have on this side? The next is a query. A query also returns the request charge. How do you find out more? You always pass in the FeedOptions flag, so that the query metrics get populated. We have existing documentation for this: search for Cosmos DB query execution statistics, and you'll hit upon the first link in either search engine. With a query, what you need to remember is the amount of data that you read because of the query, versus the amount of data that you actually output. That's the key part.
Now, with that, let's look at Cosmos DB itself. Please tell me if you're already seeing this; I want to make sure we utilize your time well. Let's go to throughput. Let me close this. Okay, looks like I still need a few things. What I'm going to show you is executing a query, and if we have time, and if folks are interested, we can walk through all of the queries that we have; that would take not a few minutes, but some time. What I want you to notice here is basically the time elapsed. What is all this stuff?
This one will go and execute here. First thing: we have this broken down by partition, how much time is spent on each partition, and how many times we queued the job to the partition to get the data. That's basically given by the scheduling metrics. When did we start, when did we end, how many documents came back? One document. Why did we retry? A retry can happen because you may have been throttled, right. What do you need to focus on? The request charge. What else? The index lookup time and the index utilization. This is not completely accurate today; it provides a good approximation of index utilization, and it is going to improve in the future. A key part: the number of documents output versus the number of documents retrieved. If there is a big difference there, you will see your RU charge get hit. You may say, hey, this is all text; yes, there is a way to read this programmatically and store it too. You should do that as well.
Why are there so many of them? We're executing a bunch of these queries, and in executing them we're basically getting two pieces of information. The query by itself is nothing much, right? We're just selecting everything. What is the key part here? I'll come to this MaxItemCount in a minute; it has a big impact on what you're seeing right now. It is this one particular option that populates the metrics. There is no impact from providing this option; you should always do it, as there is no performance cost for generating this information. Okay. Once you have the information, it is straightforward: I'm dumping it, I'm not doing anything fancy. You get separate responses from each of the partitions, and I'm just adding them up and dumping them. All of this is on GitHub; if you can't find it, I'll forward it to you. It is straightforward.
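A minimal sketch of that collection of query metrics; the query text and the option values are illustrative:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

static async Task QueryWithMetricsAsync(DocumentClient client, Uri collectionUri)
{
    var query = client.CreateDocumentQuery<dynamic>(
            collectionUri,
            "SELECT * FROM c",
            new FeedOptions
            {
                EnableCrossPartitionQuery = true,
                PopulateQueryMetrics = true   // no performance cost to ask for this
            })
        .AsDocumentQuery();

    double totalCharge = 0;
    while (query.HasMoreResults)
    {
        var page = await query.ExecuteNextAsync<dynamic>();
        totalCharge += page.RequestCharge;

        // One QueryMetrics entry per partition key range the page touched.
        foreach (var m in page.QueryMetrics)
            Console.WriteLine($"Range {m.Key}: {m.Value}");
    }
    Console.WriteLine($"Total: {totalCharge} RUs");
}
```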
That's that part. If we change this a bit... I think I'll run out of time, so I'll skip this part; you should try it a little later. Key part: it depends on the query. If you folks can help me out here, which query would it be if I just want the top ten? What do you think it is? This one used a bit of scan, a bit of index. What about this query? A query with CONTAINS does a scan, right; we don't change that. So if you have an application which is built around CONTAINS (yes, we do support CONTAINS), it is definitely a search application, and you should use Azure Search; we have a direct integration with them. If you provide us the id and the partition key, you know exactly where to get the document, and you will see the difference in the query execution metrics for all of these.
The way it happens: you have the query, and the SDK can parallelize its execution across the various partitions and get back the data. There is a way to control that; that's what we will look at in a bit. Okay. There is one optimization we don't do for you today, and it's something you should definitely do yourselves: use subqueries. The idea here is that you push the filter condition down into the subquery itself. You will see a dramatic improvement in RU usage, because we're able to filter out a lot of things early. This is something you should definitely do.
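A hedged sketch of what that rewrite looks like; the `tags` array shape here is invented for illustration. The point is that the predicate moves inside the subquery, so the JOIN only ever expands matching items:

```csharp
// Before: the JOIN expands every tag of every document, then filters.
string before =
    "SELECT VALUE COUNT(1) FROM c JOIN t IN c.tags WHERE t.name = 'winter'";

// After: the predicate sits inside the subquery, so non-matching tags are
// filtered out before the join ever expands them.
string after =
    "SELECT VALUE COUNT(1) FROM c " +
    "JOIN (SELECT VALUE t FROM t IN c.tags WHERE t.name = 'winter')";
```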
The usual thing we find: you see there is a performance issue; either an operation or a query is taking a long time, or the query is costing you more. You ask why: is it using the index or not? You can look at the query execution metrics. Or it is a stored procedure using more RUs or seeing heavy latency; we'll tackle that in a bit. You should log this information, and we have talked about how to capture it for each operation. Right.
I think you were too kind; there is one thing I did not show you, and I think I have to show you this stuff: what operation were you running? I don't know whether that's completely visible, but if you enable diagnostics there are a few things that will happen. One: if you have an alert configured, you will get an alert here. How do you do that? You come to the metrics, basically the monitoring piece, and we have alerts here. You can set up the alerts. Okay. It will tell you when... we just had a couple running; one got hit a few minutes ago. You can pretty much create an alert on anything. The number of operations: you should definitely do that. The number of errors: you should definitely do that. All of these are here, actually. You should definitely alert on total requests, max RUs per second, and average requests per second. You want to find out about any changes. Okay. Definitely.
Then what should you do after this is done? You should enable the diagnostic logs. With diagnostic logs, we push out the data to Event Hubs or Log Analytics; in this case I have that set up, and you can do the same. You can push to a storage account and to Log Analytics, and we push all of the metadata requests as well as the data requests. I will talk about this in a bit. One caveat on the pushed data: the delay is 15 minutes to two hours today. That's why I keep saying you should use App Insights and log similar data there yourself.
So if there is a requirement from your customer, or from your side, to find out what was happening, you can always look at something like the Azure activity log and find out if there was a changing of the keys, a region added, a collection created, or a database account created; all of those things, basically, from here. In this case you don't see that, it is not switched on here, but you can get that information. That is one log. Okay. The important table here for us is AzureDiagnostics. What does it have? It has all of the interesting, interesting information. You get entries like: there was a change that happened at that point in time and it took so much time, and so on and so forth; that's not so interesting. What would be interesting for you and me is to start looking at how many different operations are happening on the back end. This is all documented, by the way. There are deletes, queries, pretty much anything. Now hopefully you'll get interested.
Now hopefully you'll get interested.
What you would like to do, more than that, you would
Like to find out who is running more than the
Millisecond operations, there is a bunch of these.
You notice that the last was 0.
We have that for the privacy reasons. What you really want to know,
Let us say, this is an interesting query and it will
Take time for me to render and we can do far better
Graphs than this, you should definitely do that.
All of these queries, if you need them, they're mostly on
The website, if you don't get them, send me an email and
Ct-i'll forward these. What you would like to do is
Are there operations that took more than x amount of
Areas, more than x amount of seconds, right.
Then you will start to see interesting pieces of information.
You see 50 items per page, there is a read and so on, so forth.
What if you want to query these, right? you can do that too.
You would say that i only want queries, please give me
Those. You can get the queries, what
Do you do? you get pretty much
Everything here. You get the amount of time it
Took, where did it come from, what was the length of the
Response itself and a whole set of things.
The question is going what is the query text and today we
Don't log it, that's why i'm saying use the app insights
To log all of the information. In the future, we're planning
To see how best to comply and have that information. Okay.
You should be -- you should at least with a template text
As well. Let's see. Not yet. Done.
This is the key place where you'll find things out. You go to the metrics, you find there is an issue. Then you come here and look at the operation: if it is a query, I'll find out from the query metrics which one is creating the issue; if it is a write, which one is creating the issue. Nothing should escape from this. Okay, with that, I'll switch back to the presentation. Hopefully it is still interesting. Yes? No? You're learning something, or... you have to stop me, and then I can do something more interesting!
What is MaxItemCount? There are a few things you can control. While writing, you can control the amount of RUs consumed by switching off the index; we'll cover that. By doing the right kind of query, you can make sure you are projecting only the things you require, and you can ensure that you're using a smaller amount of RUs. What about the client side? When you see latency on the client side, there are usually two reasons: either the operation itself is taking too long, or there is throttling. By default, we retry 9 times, so if throttling has started, we may retry up to 9 times and you will see that as latency.
What do you also see? If the MaxItemCount gets changed, you will see a dramatic difference. If you set it to one, you're only getting one item per round trip of a query result. That's not good; you should basically set a high number, or leave it to the defaults. You should set the degree of parallelism to the defaults too and let us decide the numbers. Don't optimize based on the partition keys or the ranges yourself; there is enough intelligence in the SDK to find the lay of the land, get that information, and do the right thing. Use the defaults and let the SDK decide how to parallelize and how to get back the data. What is MaxBufferedItemCount? That's basically memory on the client side utilized to store the prefetched items. And MaxItemCount, as I mentioned: if you set it to one and you have a bigger result set, you will make more than one round trip to get the data, and you will see latency.
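A minimal sketch of those knobs; a value of -1 means let the SDK decide, which is the advice here:

```csharp
using Microsoft.Azure.Documents.Client;

var feedOptions = new FeedOptions
{
    MaxItemCount = -1,             // page size: let the SDK pick, never 1
    MaxDegreeOfParallelism = -1,   // fan-out across partitions: SDK decides
    MaxBufferedItemCount = -1,     // client-side prefetch buffer: SDK decides
    EnableCrossPartitionQuery = true
};
```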
Okay, the retry behavior itself: you can override it. It happens by default and, as I mentioned, we ship a default retry policy; you can change it if you don't like the behavior. Why would you do that? Because you don't like the latency impact.
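A minimal sketch of overriding those retry defaults at client construction; the endpoint, key, and numbers are placeholders:

```csharp
using System;
using Microsoft.Azure.Documents.Client;

var policy = new ConnectionPolicy
{
    // The default is 9 retries on 429s; lower it if you would rather surface
    // throttling to the caller quickly than absorb it as latency.
    RetryOptions = new RetryOptions
    {
        MaxRetryAttemptsOnThrottledRequests = 3,
        MaxRetryWaitTimeInSeconds = 10
    }
};

var client = new DocumentClient(
    new Uri("https://myaccount.documents.azure.com"), "<key>", policy);
```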
Okay, now this is something I think I keep repeating. At the end of the day, you create the collection or the container, and the throughput is equally distributed across its partition key ranges. The key part to remember, if I can make that statement here, is that every operation costs a fixed amount. A point read never changes, regardless of the data around it. The query charge changes, because it depends on the amount of data that is a possible candidate; it can increase or decrease. Anything else doesn't change.
You have the individual operations, but if you want to do things in bulk, use the bulk executor. You should always, always log that charge information, for a few reasons. How do you do the capacity planning? You have the five to six RUs for a write with fully indexed data. Let's say we have provisioned 100,000 RU/s and a particular query takes, say, 50 RUs. How many can you run? You can run about 2,000 queries a second, right? And each partition key range can go up to 10,000 RU/s. If you want more, say 1 million or 2 million, contact us. There is no limit; just like we don't have a limit for storage, we don't have a limit for throughput either. It is all about how much you're willing to pay. There are customers running millions of RUs a second at this point in time, and we're very thankful to them. We'll optimize this behavior as we go along. Capacity planning is basically calculated: you have the charge information for each operation, and for queries you definitely add them up. You add that information up and say: this is what I need per second.
This is the holy grail: you should be able to say, this is what I need, the throughput I need, and I'll provision it and stand by it. As long as people have estimated the throughput correctly, the system should just work fine. The majority of the customers that use it find that they just keep on using it more and more. How do you track it, and how do you scale it up or down? Everybody knows this, right? No? Yes? No? Everybody knows it? Okay.
You can scale it; it is very straightforward, and it is instantaneous. There is no limit on the number of times you can increase or decrease, unlike the other guys on this side, and no downtime is required. There is one more piece of information: the metadata requests. We have a resource model underneath, and people used to query for all of the resources. You don't need to do that query; just use the UriFactory methods to get to the resource, and cache the result. That will be the fastest way. Otherwise, caching that resource information yourself is also recommended. Okay.
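Two minimal sketches of those last points: the programmatic scale-up and scale-down via the offer, and building resource links with UriFactory instead of querying for them; the ids and the target throughput are placeholders:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

static class Management
{
    // Throughput lives on the Offer attached to the collection; replacing it
    // takes effect immediately, with no limit on how often you do it.
    public static async Task ScaleAsync(DocumentClient client, DocumentCollection collection, int newRuPerSec)
    {
        Offer offer = client.CreateOfferQuery()
            .Where(o => o.ResourceLink == collection.SelfLink)
            .AsEnumerable()
            .Single();
        await client.ReplaceOfferAsync(new OfferV2(offer, newRuPerSec));
    }

    // The resource URIs are deterministic: build them once, cache them, and
    // skip the metadata query entirely.
    public static readonly Uri CollectionUri = UriFactory.CreateDocumentCollectionUri("mydb", "mycoll");
}
```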
What can you do on the client side? You should try to use TCP direct mode, and you should keep the client as a singleton. Look at the execution statistics and the total execution time, and find out whether the service is spending the time or the client is spending the time; you can get that information easily, right. You should scale out the client if you have a lot of data to push in or take out. At least in .NET, you can increase the connection limit right at the client level. There is the rare time when you may need to switch on client-side metrics; don't switch them on indefinitely, they will fill up the disk. The usual thing we find is customers using machines with a very low number of CPUs; that is not a good idea. If you have a busy system, go for dedicated cores. Okay, that's one thing that I would suggest. People who are leaving, please do give the feedback; I appreciate it. Thank you very much. I can improve next time!
What about indexes? We auto-index all of the data. That's a big relief. Before working on any kind of application, whether it is a NoSQL database or a relational database, you have to do index management; I want this next slide here. Yes, this one. For those other hosted databases you still need to do index management. For Cosmos DB, you do not have to do anything. By default, you don't have to do anything, until, of course, you start to store two-megabyte documents and you query only 2 or 4 attributes; at that point you switch off indexing for the rest. We have automatic indexing on by default for a simple reason, which I found out over the last two, three years: somebody sets the system up, but the person operating it later does not know about the indexing, and he gets completely different results. We index the data as it comes in; there is no operation needed on your behalf. And you have the indexing working even if the query operations are very heavy. Okay.
How can you disable the index? Straightforward. There is a sample on GitHub as well as in other places. To be complete: you exclude everything from the index, and just keep the partition key path indexed. I will not spend much time on this; I hope you have understood it from enough of our discussions.
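A minimal sketch of that policy at collection-creation time; the path names and throughput are placeholders:

```csharp
using System.Collections.ObjectModel;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

static async Task CreateMostlyUnindexedAsync(DocumentClient client)
{
    var collection = new DocumentCollection { Id = "bigdocs" };
    collection.PartitionKey.Paths.Add("/tenantId");

    // Exclude everything, then include only the paths you actually query on.
    collection.IndexingPolicy.ExcludedPaths.Add(new ExcludedPath { Path = "/*" });
    collection.IndexingPolicy.IncludedPaths.Add(new IncludedPath
    {
        Path = "/tenantId/?",
        Indexes = new Collection<Index> { Index.Range(DataType.String, -1) }
    });

    await client.CreateDocumentCollectionIfNotExistsAsync(
        UriFactory.CreateDatabaseUri("mydb"), collection,
        new RequestOptions { OfferThroughput = 1000 });
}
```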
The index is an inverted index: we treat all of the documents as trees, and index them in the form of a forest. What you need to remember is: do not mess around with the indexing configuration if you don't know what you're doing; just keep whatever is there by default, and you'll do the right thing. For the majority of use cases, use the range index; there is no reason for using the hash index, because the range index can cover both equality and range queries. This is a longer topic. And if you do have a set of keys you know you will need, precompute and store them, read them out, or look them up; don't try to do a runtime calculation of all of those things.
If you look at performance, what is the ordering? The point read is the fastest; then the query is a little slower and takes more RUs, and so on and so forth. What is better for writes? Bulk insert; that is always better. You do the deletes with the bulk executor too; it is the best, it is the fastest, if your application model allows it. On consistency, I didn't get time to go into this. A weaker level doesn't mean that data will take minutes or hours to travel, or something; data is pushed across near-instantaneously, limited only by the speed of light. That means, as long as the application is allowed to read eventually consistent data, you will see double the amount of reads you can push through for the same throughput.
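A minimal sketch of relaxing consistency on a single read; per request you can weaken, never strengthen, the account default:

```csharp
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// An eventually consistent read can be served by any replica, which is what
// roughly doubles the read throughput you can drive for the same RU budget.
var options = new RequestOptions
{
    PartitionKey = new PartitionKey("tenant-42"),
    ConsistencyLevel = ConsistencyLevel.Eventual
};
// var doc = await client.ReadDocumentAsync(docUri, options);
```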
Availability.
I have about 6 minutes; let's see how fast I can do this one. Within a region we provide replica sets: each partition is backed by three more replicas. You never see them, never manage them, you don't have to worry about them, which allows us to do this completely transparently, and allows us to do updates continuously; we're updating as we speak, with no impact on anybody. We give financial guarantees for all of these. We provide five nines for reads if you enable multiple regions, of course, and as you have seen, we also provide multi-master; we'll have the same for writes as we go along. Manual failover is something that you can do yourself: there is no loss of data or availability even when using it. This is something very unique; many of our customers use it.
If you're interested, we have also been working for about six months on strong consistency across multiple regions. If you remember, we have five consistency levels; if you want three or four regions, a hundred or so milliseconds apart, to be strongly consistent, send us an email. We're doing a preview at this point in time. What's the thing you need to remember? You do not have to worry about availability, because failover is both automatic and manual, and the SDK is intelligent enough to find out what the right region is, provided you have set the right priorities. If you're distributed across the globe, say you have the U.S. and suddenly you switch to Japan, that's not a great idea; you'll have the 220 milliseconds. But you can do this pretty much as often as you like: no data loss, no availability loss, everything keeps on running.
Sometimes you may say: I want a little bit more performance. Go ahead, use TCP direct. There are curious cases where people's reads are going to the write region: they have not set up the preferred locations, or have not enabled endpoint discovery, so the reads never fall back to the nearest region. These are things that you definitely need to check.
Today we keep two backups; these are snapshots taken every four hours, for the case where somebody has deleted data by mistake. For availability, always leverage the global distribution mechanism instead. I can really only say so much: a lot more is coming, so please stay tuned on this front.
Always, always expect errors! Right! You should capture the exception. All the reads can simply be retried; for the writes you have to take care, since you know your application better. There are things you can log when you get an error. There is an ActivityId for every operation you run against us; if you face an issue, always use the ActivityId to tell us, this is the issue, I am seeing more latency; just let us know. All of the status codes are documented. If you see a 429, it is taken care of at the SDK level by retries; a 500 is definitely an issue, and you should definitely reach out to us.
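A minimal sketch of that capture; the retry here is deliberately naive:

```csharp
using System;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

static async Task<Document> ReadWithLoggingAsync(DocumentClient client, Uri docUri, RequestOptions options)
{
    try
    {
        return await client.ReadDocumentAsync(docUri, options);
    }
    catch (DocumentClientException e)
    {
        // The ActivityId is what lets us trace your exact request.
        Console.WriteLine($"Status={e.StatusCode} ActivityId={e.ActivityId} Charge={e.RequestCharge}");

        if (e.StatusCode == (HttpStatusCode)429)   // throttled: wait, then retry
        {
            await Task.Delay(e.RetryAfter);
            return await client.ReadDocumentAsync(docUri, options);
        }
        throw;   // 500s and the rest: log it and reach out with the ActivityId
    }
}
```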
Don't worry about it. Security: I will not be able to cover everything, I had to rush through. We provide, what I would say... we're going to make this whole thing better, but where we are: we encrypt the data at rest and in motion, and there is no switch for you; it is always that way. We provide the firewall, and we have the VNet service endpoints. We provide key-based authentication, with keys you can rotate; you should rotate them, and expect more as we go along. I can see other questions on that; hold on for a few minutes and I'll come to those pieces. For auditing all the operations, you should use Log Analytics to push that information, or something else, at this point in time.
What if you want to lock down operations at the resource level? You should use the resource tokens for now; this whole feature set is going to improve as we go along. There are a lot of compliance standards that we comply with; I'm not taking time to list them. Go to the Azure compliance page and you will see our name everywhere.
What if you're doing a bulk load? Increase the throughput, and shuffle the data before you push it. A generic tool does not know it is talking to a distributed database, so it will not work quite well; use the native connectors that we have and they'll work out okay. We already have the bulk executor, which mitigates this; we have it integrated, and you can use it directly. A lot of customers use that. You want a copy of a database? Hopefully when we have the new backup story we'll be able to do a lot more of these things; today, just use the change feed with an Azure Function.
Now let's see if I can really finish this. There are a lot of things that you never, ever get time to cover here. ConnectionPolicy: tell us how you're connecting, and you should definitely add the preferred locations. You have the max connection limit; we talked about that earlier. Endpoint discovery is important: when the client is created, it doesn't know how many regions you are in. The SDK will find out, if you set this option, how many places you are located in, and will do automatic failover based on your priorities; that's the failover priority you set on the account.
FeedOptions: any time anything has to be enumerated, control it from there. RequestOptions: when you have an indexing policy, a collection, or you're trying to change the offer, everything goes through the RequestOptions. Change feed options: you can read from the beginning, from a given point, or from a continuation token, all of those kinds of things. Stored procedures: use them when you need transactions. We provide ACID semantics there; the batch will finish completely or roll back. Very important thing. Okay. Don't use them just for batching purposes, there is no point in it. One small thing there: for batches, inserts, and updates, the bulk executor does all of those things the right way.
I think I'm just about on time, right? Hopefully you'll forgive me for rushing through a few things. What do you need to do as a person managing Cosmos DB? We do a lot of things for you. What you need to make sure of: get the partition key right; get the throughput right; monitor the throughput and the latency that you're getting. For availability management, you should definitely always have more than one region, and ensure that you log information in Log Analytics or in your own App Insights, analyze it, and then move forward. With that, I'll just take questions if there are any at this point in time. Thank you very much for your patience. I may have rushed through here; I hope it was useful. Everything should be here, mostly; the alias is there, and we try to be as fast as we can to respond on it. Reach out to us. Thank you very much again for your support as we develop this platform. A lot more exciting things are coming through the end of the year. This is just the beginning.
