>> All right. Hi everyone, thanks for coming.
My name is Aleksey Savateyev,
I'm a senior program manager for Azure Cosmos DB.
And today, I'm going to tell you all about it.
I'm going to give you an overview of
various capabilities that the product supports.
Now what is Azure Cosmos DB?
We position it as
a globally distributed, multi-model database
as a service by Microsoft.
We support multiple APIs,
that is multiple ways to write and read your data.
We support multiple models,
and we operate on top of Azure infrastructure.
We are a ring zero Azure service.
That means that we are in
every single Azure region, 42 currently.
That's more than Amazon and Google combined.
Whenever a new region gets added,
Cosmos DB will be there by default.
So, underneath the covers,
on top of this Azure infrastructure,
what we've built is called
ARS, the Atom-Record-Sequence schema.
This schema essentially holds all the data.
And that data is represented in
multiple data models that we currently support,
which are key-value pair, column-family,
document and graph.
And these models are covered currently
by five APIs that you see here on this screen.
The Table API essentially covers the key-value pair data model.
It used to be called Azure Table Storage.
Now it's folded into the Azure Cosmos DB family.
The Cassandra API covers the
column-family data model, and the
SQL API and MongoDB API cover the document data model.
They store the data in
JSON documents, and Gremlin covers the graph data model.
Why this is important,
is because those APIs
are ultimately going to be interoperable.
You will be able to write the data with one
of them and read the data with any other.
Currently, the SQL API and Gremlin API are interoperable.
So, you can write the data with SQL,
read it with Gremlin in the form of a graph,
and visualize it for your users.
Now, what we provide
underneath using all this infrastructure,
all the resources that we've built
is turnkey global distribution.
You can come to the Azure portal,
deploy your data and, with a click of a button,
distribute it across the globe,
to all 42 regions if you want or just to one or two,
depending on how many regions you
need to cover for your users.
Why this is important is because
users can be anywhere, and in
order for them to get the lowest latency
and the highest throughput,
you need to deploy the data
near wherever your users are, right?
And Cosmos DB provides that capability.
Essentially, all of your data
can be made globally distributed or
it can be simply hosted in one particular region.
We provide elastic scale out of
throughput and storage independently.
And this is important.
Because sometimes you need
a lot of throughput, for example for
notification applications that notify
a lot of users at once
but don't need to have a lot of storage.
So, you can have that here.
Or you can have an application
that needs to store a lot of data, like
archival applications or
document management applications, but it barely
has any users, and therefore
you don't want to pay for a lot of throughput.
So, you can have that too here.
Some of the competition actually does not provide
this capability because they simply deploy
the data in virtual machines.
And therefore, you are limited to
the configuration of the actual virtual machine.
If you have some throughput, you
will have some storage too.
If you have some storage, you will have some throughput,
and you pay for all of that.
In Cosmos DB, those two things are
completely independent of each other.
We provide you guaranteed low latency of
less than 10 milliseconds for
a one-kilobyte document read
and less than 15 milliseconds
for a one-kilobyte document write.
Of course, if you increase the size of the documents,
the latencies will grow.
But this is the benchmark that we actually subscribe to,
and we give you the SLA on
these latencies out of any of the regions that we support.
We also have other SLAs.
For example, there's an SLA on consistency.
We provide you five well-defined consistency models.
That's three more than the typical
Eventual and Strong consistency.
In fact, what we found is that users
typically choose Session consistency
because it's the easiest one to use.
Session consistency is similar to strong but
within the confines of a single session.
If you have a data access layer
that connects to the database,
it will be a single Session.
If you have a device that connects
with Cosmos DB it will be a single Session.
So, it actually makes sense to provide
Session consistency in most cases and not Strong,
because Strong is super expensive.
It needs to replicate the data
before the data becomes available for reads.
Of the other two that we support, one is Bounded Staleness.
It's something in between Strong and Session.
Essentially, it gives you strong consistency but within
a certain period of time or
a certain number of updates of the data.
So, a certain interval of time defined by you or
a certain number of updates, or prefixes as
we call them, defined by you on the portal.
Also, we support the Consistent Prefix consistency model,
which is essentially similar to Eventual,
but at least it gives you the ability
to write and read the data in predefined order.
Your reads will never be
out of order compared to your writes.
With Eventual, anything goes and you
can read the data in any particular order
whenever it becomes available.
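
To make the five levels a bit more concrete, here is a minimal toy sketch in Python of when a read from a lagging replica would be acceptable under each level. It is only an illustration of the ideas described above, not how Cosmos DB implements them: the `Replica` class, the `read_allowed` function, and the default bounds (`max_versions`, `max_seconds`) are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    version: int        # highest write version applied on this replica
    lag_seconds: float  # how far behind the write region this replica is


def read_allowed(level, replica, latest_version, session_version=0,
                 max_versions=100, max_seconds=5.0):
    """Return True if reading from `replica` satisfies `level` in this toy model."""
    if level == "strong":
        # The replica must have applied every write.
        return replica.version == latest_version
    if level == "bounded_staleness":
        # The replica may lag, but only within the bounds chosen by the user.
        return (latest_version - replica.version <= max_versions
                and replica.lag_seconds <= max_seconds)
    if level == "session":
        # The session must at least see its own latest write.
        return replica.version >= session_version
    if level in ("consistent_prefix", "eventual"):
        # Any replica may answer. The two differ in ordering guarantees
        # across reads, which a single-read check like this cannot express.
        return True
    raise ValueError(f"unknown consistency level: {level}")


replica = Replica(version=95, lag_seconds=2.0)
for level in ("strong", "bounded_staleness", "session",
              "consistent_prefix", "eventual"):
    print(level, read_allowed(level, replica, latest_version=100,
                              session_version=90))
```

In this example the replica is five versions behind, so a Strong read is rejected, while Bounded Staleness and Session reads are accepted because the lag is within the bounds and the session's own write (version 90) is already visible.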
Now, in addition to the ones that I have mentioned,
we also provide SLAs
on high availability, which is
four nines of availability out
of a single region and five nines
if you actually increase the number of regions.
With the second region, we're able
to provide five nines of availability.
That's literally just a few minutes of downtime
within a year that we allow the service to be down.
If it goes above that,
then you can simply demand your money back.
And we actually do return the money
back in case any of these SLAs are broken.
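
The downtime budget behind "four nines" and "five nines" is simple arithmetic, sketched below in Python. The function name `allowed_downtime_minutes` is made up for this example, and the calculation assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(nines: int) -> float:
    """Yearly downtime budget for an availability of `nines` nines (4 means 99.99%)."""
    unavailability = 10 ** -nines  # e.g. 4 nines leaves 0.0001 of the year
    return MINUTES_PER_YEAR * unavailability


for nines in (4, 5):
    print(f"{nines} nines: {allowed_downtime_minutes(nines):.2f} minutes per year")
```

Four nines works out to roughly 52.6 minutes of allowed downtime per year, and five nines to roughly 5.3 minutes, which is the "few minutes" mentioned above.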
And there's also an SLA on throughput.
Throughput is something that you pay for.
Therefore, we do provide you an SLA
that, within the provisioned throughput that you have,
you will definitely have it every single
second when your service operates.
And if it goes below that
and you see that you are being throttled,
again, you can demand the money back on that.
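
One way to picture provisioned throughput and throttling is as a per-second budget: requests within the budget succeed, and requests beyond it are throttled until the next second starts. The sketch below is a hypothetical toy model in Python, assuming a simple fixed one-second window. The class `ProvisionedThroughput` and its method are invented for illustration and are not part of any Cosmos DB SDK.

```python
class ProvisionedThroughput:
    """Toy per-second budget: requests beyond the provisioned rate are throttled."""

    def __init__(self, units_per_second: int):
        self.units_per_second = units_per_second
        self._current_second = None
        self._used = 0

    def try_consume(self, now_second: int, cost: int = 1) -> bool:
        """Return True if the request fits in this second's budget, False if throttled."""
        if now_second != self._current_second:
            # A new second has started, so the full budget is available again.
            self._current_second = now_second
            self._used = 0
        if self._used + cost > self.units_per_second:
            return False  # throttled: the budget for this second is exhausted
        self._used += cost
        return True


budget = ProvisionedThroughput(units_per_second=3)
print([budget.try_consume(now_second=0) for _ in range(4)])  # fourth call is throttled
print(budget.try_consume(now_second=1))  # budget is available again in the next second
```

The throughput SLA described above is about the service honoring that budget: throttling should only happen once you exceed what you provisioned.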
Now, we handle any kind of data.
This is a NoSQL database, right?
It's not relational. Even though we have a SQL API,
we allow you to specify your documents or
the format of the documents when you write
them in whatever shape and form.
You can mix and match them in your collections.
You can infer the schema on the read if you like,
but we don't mandate any particular schema.
Therefore, it's super fast to ingest the data.
You can mix and match them
in a small number of collections,
and you can read them as you please.
And you can infer the schema on the read.
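
As a small illustration of mixing document shapes in one collection and inferring the schema on read, here is a sketch in plain Python. The sample documents, their field names, and the `infer_schema` helper are all made up for this example. It uses an in-memory list in place of a real collection.

```python
from collections import defaultdict

# Two differently shaped documents living side by side in one "collection".
collection = [
    {"id": "1", "type": "player", "name": "Ann", "level": 7},
    {"id": "2", "type": "event", "name": "raid", "participants": ["1"]},
]


def infer_schema(docs):
    """Collect, per field name, the value types seen across all documents."""
    schema = defaultdict(set)
    for doc in docs:
        for field, value in doc.items():
            schema[field].add(type(value).__name__)
    return {field: sorted(types) for field, types in schema.items()}


for field, types in infer_schema(collection).items():
    print(field, types)
```

Nothing had to be declared before writing either document. The reader decides afterwards which fields it cares about.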
Now, we also provide very comprehensive security.
We encrypt the data at rest and in flight.
In flight, it's encrypted with TLS 1.2.
At rest, it's encrypted with
the key that we generate upon creation of your account.
And we'll soon be able to provide you
the bring-your-own-key capabilities as well,
if you want to encrypt it with your own key.
We also provide a number of certifications:
PCI, HIPAA compliance, [inaudible] Type 1 and Type 2,
European Union Model Clauses.
It's pretty large and comprehensive set
of compliance certifications that we have.
Now, let me give you one example.
This is an example of the architecture
that Next Games, a
Finnish gaming company, uses for the game called
"The Walking Dead: No Man's Land",
and Cosmos DB here is used as
a single database for a bunch
of application servers that they run.
Web servers essentially, that
they run in different regions.
It automatically load balances itself.
You only have to deploy it once.
If you want to increase the number of regions,
you can do that, but you don't have to.
Essentially, your clients that
read the data from Cosmos DB will
determine what is the nearest location
for the regions where your data resides, right?
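
The idea of a client picking the nearest region that holds the data can be sketched as below. This is a hypothetical Python illustration, not the actual client behavior. The function `nearest_region`, the region names, and the latency numbers are assumptions made up for the example.

```python
def nearest_region(latency_ms, data_regions):
    """Pick the region holding the data with the lowest measured latency."""
    candidates = {r: latency_ms[r] for r in data_regions if r in latency_ms}
    if not candidates:
        raise ValueError("no latency measurement for any region holding the data")
    return min(candidates, key=candidates.get)


# Latencies as seen from one hypothetical client, in milliseconds.
latency_ms = {"West Europe": 18.0, "East US": 95.0, "Southeast Asia": 210.0}

# The data is only replicated to two of the three regions.
print(nearest_region(latency_ms, ["East US", "Southeast Asia"]))
```

With the data in East US and Southeast Asia only, this client would read from East US. Adding West Europe as a region would move its reads there.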
And Azure Traffic Manager here is
only shown because there needs to
be some sort of
traffic management happening
before the application servers.
Application servers do not have to be separated from
the database by some sort of
a load balancer or traffic manager,
Cosmos DB takes care of that.
Another pattern that Next Games uses is
the ability to store the data that is super important.
The user data, the main game
data set in Cosmos DB while other things like textures,
videos, images are stored in a very cheap blob storage.
And that can be served through Azure CDN.
Cosmos DB doesn't need its own CDN.
It's actually a CDN of its own for the data, right?
And then you can use the integration
with HDInsight, which is
Microsoft's implementation of Hadoop.
It's a Hortonworks distribution
that Microsoft adopted for Azure.
And you can also use Azure Functions to
create your own serverless applications
based on your database.
So, whenever something changes in the database,
it can trigger an Azure Function
in which you can describe your application's logic.
So, you don't have to deploy additional VMs.
We have the integration for Azure Functions available.
We also have integration with Spark available as well,
if you want to distribute
your application logic through
Spark worker nodes that can reside next to Cosmos DB.
So in this particular example,
they use Azure Functions to call
the Azure Notification Hub to
notify the users that something happened in the game.
One of their friends signed up or they got signed
off or something happened with the game itself.
For example, certain event
happened within the game and they get
notified even if they aren't watching the game screen.
So, Azure Notification Hub provides
you integration with iOS and Android
push mechanisms. That is all I have.
Any questions? No? All good?
Thanks for coming. Thank you very much.
