If you need a globally distributed, highly
responsive database in the cloud, this
video might be just right for you. This
is Adam, and today I'm gonna give you
an introduction to Azure Cosmos DB, so stay
tuned.
Azure Cosmos DB is great. It's a
fully managed, globally distributed,
multi-master replication database. What this
really means is that it's a database
designed for high scale, global
replication and low latency, but
recently it also got some
analytic workload capabilities using
Jupyter notebooks and Apache Spark. So
how does it store data? Simply said, it's
just a document database. A document is
in JSON format, which is an open standard
for transferring data, and an example of
that kind of data you have on the
right-hand side. It's made of key-value
pairs where, for instance, for the key you
have a first name and for the value you
have John, a name. And you can put in not
only simple values: you can do boolean
values, integers, decimals, and you can
also nest objects inside. So for instance
for address you can have street address
and city, and for phone numbers you can
have arrays if you need to. Do whatever
suits your needs. It's the most commonly
used document format right now
for programming all the APIs. So let's
go into an overview of Cosmos DB itself.
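As a rough illustration of the document shape described above, here is a small Python sketch that uses only the standard library. The slide itself is not part of this transcript, so every field name and value below (firstName, address, phoneNumbers and so on) is an assumption made for the example.

```python
import json

# Hypothetical document modeled on the spoken description; the exact
# fields shown on the slide are not available in the transcript.
person = {
    "firstName": "John",          # simple string value
    "isAlive": True,              # boolean value
    "age": 27,                    # integer value
    "height": 1.85,               # decimal value
    "address": {                  # nested object
        "streetAddress": "21 2nd Street",
        "city": "New York",
    },
    "phoneNumbers": [             # array of nested objects
        {"type": "home", "number": "212 555-1234"},
        {"type": "office", "number": "646 555-4567"},
    ],
}

# Serialize to JSON text and parse it back; the structure round-trips.
as_text = json.dumps(person, indent=2)
restored = json.loads(as_text)
print(as_text)
```

The point is only that keys map to strings, numbers, booleans, nested objects or arrays, with no table or column definitions involved.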
What are the key features that
would make you want to use
this database? The first one is global
distribution. This database is
distributed across all the major regions
in Azure across the globe, and it's done
transparently, so you don't have to worry
about this replication. You just press
a few buttons and it's done automatically.
Which brings me to another point: the
regional presence is very great. It's
54 regions, or more if you count Azure
Government and Azure China. Additionally,
its high availability for both reads and
writes is actually very, very good. Next
you have elastic scale. Elastic scale
basically means that it
can scale from thousands up to even
hundreds of millions of requests per second.
Of course it's gonna cost you a penny,
but it allows for that flexibility and
that scale. You also have that guarantee
of low latency: for 99% of your
requests the data will be returned in under
10 milliseconds. This is of course based
on what you are querying for, but the
latency is very low. I even experienced
that when developing my own applications,
and it was really fast.
Next you have consistency options. How
would you describe this the easiest way?
When replicating data across those
multiple regions, the data needs to get
there at some point in time, and you can
actually decide the way
this data will get there. You can either
go for some sort of strong consistency,
which means your data will be arriving
at all regions at the same time, at the
cost of performance and response time, or
you can do some more eventual type of
consistency, so it's going to get there
eventually but you will get the highest
performance and response time. And there
is no schema or index management. What this
means is that this database was never
designed with a schema in mind, so you
don't have your normal "define the table,
define the columns, the types of the columns".
There is no schema. You just operate on
the documents, and each document can have
different properties. There are only very
few properties that always need to be
there. Besides this, you are not strictly
defining any kind of schema. And
additionally there are no indexes to manage.
Microsoft created Cosmos DB so that it
automatically indexes all the data. You
can override this with something called
indexing policies, but out of the box you
don't even have to worry about the
indexing of your data. And one of the
most important features of Cosmos DB is
multiple APIs. When creating Cosmos DB
you are deciding what kind of API you
want to use when working with your data.
The default one, the core API, is the SQL
one. It's a
way of working with those documents
using something very similar to your
normal SQL, so
you can query your data just like you
would do with any normal database.
Additionally you have APIs for
Cassandra,
MongoDB, Gremlin if you're working with
graphs, but also, if you're migrating from
Table storage to something more
performant, you can use the Azure Table
storage API, which allows you to use the SDK
for Table storage. Cosmos DB structure is
fairly simple. What we're going to be
creating in a second is a database account.
It's a top-level resource in Azure, and
underneath it you can have one or more
databases. This is an equivalent of your
SQL database. Within that database you
have something called a container. You can
think of containers like tables, and each
table has rows of data called items. So
items are the most important way of
storing data. What we've seen previously,
that document there, that was just one
item. Additionally you can have stored
procedures, merge procedures,
user-defined functions and many more. Two
things that I want to highlight here:
first of all, containers are called
something different depending on
the API. If you're choosing Gremlin,
that's going to be a graph; if you're
choosing SQL, that's going to be a
collection; if you're choosing Table,
that's gonna be a table. And again,
depending on what API you chose, the
items will be either rows or
documents or nodes or edges. So that main
choice of the API is very critical for
designing your Cosmos DB. A critical thing
to understand for Cosmos DB is request
units. It's basically how you design
performance, because in Cosmos DB, contrary
to other services, you don't pick how
many cores or how much RAM you get. You
only pick one measure, which is your
throughput measure, and that's called
request units. The thing to understand about
request units is that depending on what
you're doing with your database you pay
more or fewer request units. If you're
reading a single item, that
means one request unit, but if
you're inserting data, upserting or deleting
data, that's more request units, depending
on how big the change is. And if you're
querying data that's much more. As you
can imagine, when querying data or doing
some aggregates you are going through
multiple documents in your database,
therefore you need to pay more request
units. Another of the important
things is partitions. Normally I
wouldn't talk about partitions, because
in databases they're a more advanced
topic, but here they're one of the most
critical things when designing a database
in Cosmos DB, because you cannot change
them after you've created your database. You
would have to recreate your containers
in the database and just reupload the
data. So there are two things that
you need to remember here. First of all,
there are two types of
partitions, logical and physical. Logical
ones are very similar to how you design
them in your database: you pick what
kind of partition this is, maybe we're
gonna split it by city or maybe by
airport. Then they will land
eventually in physical partitions, but
the physical partitions you no longer
decide. It's the Cosmos DB engine that
decides if they will land in the same
physical partition or a separate one. Why is
that important? Because the performance of
your query is based on how well you
designed your partitions. And lastly, the
SQL API that I already mentioned. I just
want to show you what it looks like.
It looks, as you see, very similar to
your normal SQL. You're still doing your
normal selects, you can do joins, order bys or
more complex solutions that I'm gonna show
you in a second as a demo. And lastly, the
last key feature that I wanted to
highlight for Cosmos DB is the change
feed. Change feed is an amazing feature
of Cosmos DB which allows you to
seamlessly connect to your computing
resources in Azure, like Azure Functions,
or maybe some streaming like Azure Stream
Analytics, HDInsight, maybe Databricks,
etc. It basically allows you to
react to changes in Cosmos DB as they come
in, do some computations, and go back and
save maybe some results to some
analytics, whatever is your need. Now, it
is amazing how easy this is, because it's
just a few presses of a button. So right now
we're gonna go into our portal and
prepare our demo. In the portal we're
gonna create a Cosmos DB resource
right now, so let's hit "Create a
resource", type Cosmos DB, pick
Cosmos DB and hit Create. We need to provide
a resource group, so I'm gonna choose the
one that I created previously. We need to
provide the account name. The account name is
very important here because that's gonna
be in the public URL of your database. So
let's type am cosmos demo. That's
available. Let's choose the API. I'm gonna
choose Core (SQL) because I want to
show you the queries. Right now there's
already an option in the portal for
Jupyter notebooks and Apache Spark; this
is the advanced analytics that I was
talking about. Next we pick the
region. This is the main region that your
Cosmos DB will be residing in, so we pick
North Europe. I'm gonna set geo-redundancy
as disabled and multi-region
writes as disabled, because they incur
additional costs and I don't need
them for the demo. So I'm gonna hit
Review and create, everything seems to be
fine, and I'm gonna hit Create. Because
this process takes about four to five
minutes I'm gonna skip ahead right now.
So the provisioning finished and we can go
to our resource. The basic tab that will
be presented is this Overview. The
Overview blade actually
gives you the most basic information about
the Cosmos DB that you created. So,
most importantly, the status of that
database is currently online, the
database currently has only one read
region, that's North Europe, and one write
region, that's also North Europe, and this
is the URI of your database. So if you're
gonna be connecting from any kind of API
or code to this database, you use this
URI. You additionally have a very cool
tab here called Usage. You can actually
track your hourly usage of Cosmos DB,
so you can easily see what the cost is
and easily estimate what the
monthly cost of your database will be.
But the most important blade is Data
Explorer. This is the blade where you're
working with your database entities. You
work with your database schema, well, it
doesn't have a schema, but the structure,
and also the data. So the first thing that
you need to do here when creating a
database is to create a container,
but when you try to create a container
it will already say that you don't have
any database, so you need to first create
a database. I'm gonna call it my
to-do. This is gonna be my database
where I'm gonna be storing to-do
application information. Second of all,
you need to provision throughput. As we
were saying, throughput is your
performance, your hourly performance,
measured in a request units per second
metric. The lowest that you
can use here is 400, and 400 already is $23 a
month. You can actually see that at the
top here, so that's 23 something, which
rounds up to 24. That's the smallest you
can get, and it will very quickly grow if
you're gonna start creating multi-region
databases, because you're gonna
pay for that throughput in every region
and also for the replication.
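The arithmetic behind that monthly figure can be sketched in a few lines of Python. The hourly rate used here (USD 0.008 per 100 RU/s) and the 730-hour month are assumptions chosen because they reproduce the "23 something" shown in the portal; they are not taken from the video, and real pricing varies by region, over time, and with multi-region write settings.

```python
# Assumed list price: USD 0.008 per hour for each 100 RU/s of
# provisioned throughput in one region. Check current pricing before
# relying on this number.
PRICE_PER_100_RU_HOUR = 0.008
HOURS_PER_MONTH = 730  # common billing approximation (assumption)


def monthly_cost(ru_per_second: int, regions: int = 1) -> float:
    """Estimate the monthly cost in USD for provisioned throughput.

    Simplification: every region is billed for the same provisioned
    throughput; storage and data transfer are ignored.
    """
    hourly = (ru_per_second / 100) * PRICE_PER_100_RU_HOUR * regions
    return round(hourly * HOURS_PER_MONTH, 2)


print(monthly_cost(400))             # single region, the minimum in the demo
print(monthly_cost(400, regions=3))  # same throughput replicated to 3 regions
```

Under these assumptions 400 RU/s in one region comes to about $23.36 a month, and the estimate scales linearly with the number of regions.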
Next we need to pick the container. This
is the equivalent of our tables. I'm gonna
call it to-do list, and I need to pick a
partition key. A partition key, as we said,
is the way for Cosmos DB to know how
to split your data logically, but also
how to split it physically for
performance. I think on my to-do list the
common key that I will have on every
single entity that I put into the
database will be category.
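To make the logical versus physical split more concrete, here is a toy Python sketch. It groups items by their partition key value (logical partitions) and then assigns each group to a physical partition with a simple hash. The hash function, the partition count and the sample items are all assumptions for illustration; Cosmos DB uses its own internal hashing and decides the number of physical partitions itself.

```python
import hashlib
from collections import defaultdict

PHYSICAL_PARTITIONS = 2  # hypothetical count; the engine decides this itself


def physical_partition(partition_key_value: str) -> int:
    """Map a partition key value to a physical partition.

    Illustrative only: SHA-256 modulo a fixed count is NOT the scheme
    Cosmos DB uses internally.
    """
    digest = hashlib.sha256(partition_key_value.encode("utf-8")).hexdigest()
    return int(digest, 16) % PHYSICAL_PARTITIONS


# Hypothetical to-do items; "category" is the partition key from the demo.
items = [
    {"id": "1", "category": "personal", "name": "groceries"},
    {"id": "2", "category": "work", "name": "report"},
    {"id": "3", "category": "personal", "name": "gym"},
]

# Logical partitions: all items that share one partition key value.
logical = defaultdict(list)
for item in items:
    logical[item["category"]].append(item["id"])

# Physical placement: each logical partition lands on exactly one
# physical partition (here chosen by the toy hash above).
placement = {key: physical_partition(key) for key in logical}

print(dict(logical))
print(placement)
```

A query that filters on category only needs to visit one physical partition, which is why the choice of key matters for performance.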
And I think that's it. I don't need to
specify anything else. If you have some
unique keys that you want to validate, so
this is, let's say, the smallest type
of schema management that you can do, you
can add that unique key here, but I don't
want to do that right now. I'm gonna hit
OK, and what will be done right now is a
database will be created in Cosmos DB, this
is that my database that was created, and
a collection, a single collection, to-do
items. So as you see we got a couple of
things here. First of all we got the
items. This is the list of the items that
are currently within this collection. You
also have settings, so you can do some
tuning of your data and how the
collection is being maintained. Here you
have stored procedures, user-defined
functions and triggers. We're not
focusing on those three right now
because these are a bit more advanced
topics, so let's go back to items. In the
Items tab you will quickly notice that
you have a SELECT * FROM c, which
basically means "give me all the items
from this collection". We don't have
anything right now, so maybe for the demo
let's create two entities. You hit the
New Item button, and
the only property that is required is id,
but in our case, since we also created a
partition on the category, the category
key will also be required. I already
prepared some sample data so let me
paste that in. The first one is id 1,
it's a personal to-do item, and I need to
pick up groceries. So I'm gonna hit Save.
I got one item. Notice that Cosmos DB
additionally adds some properties of its
own, like _etag, which is used to track
changes on this item, the timestamp that
is encoded, and a couple of other things
that are here. Let's create a
second item, so let's hit New Item again
and copy-paste the second item. This is
another to-do item, and right now we have
our items. This is as simple as it
gets. If you're managing them
from the portal, this is how you do it; if
you're managing them from code, we're
gonna see that example in just a second.
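The saved item and the system properties mentioned above might look roughly like the following Python sketch. Only id and category come from the demo; the other fields and all of the system property values are made up for illustration. The one firm detail is that _ts holds the last-modified time as Unix epoch seconds, which is why it looks "encoded" in the portal.

```python
from datetime import datetime, timezone

# What gets typed into the portal (fields other than "id" and
# "category" are hypothetical sample data).
new_item = {
    "id": "1",
    "category": "personal",
    "name": "groceries",
    "description": "Pick up groceries",
}

# What comes back after saving: system properties prefixed with an
# underscore are added. The values below are invented placeholders.
stored_item = {
    **new_item,
    "_etag": "\"00000000-0000-0000-0000-000000000000\"",
    "_ts": 1570000000,  # last-modified time as Unix epoch seconds
}

# Decode the epoch timestamp into a readable UTC datetime.
modified = datetime.fromtimestamp(stored_item["_ts"], tz=timezone.utc)
print(modified.isoformat())
```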
So how do you query this data? Since we
picked the Core (SQL) API we can now query
the data. To query the data you go here,
click on New SQL Query and you just
type in a query. If you run this default
one you're gonna return all the items, so
you see this is the collection of two
items. But what is great in the portal is
that you additionally get Query Stats.
Query Stats is a table telling you how
many request units you used for this
particular query. In this case, listing all
two entities that we have in our
collection cost us 2.31
request units, so a bit more than two,
because two was just reading two items
and the extra 0.3 request units was
just to crawl all the data, go
through it, combine it together and return
it. So you can easily check how much
your query costs. If you maybe order
this data by some property like the
timestamp, since you remember _ts
is the timestamp, and execute this
query, notice that the query cost us a
bit more, 2.94, because additional work
has to be done by Cosmos DB in order to
return this data and give back the result.
And as you see, we ordered descending by
the timestamp. So what we need to do
right now is go and do another demo with
.NET. I just want to quickly show you
how easy it is in .NET to also connect
to a database like Cosmos DB. So I'm going
to my empty Visual Studio Code where I'm
gonna be developing my application. I'm
gonna zoom in a little bit, so let's
start developing. Now since we have it
zoomed in, let's go and create a new
console application:
dotnet new console. This is gonna create an
empty console application, which I'm
gonna restore packages for, and it's a
simple hello world application. The next
thing that we need to do is add
a Cosmos DB package. To do so, type dotnet
add package Microsoft.Azure.Cosmos, and
once it's restored we can already start
using it.
What is great here is that this package
is right now in version 3.1.1, which
brought a lot of new features that are
available for Cosmos DB and a more
streamlined coding syntax. So what I'm
gonna do right now is I'm gonna copy the
code that I prepared previously, so I'm
gonna paste it here.
I'm gonna fix some usings with Ctrl and
dot, a using for Tasks and a using for
Microsoft.Azure.Cosmos, and the only thing
that is left is to copy this, paste it
here and type Wait. Of
course this is not the best way to do
asynchronous programming, but it's good
enough for our demo. So let's see what
this code will do. First of all it will
initialize the Cosmos DB client using
the URL and a Cosmos key.
Next it will create the database if it
doesn't exist. I'm gonna maybe create a new
one just to show if this works; I'm gonna
call it demo DB. Then it will create a
container called my container name, if such
doesn't exist, with a partition key called
partition key path and a 400 request
unit performance. Then it will create
some dynamic object with an id,
partition key path and some additional
details, and lastly it will call the
container and create that item. So what
we need to provide is the Cosmos URL.
First of all let's go back to the
Overview, let's copy the URI of the
database here, and we need to copy the
key, so we go to the Keys blade and
copy the primary key.
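The demo code itself is shown on screen in .NET and is not part of this transcript. As a hedged stand-in, the same four steps (client, database, container, item) can be sketched with the Python azure-cosmos package instead. The names DemoDB, MyContainerName and partitionKeyPath follow the spoken description, though their exact spelling is an assumption, and the item fields are hypothetical.

```python
import uuid


def build_item() -> dict:
    """Build a sample item with a freshly generated id (hypothetical fields)."""
    return {
        "id": str(uuid.uuid4()),               # new id on every call
        "partitionKeyPath": "demo-partition",  # value for the partition key
        "details": "created from code",
    }


def run_demo(cosmos_url: str, cosmos_key: str) -> dict:
    """Create the database, the container and one item.

    Requires `pip install azure-cosmos` and a reachable account.
    """
    # Imported here so build_item() works without the SDK installed.
    from azure.cosmos import CosmosClient, PartitionKey

    client = CosmosClient(cosmos_url, credential=cosmos_key)
    database = client.create_database_if_not_exists(id="DemoDB")
    container = database.create_container_if_not_exists(
        id="MyContainerName",
        partition_key=PartitionKey(path="/partitionKeyPath"),
        offer_throughput=400,  # 400 RU/s, the minimum mentioned in the video
    )
    return container.create_item(body=build_item())


# Usage (placeholders, not real credentials):
# run_demo("https://<account>.documents.azure.com:443/", "<primary key>")
```

Because the id is generated on every call, running this repeatedly would add a new item each time rather than overwrite the previous one.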
I'm gonna stop here for a second just to
tell you: make sure that your Cosmos DB
key is safe, because that key also allows
you to create additional databases and
change throughput, so you can pay a lot
of money if this gets into unauthorized
hands. Just make sure that it is very
safely stored. So I'm gonna hit F5 now.
This is a .NET Core
application. I'm gonna hit F5 again just
to debug, let it run, and it finished. If
it did finish correctly we can go back
to our Data Explorer to see what
happened. Hit refresh and see our new
database and new container provisioned,
and within that container you can see
there's the item that we created during the
demo. Of course if you had run this code
multiple times it would create multiple
items, because the id is generated dynamically.
So let's go into the last demo, let's go
into the querying demo. In my to-do database
I'm gonna create a new container called
families, and let's think about what
kind of partition key we would want to
design. Since my data will always have
a creation date as part of the
data, I'm gonna pick that as the partition
key. I'm gonna hit OK, and in my families
I'm gonna create two items with some
additional information. The first one is the
Wakefield family item, I'm gonna hit Save,
and the second item that we're gonna
create is the Andersen family, I'm gonna hit
Save, and right now we have two families.
As you see, those documents are a bit
more complex than usual. You
have parents that have some children
information, each child has
some information, the children have some
pets, there are addresses. This is a bit
more complex structure than normal, and
this is what is actually great about
Cosmos DB. So right now let's try some
queries. Let's open a new SQL query
window and first of all do a select with a
where clause, so select a specific family.
So this query actually works, right:
we selected the family where id was
Wakefield family. In Query Stats that was
actually 3 request units, so quite expensive
for just a simple select, and this is
because the id is not our partition key,
so our query is not that efficient. The
second one that you see: we can actually
maybe join some information about the
children. Let's run this query. This
cost us 3 request units, and the
result is the names of the children in
the Wakefield family. The next query
that we're gonna run is a query where we
actually create objects. This is
more powerful than the usual query
because you can actually create objects,
so you can create a new object with a name
and a city and join in the information from
your data. Let's execute it, and this
is how it looks: the family, Wakefield
family, is in the state of New York. Let's
go to Query Stats: again 3
request units. Let's go to the next query.
Besides objects you can also do
arrays, so you can list some
information like cities and states. Let's
execute this query. This returns cities
and the states that the cities are in, right:
Seattle in Washington, New York in New
York.
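The queries run in this demo are shown on screen and are not captured in the transcript. The Python sketch below mimics the same query shapes (filter, join over a nested array, object projection, and a count aggregate) on in-memory data. The sample documents are hypothetical, and the Cosmos DB SQL in the comments is an assumption about what such queries typically look like, not a copy of the ones used in the video.

```python
# Hypothetical sample data shaped like the two family documents.
families = [
    {
        "id": "WakefieldFamily",
        "children": [{"givenName": "Jesse"}, {"givenName": "Lisa"}],
        "address": {"state": "NY", "city": "New York"},
    },
    {
        "id": "AndersenFamily",
        "children": [{"givenName": "Henriette"}],
        "address": {"state": "WA", "city": "Seattle"},
    },
]

# Assumed query: SELECT * FROM c WHERE c.id = "WakefieldFamily"
wakefield = [f for f in families if f["id"] == "WakefieldFamily"]

# Assumed query: SELECT ch.givenName FROM c JOIN ch IN c.children
#                WHERE c.id = "WakefieldFamily"
# JOIN here flattens each document's own nested array.
child_names = [
    ch["givenName"]
    for f in families
    if f["id"] == "WakefieldFamily"
    for ch in f["children"]
]

# Assumed query: SELECT {"name": c.id, "city": c.address.city} AS family FROM c
projected = [
    {"family": {"name": f["id"], "city": f["address"]["city"]}}
    for f in families
]

# Assumed query: SELECT VALUE COUNT(1) FROM c JOIN ch IN c.children
child_count = sum(len(f["children"]) for f in families)

print(child_names, child_count)
```

Note that the join is always within a single document (a family and its own children), never across documents as in a relational join.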
So what else you can do is you can
do aggregates. You can paste this
query to count how many children there
are in all the families. Let's execute
this query: there are 3 in all the
families, and again 3 request units
because it had to go through all the
data that we have. SQL in Cosmos DB is
quite powerful, you can do a lot of stuff
in here. Although this database is not
really designed to be performant for
query aggregates, it recently got updated
so the performance of aggregates is much,
much better than previously. This is
currently being rolled out, so it might
not be there for your database yet, but
if you need it, just write to Microsoft
and they will enable it. If you're
watching this video maybe a month
after the release, it's probably already
there, so you don't have to do anything.
But the query performance for aggregates
got significantly improved in recent
months. So that's it for the SQL API.
Let's just lastly go through some of the
options in the portal. First of all you
have the Replicate data panel. This is a
great panel because you just select with
a click of a button that you maybe want
East US 2 and maybe you want some
Southeast Asia regions, and you hit Save,
and your data will be
automatically
replicated in those regions with no
downtime. You're actually gonna get that
list here, so you have
the write region, North Europe, but you
also have the read regions, East US and
Southeast Asia. While it's updating I'm
gonna go to the next tab, because the next
tab is the one that you will care about
the most.
So this is the consistency tab. I did
briefly talk about this when I was
talking about Cosmos DB in the
beginning, so let's go through the
consistency. By default you have enabled a
consistency called Session. Within
your session, even across regions, even in
North Europe and East US, whenever
you're reading data and writing data it
will always come at the same time, but
for other regions, in other sessions, data
will eventually come.
It might come at a different time, but what
will remain is the order. So if red
comes first, red will be first; next
blue, blue; purple, purple. A bit less
consistent, looser consistency level
is Consistent Prefix. Consistent Prefix
says that even within a session you don't
care about this. The only thing you care
about is the order, so if you wrote A, B, C
into the database you want A, B, C in that
order. So it's always red, always blue,
always purple,
but across regions it will eventually
come there. Last from the loose
perspective is Eventual. It's the
fastest consistency level, it is,
let's say, designed to be the quickest,
but it also loses some of the
consistency, right. First of all it
loses the most important one for many
applications, which is the order. So if
red came first it might come first, but
it might not. So you need to remember
that the data will eventually get there,
but it will get there in any order, in any
fashion that suits Azure
itself. And if you go to the other side
of this consistency slider, to
Bounded Staleness, you
will see the lag and latency. So
here you can configure two things: that
you want your data to be delivered into
other regions within 100
operations, so if you do
101 updates you already tell Azure
that you want your data to be
available in another region, or you
specify that the maximum delay that you
expect for your data to be in another
region is 5 seconds. And the next one,
Strong consistency: this is the slowest,
well, still very fast, but the slowest
consistency level, where you simply say
that in all regions, for both reads and
writes, you expect that your data will land
at the same time. So as you can imagine,
because this needs to be replicated in
real time, there's gonna be some delay.
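The ordering guarantee that separates Consistent Prefix from Eventual can be shown with a tiny Python check. This is a conceptual sketch only, not how Cosmos DB implements replication: it asks whether what a reader has seen is a gap-free leading slice of the order in which the writes happened, using the red, blue, purple example from the portal.

```python
# The five levels on the portal's slider, from strongest to loosest.
CONSISTENCY_LEVELS = [
    "Strong",
    "Bounded Staleness",
    "Session",
    "Consistent Prefix",
    "Eventual",
]


def is_consistent_prefix(seen: list, written: list) -> bool:
    """Return True if `seen` is a gap-free leading slice of `written`."""
    return seen == written[: len(seen)]


writes = ["red", "blue", "purple"]

# A replica that is behind but in order satisfies the prefix guarantee.
print(is_consistent_prefix(["red", "blue"], writes))

# Out-of-order or gapped reads can only happen under Eventual.
print(is_consistent_prefix(["blue", "red"], writes))
print(is_consistent_prefix(["red", "purple"], writes))
```

Every level from Consistent Prefix upward would pass this check at all times; Eventual only promises that the replica converges to the full list in the end.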
Going to the next one, you have the firewall.
The firewall is basically an integration
that allows you to secure your Cosmos DB.
If you have some important data I advise
you to go here. CORS, again, is an HTTP
header mechanism; if you're working with
a JavaScript single-page application you
can use this to connect directly from
JavaScript. There's even a JavaScript API
to work with Cosmos DB. There's the Keys
section; of course you can work with your
database keys here. What I like to do after
demos is just hit regenerate key so that
no one will be able to use that key.
Of course I cannot do it right now
because I'm rescaling the database, but I
will do it later on. You can just hit
this button and regenerate the key to
get a new one. One of the two cool
features that we're talking about right
now is Azure Search. Azure Search allows you
to synchronize your search service with
the contents of your Cosmos DB and
allows you to query data through search. So
if you're, let's say, keeping some
information about people that is
searchable, you can use Azure Search to
even extend the performance of
Cosmos DB. And lastly you have Azure
Function. This integration is one of
the coolest features; it's what I was
talking about with that change feed. You can
quickly just click a few buttons here, and
whenever there's gonna be a change in
your Cosmos DB an Azure Function will
be executed that will pick up the changes
from Cosmos DB and allow you to work
on them in real time. Some additional
features, even if you go here, they
direct you back to the portal and to
Data Explorer, so I'm not gonna talk
about those, and others are standard ones
like browsing of the containers and scaling.
One of the cool features about browsing
here is the estimated cost. Note
that you're gonna get that estimate here.
Right now we created a couple of
databases and containers, so my estimated
cost is
sixty-four thousandths
of a dollar per hour. Of course this
will grow significantly when my scaling
finishes, but always check this later on.
Remember you also have that in the
Overview tab over here if you need that.
And I guess that's it.
I hope you like Cosmos DB
and you will find use cases for this
database in your projects. All the
scripts and the data are available in
the section down below. If you liked the
video, hit thumbs up, leave a comment, and
if you want to see more videos, subscribe.
See you next time.
