So for the next 20 minutes I'm going to take you on a journey
through Azure Cosmos DB. I really want to cover two
items. The first is this concept of a
NoSQL database, so we can compare and contrast.
I'm sure a lot of people tend to be more familiar with relational
databases than NoSQL. Then I'll cover Azure Cosmos
DB. I'm a cloud developer advocate here
at Microsoft. In my 20 years of professional
development, I've never heard this phrase before:
"That's how we've always done it." I want to level set a little bit.
It would be an oversimplification if I said the only
reason we have relational databases is because of storage.
That was a huge driver three decades ago.
I'm showing my age a little. Storage was exponentially more
expensive. The database optimizations were all about
using the least space possible. That led to charts like this,
with relationships and storing data, referring to it by IDs and
everything that comes with it. It creates this interesting
paradigm. So you have the person on the
team who understands how to store data in a normalized
database, and you have the development teams that are usually
working with some sort of domain-driven model or object-oriented development.
What do we need in between? Anyone guess what the next
object I'll throw up is? ORM. Perfect.
We've got this ORM. It's not a bad thing.
I'm not here to tell you SQL or ORMs are bad.
A lot of companies will drive their database from the ORM out.
It drives the object side and the data side.
I imagine some of you have been on those projects too, where the
ORM adds a lot of complexity. You have to make one small
schema change and you're touching it in 15
places. It can create pain points.
Has anyone ever found themselves creating a table like this, with
column names like integer value and string value?
You're basically trying to store random metadata, or maybe you've
serialized objects and stored them in a huge string field, or
maybe you used the XML or JSON field types.
If you've been doing this in your SQL database, there might
be a better approach, a different approach for the data.
Now, it's a common misconception that NoSQL means no SQL is allowed.
It's really "not only SQL." You could have multiple
solutions depending on the type of data you're dealing with.
There are a ton of flavors of NoSQL, but the four most common are
key-value, column, document, and graph. Key-value stores are pretty straightforward.
A key-value store is a persistent dictionary, highly optimized for when you
know the key and just want to look up the value.
A great example would be a user logging into a website.
You've got the user information and want to pull their preferences out.
A key-value store's the perfect way. Something I run in production is
a link shortening tool. I have the short link that maps
to the long link. Look it up, redirect.
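The link-shortener lookup described above can be sketched as follows. This is a minimal illustration, assuming an in-memory Python dictionary stands in for the persistent key-value store; the short codes, URLs, and function names are made up for the example and are not from the talk.

```python
# Hypothetical data: this dict stands in for a persistent key-value store.
links = {
    "abc123": "https://example.com/a/very/long/article/url",
    "cosmos": "https://example.com/docs/cosmos-db/introduction",
}


def resolve(short_code):
    """Return the long URL for a short code, or None if the key is unknown."""
    return links.get(short_code)


def redirect(short_code):
    """Build a (status, location) pair the way a redirect handler might."""
    target = resolve(short_code)
    if target is None:
        return (404, None)
    return (302, target)
```

The whole operation is a single lookup by key, which is the access pattern a key-value store is optimized for.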
It happens quickly. The other type is a column store.
You can have data organized in rows but conceptually
organized in this database as columns.
This is highly optimized to query across specific columns in
the data set. Second, if I'm projecting data,
if I have large documents and I'm pulling back a few columns only,
I can retrieve just those columns with this type of approach.
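To make the row-versus-column idea concrete, here is a small sketch, assuming plain Python lists and dictionaries in place of a real column store. The records and numbers are invented for illustration, and the helper names are hypothetical.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "name": "Food A", "protein": 12.0, "fat": 9.5},
    {"id": 2, "name": "Food B", "protein": 21.0, "fat": 49.9},
    {"id": 3, "name": "Food C", "protein": 8.0, "fat": 4.8},
]


def to_columns(records):
    """Pivot a list of records into one list of values per column."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


def project(cols, wanted):
    """Read only the requested columns; the others are never touched."""
    return {name: cols[name] for name in wanted}


def average(values):
    """Aggregate over a single column."""
    return sum(values) / len(values)


columns = to_columns(rows)
names_and_protein = project(columns, ["name", "protein"])
```

Once the data is laid out by column, a projection or an aggregate such as `average(columns["fat"])` reads one contiguous list instead of walking every full record.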
The third is document. I'm going to show you complex
documents, but it's taking an object and storing it in a database.
It can be a complex object or have arrays and sets of
objects within it. We'll see examples.
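As a rough picture of what such a document might look like, the sketch below builds a nested object with an embedded object and an array of objects, then round-trips it through JSON. The field names and values are assumptions made up for the example, not the schema used in the talk.

```python
import json

# Hypothetical document: a nested object plus an array of objects.
food_item = {
    "id": "example-001",
    "description": "Egg, whole, scrambled",
    "foodGroup": {"code": "0100", "name": "Dairy and Egg Products"},
    "nutrients": [
        {"name": "Protein", "amount": 10.0, "unit": "g"},
        {"name": "Water", "amount": 76.0, "unit": "g"},
    ],
}

# The whole object is stored and retrieved as one unit.
serialized = json.dumps(food_item)
restored = json.loads(serialized)
```

Nothing is split across tables here: the food group and the nutrient list travel with the item itself.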
The concept of a graph database is concerned with relationships.
I have vertices, or nodes, that are airports.
I might have longitude, latitude, and name of airport.
Edges are connections between airports. They can have information
associated with them as well. There are a ton of advantages when
you move over to NoSQL. Probably the big one is every
document is independent. You're not locked into a schema.
We're not going through a huge migration if we change the schema.
It doesn't mean there's no indexing support.
All major document database providers will index over fields,
sometimes automatically; sometimes you provide hints and
inform the engine how you want to index. That exists.
If a document doesn't have a property, there's no index for that
property on that document. It's very straightforward to manipulate documents,
to bring them in and out. I call them JSON ready.
Most return data in a JSON format. You're not serializing to a middle
class, then middleware, then translating and sending it
over the wire, picking it up and translating it again. These are designed specifically
to handle replicated data sets that are terabytes to
petabytes in size. Very large, fast queries. Version proof.
You eliminate the need to have the extra step in the middle.
The driver handles it for you. Cosmos DB is a NoSQL
platform. It's fully managed.
I'll talk about what that means. It covers all four of the common
NoSQL types. There's a combination of what
I call proprietary APIs that we developed and existing APIs.
The SQL API allows me to store document data but use a SQL
syntax to query it, so I can use the same syntax I'm used to.
The existing APIs, like Mongo and Cassandra,
allow migrating existing databases onto the
platform. That's the way it works.
This provides turnkey global distribution. This is a database that I've set
up, into which I've imported the USDA nutrient database. I have nutrition information.
If we look at this and I scroll down, you're going to see I
actually loaded the data in on the East Coast and I've
replicated it to the West Coast. In another area, I'll come to this
tab inside the portal and you can see all the different regions
I can light up. Literally, let's say I wanted
to go to South Central: click that region.
It becomes available. Save. It starts automatically
replicating out. Easy as that. It's managed for me through
Cosmos DB. Let's go ahead and say discard edits.
The next thing to talk about is this elastic scale-out of storage and
throughput. We'll dive into it in a second.
You provision the throughput you need and pay for it up front.
Cosmos DB guarantees it will meet that demand.
Whether you have terabytes or gigabytes or megabytes, you can
move the model based on how much usage your
application gets. It transparently manages partitions.
You manage how data can be sharded and optimized as far as
being physically replicated. You define it up front.
In my airport example, maybe my airport is the partition.
Cosmos DB automatically manages replicating them for you.
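The idea of choosing a partition key up front can be illustrated with a generic hash-partitioning sketch. This is an assumption-laden toy, not Cosmos DB's internal algorithm: the hash function, the fixed partition count, and the flight records are all made up for the example.

```python
import hashlib


def partition_for(partition_key, partition_count):
    """Map a partition key value to a partition index with a stable hash."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def group_by_partition(documents, key_field, partition_count):
    """Place each document in the bucket chosen by its partition key."""
    buckets = {index: [] for index in range(partition_count)}
    for doc in documents:
        buckets[partition_for(doc[key_field], partition_count)].append(doc)
    return buckets


# Hypothetical documents, partitioned by the departure airport.
flights = [
    {"id": "f1", "airport": "SEA", "to": "ATL"},
    {"id": "f2", "airport": "SEA", "to": "JFK"},
    {"id": "f3", "airport": "ATL", "to": "SEA"},
]
buckets = group_by_partition(flights, "airport", 4)
```

Documents that share a partition key value always land together, which is why the choice of key shapes how the data is spread out.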
We have things like automatic document expiration.
Say I'm tracking someone's web session and I want that to
expire, and I don't want to have to write my own service.
I can put a time to live on the document and it automatically
expires out of the database. I talked about throughput.
I'm going to touch on it in a second. This approach allows Cosmos DB
to provide service level agreements:
low latency at the 99th percentile,
guaranteed consistency. This is a 20-minute talk, so I'll stay
at a very high level: you've got different levels to work with.
Partition tolerance. What's unique about Cosmos DB is
it lets you set the level of tradeoff you want.
A lot of models are either strong or eventual.
There are multiple levels. You can dive more into it
through the documentation. Just talking about the guaranteed
throughput. Has anyone heard of a request unit
and scratched your head? What is a request unit?
It's a normalized way of saying this is what it takes to perform
an operation. This is the memory, the CPU, and the
I/O it takes for an operation. You can imagine that a simple
get operation is going to take only a couple of request units,
maybe even one. A complex query can take multiple.
There is a calculator that allows you to estimate what your
request units will be. You can upload documents and
indicate reads and writes, and it gives you information back.
If you exceed your provisioned throughput, you get information back:
how long to wait to retry an operation, as well as what you
can reprovision to in order to meet demand.
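The "wait, then retry" behavior described above can be sketched like this. Cosmos DB signals throttling with an HTTP 429 response and an `x-ms-retry-after-ms` header; everything else here is an assumption for illustration. `ThrottledError`, `FakeContainer`, and `call_with_retry` are hypothetical names, and the fake container simply pretends to be throttled a fixed number of times.

```python
import time


class ThrottledError(Exception):
    """Hypothetical error carrying the server's suggested wait time."""

    def __init__(self, retry_after_ms):
        super().__init__(f"throttled, retry after {retry_after_ms} ms")
        self.retry_after_ms = retry_after_ms


def call_with_retry(operation, max_attempts=5, sleep=time.sleep):
    """Run operation, waiting the suggested interval after each throttle."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ThrottledError as error:
            if attempt == max_attempts:
                raise  # give up and surface the throttle to the caller
            sleep(error.retry_after_ms / 1000.0)


class FakeContainer:
    """Stand-in for a database client that throttles the first N calls."""

    def __init__(self, throttle_count):
        self.remaining = throttle_count
        self.calls = 0

    def read_item(self):
        self.calls += 1
        if self.remaining > 0:
            self.remaining -= 1
            raise ThrottledError(retry_after_ms=10)
        return {"id": "SEA"}
```

A caller would wrap an operation as `call_with_retry(container.read_item)`. Passing a custom `sleep` makes the wait observable without actually pausing, which is handy in tests.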
Having said that, I want to jump into the multiple interfaces we
talked about. These are the five currently
supported interfaces: SQL; the Table API, which
is the key-value store; MongoDB, which is document; Cassandra;
and Gremlin. I'll show you examples.
The way Cosmos pulls it off is it internally stores data in a
special format called ARS. That format is highly
optimized and handles all the partitioning and replication.
Then it uses projection to project concepts onto the API.
I have a container, which is a list of entities I'm storing.
It can hold stored procedures, triggers, and user-defined functions.
Then that gets projected based on which direction I'm going.
Let's try the other way. Here we go.
Backwards or forwards? Let's try this again.
Forward in time. Forward.
I don't know what's happening here.
You'll see the fastest animations known to mankind.
Now we're projecting. Great.
We made it. A MongoDB collection, a table store,
a table, or Gremlin, which is a graph. And it's the same thing
when you look at the individual items: you get a document, a
key-value entry, or you'll get a vertex or edge if dealing with a graph.
The first example I want to jump into live is the USDA database.
What I've done is taken 12 relational tables and collapsed
them into three collections. It could have been one collection,
but I wanted to keep the example simple.
You'll get a link afterwards to the example.
What it looks like inside the database is this one I'm on
right here, and you can see the collections I have listed here:
food group, nutrient definition, and food
items. If I drill into food items, for
example, this is going to give me a MongoDB API to parse
through data. Click on a document.
We'll take a look at the document. You can see it happens to be egg products.
You can see different amounts of food and whatever.
I want to show you this complex object. There's a nutrient doc property
that has a nutrients property with water and protein and fat,
et cetera. So that's something I can
navigate through on my document. What I can do then is create a
query in MongoDB. The way MongoDB styles queries
is through JSON objects. What I'll say is let's look for
nutrient doc, nutrients, protein, amount, and I want
to find food items high in protein by weight:
what has a really high density of protein.
Greater than 80. The amount is the number of
grams in 100 grams of the food item.
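The shape of that query can be sketched in Python. The dotted field path below is a guess at the document layout based on the spoken description, and the sample foods and amounts are invented. The small `matches` helper is a stand-in that evaluates only the `$gt` operator so the example runs without a database.

```python
def get_path(document, dotted_path):
    """Follow a dotted path through nested dicts; return None if missing."""
    current = document
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(document, query):
    """Evaluate a query that uses only the $gt operator on dotted paths."""
    for path, condition in query.items():
        value = get_path(document, path)
        if value is None or not value > condition["$gt"]:
            return False
    return True


# The query itself is just a JSON-style object (field path is an assumption).
high_protein = {"nutrientDoc.nutrients.protein.amount": {"$gt": 80}}

# Hypothetical documents; amounts are grams per 100 grams of food.
foods = [
    {"description": "Dried egg white (example)",
     "nutrientDoc": {"nutrients": {"protein": {"amount": 81.0}}}},
    {"description": "Whole egg (example)",
     "nutrientDoc": {"nutrients": {"protein": {"amount": 12.5}}}},
    {"description": "No protein recorded (example)",
     "nutrientDoc": {"nutrients": {}}},
]

results = [f["description"] for f in foods if matches(f, high_protein)]
```

With a real MongoDB driver such as pymongo, the same `high_protein` dictionary would be passed to `collection.find`, and the server would do the matching.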
If we do this and execute the query, it will parse the query, go
to the database, take advantage of the automatic index on that field,
slow down over conference Wi-Fi, bounce around a couple of
access points, come in through the router, and we get seven
results, which also happen to start with dairy and eggs because
eggs have a lot of protein. That's showing you through the
portal. I want to show an application
running off Cosmos DB. I'm using this to connect.
I want you to take from this slide that it's a MongoDB driver.
Mongo has no idea it's talking to Cosmos DB.
If we look at my class, this should be familiar.
A plain old C# object. This could be a JavaScript
object or a Java object. The driver knows what the ID is,
which properties to ignore, et cetera. That's all part of the base
driver. Then for my code, for example,
if I want to do a filter, I can use things like LINQ queries in
C#, lambda expressions. I can use a fluent filter
language. This gets projected to the
database as a query. What does it look like to actually run
something? I've got an example here. I'm going to pause and take
credit: I did all the UI design myself,
thank you very much. What I'll do is select a food
group. Dairy and egg.
Search over thousands of items to find what has "scrambled"
inside the text. And boom, we get eggs, so I'll
click egg. We get the nutrient information.
And that's not on any special Wi-Fi. This is over the same Wi-Fi
you're using. I happen to follow a 100%
plant-based diet. What I might want to do is look
at nut and seed products and find out which are high in
calcium. Calcium's important to get.
If I do this, calcium, nut and seed products, what it's done is sort
the top 100 foods by weight, so the food at the top has the most calcium
content and the food at the bottom has the least calcium content.
It works this fast. The next API I want to talk
about is the Graph API. Now, I had to include it in
the demo because I used Gremlin. Does anyone work with Gremlin at all,
or want to because it's cool to tell people you work with a
Gremlin API? We'll look at Cosmos DB.
This one's configured with a graph. So instead of seeing collections
here, I see graphs. I can go into this flights
graph, and what it will do is automatically recognize the
API as being a graph database and let me use the
Gremlin query language to pick what I want.
I tell it to give me a vertex with a label of SEA, for Seattle-Tacoma airport.
It will pull the information for me.
And then I've got properties over here on the right side.
You can see the label, airport. There's a little graph actually
visualizing nodes for me. So I can pull that up.
I'm not sure why I'm not scrolling
successfully. With route
optimization, I can get all the airports with outgoing
flights from SeaTac. So this is an example of code
that's going to plot a course between two airports.
It says get all edges coming out of this one, going
into this one, and find the path between the two end points.
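The path search described above can be approximated with a breadth-first search over an adjacency list. This is a sketch under stated assumptions: the route data is invented, `find_paths` is a hypothetical helper rather than the demo's Gremlin code, and it ranks paths by number of hops, not by flown distance as the demo does.

```python
from collections import deque

# Hypothetical route data: each key is an airport (vertex) and each value
# lists the airports reachable by one outgoing flight (edges).
routes = {
    "ATL": ["SEA", "DFW", "ORD"],
    "DFW": ["SEA", "ATL"],
    "ORD": ["SEA", "DFW"],
    "SEA": ["ATL"],
}


def find_paths(graph, origin, destination, max_stops=1):
    """Return simple paths from origin to destination, fewest hops first.

    max_stops limits the number of intermediate airports on a path.
    """
    paths = []
    queue = deque([[origin]])
    while queue:
        path = queue.popleft()
        last = path[-1]
        if last == destination:
            paths.append(path)
            continue
        # Extending this path would add another intermediate stop.
        if len(path) - 1 > max_stops:
            continue
        for neighbor in graph.get(last, []):
            if neighbor not in path:  # never revisit an airport
                queue.append(path + [neighbor])
    return paths
```

Calling `find_paths(routes, "ATL", "SEA")` yields the direct flight first, followed by the one-stop alternatives, because breadth-first search explores shorter paths before longer ones.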
That is this little demo app. I've got the Bing Maps control.
I came here from Atlanta. I'm going to plot Atlanta to
Seattle. Submit.
What it's doing is going to the graph, asking for edges
coming out of ATL and edges coming into
Seattle, and optimizing those. It will speed things up.
We get the different flight paths. The green one is the shortest distance, which is a
direct flight from Atlanta to Seattle, which is my personal preference.
Now, the last thing I want to show you, which is kind of cool
with the graph database, is we also have this API that lets us
write SQL-like queries. Back in the same graph
database I showed you, instead of this Gremlin syntax, I choose new SQL query.
You can see SQL syntax. Let's select top five and execute the query.
We'll get back the actual documents that back these graphs.
We can see there's longitude, latitude, label, et cetera.
Because this is the ID, I can say where c.id equals SEA.
And just using the SQL syntax I'm used to, I get back the one
document that happens to have the Seattle airport in it.
That's using both the graph syntax as well as the SQL syntax
over the same database because of the unique way it's stored
behind the scenes. Hopefully you get a different
way of looking at data and approaching databases.
Hopefully you see Azure Cosmos DB has your back if you want to
stand up a managed solution, especially if you've had to
stand up a SQL cluster yourself or set up geo-replication.
Seeing a check box in the portal is probably an exciting way to
do it. It's ready to go for multiple
languages. SDKs across the board.
This is the actual repo right here. I will post the slides later,
after we transition. Be sure to fill out your session evaluations.
Your feedback's important to us. I'm going to finish out right at
20 minutes. Thank you so much for your time.
