Hey friends, Azure Cosmos DB
keeps getting better with a
preview of notebook support.
We've got Apache Spark in Cosmos,
great new optimizations in
queries, and idiomatic SDKs now
on version 3. I'm here with
Kirill, going to learn all
about it on Azure Friday.
Hey friends, I'm Scott Hanselman.
It's another episode of Azure
Friday. I'm here with Kirill
Gavrylyuk, who is fast becoming
my favorite person because you
keep bringing me amazing updates
to Cosmos DB. Thank you.
Scott, it's such an honor to be
here, and thank you for
tolerating me so many times
today. We really have good news.
We released a number of updates
over the past couple of months.
We call them the summer updates
for Cosmos DB, starting with
built-in support for notebooks:
Jupyter notebooks and Spark. So
now you can go and create a
Cosmos DB account and you can
enable Spark notebooks. You can
enable them at creation, along
with Spark, or you can enable
them later, and you can use it
with any of the
APIs. And it really makes your
queries, and the way you work
with Cosmos, deeply personal,
right?
So here's just a quick preview,
and we have another episode
on this.
Cosmos DB comes with Data
Explorer, and now in Data
Explorer you have integrated
support for Jupyter notebooks.
What does it mean? It means
that not only can you do queries
like you did before, but now you
can create notebooks. In a
notebook you can select a
kernel, let's say Python,
and now you can type your
queries, add graphs and charts,
and create live reports.
For example, here I have one
of the containers that has data
about earthquakes, and I can
run through it quickly.
And here I'm getting all the
earthquakes within about 30
miles around Seattle.
This is thanks to the
geospatial queries that Cosmos
DB gives you, right? This is just
SQL, and now I can not only get
this data, but I can also
visualize it using one of the
popular Jupyter extensions.
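As a rough illustration of the kind of geospatial query described here, the sketch below builds a parameterized query around Cosmos DB's ST_DISTANCE function, which reports distance in meters. The property name c.location, the approximate Seattle coordinates, and the helper name earthquakes_near are assumptions made for this example; the actual query and schema used in the demo are not shown in the transcript.

```python
from typing import Any, Dict

METERS_PER_MILE = 1609.344


def earthquakes_near(lon: float, lat: float, radius_miles: float) -> Dict[str, Any]:
    """Build a parameterized query spec for items within a radius of a point."""
    return {
        # c.location is a hypothetical property holding a GeoJSON Point.
        "query": "SELECT * FROM c WHERE ST_DISTANCE(c.location, @center) < @meters",
        "parameters": [
            # GeoJSON lists longitude first, then latitude.
            {"name": "@center", "value": {"type": "Point", "coordinates": [lon, lat]}},
            # ST_DISTANCE works in meters, so convert the radius from miles.
            {"name": "@meters", "value": radius_miles * METERS_PER_MILE},
        ],
    }


# Approximate coordinates for downtown Seattle (an assumption for illustration).
spec = earthquakes_near(lon=-122.33, lat=47.61, radius_miles=30)
print(spec["query"])
```

In a notebook with the azure-cosmos package, a spec like this could be passed along the lines of container.query_items(query=spec["query"], parameters=spec["parameters"], enable_cross_partition_query=True), where container is assumed to be an existing container client.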
Right, and now we have these
results on a map.
That's such a natural thing.
Why didn't you do this first?
It's such an obvious, natural
thing.
I think we had some groundwork
to do to handle this, because
this is all running code in
containers next to the data.
But now that we have the
groundwork, we can do a lot of
things. Jupyter is one example.
We can now co-locate Spark next
to the data, and it's your
Spark, so you can select the
size of the worker and the size
of the driver. You can get
operational insights into how
your workers are doing. It
comes with notebooks and it runs
directly next to the data, so
you can avoid data movement.
Not making
copies, no silos.
Yes, and not only that, you can
also enable autoscale, for
example autoscale when idle,
and save a lot of money,
because Spark is aware of Cosmos
DB. When Cosmos DB throughput
goes down, Spark can go idle
and we can shrink it, so we can
save you a lot of money by
having this awareness. It's a
really great feature. We love
it. We have another episode on
it, so we'll cover it there.
The next one: for a
while, we've had this Data
Explorer capability inside the
Azure portal, where you work
with data in Cosmos DB, and it's
good for writing a couple of
queries. But we also released a
standalone, full-screen version
of it. It's been out there for
a while, but recently we added
Active Directory login support
to it. So now I can sign in
with my Azure credentials,
select my subscription right
here, select my account, and
start working with it. No more
dealing with connection
strings. So now it is very
aligned with the Azure portal.
It's just an extension to the
Azure portal that gives you
full-screen space for your
queries.
Very natural, and you've got a
great cosmos.azure.com URL. For
people who live in Cosmos, who
felt like they've been seeing
this much of the data, now they
can see this much of the data.
Exactly. It's very nice.
Now that's a good segue. Let's
use this tool to see some
interesting improvements with
queries. Cosmos DB is a great
operational database, but for a
while we had a weak spot where
our aggregate queries tended to
be expensive and slow, because
all the optimizations were
about ingesting data very fast
and doing point operations very
fast, in real time, with less
than 10 millisecond latencies.
We paid less attention to
aggregates and analytical
queries. Now we pay a lot more
attention to that.
And with that, we are adding
improvements that give us up to
100x performance gains with
aggregate queries.
Wow.
Let me show you. So here I have
two collections. They're
identical in terms of the data
that they have. This is the
collection that runs using the
old query engine, without
improvements.
Let's do a quick query. I'm
going to do an aggregate query:
SELECT COUNT.
And it takes some time to run.
Maybe it will take 20 seconds to
run, right? It's not a huge
collection. It's about 2 million
documents. So eventually it will
complete. It came back with 2
million documents, but it took
something like 20 to 30
seconds. Now let's take this
query and run it using the new
query engine that we have
right here.
Instantaneous.
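To make the comparison in this part of the demo concrete, here is a minimal sketch of an aggregate count query plus a small helper that times any query runner. The exact query text typed on screen is not captured in the transcript beyond "SELECT COUNT", so COUNT_QUERY is an assumption, and time_query and fake_run are hypothetical names used only for this example.

```python
import time
from typing import Any, Callable, Iterable, List, Tuple

# An aggregate query in the spirit of the "SELECT COUNT" typed in the demo.
COUNT_QUERY = "SELECT VALUE COUNT(1) FROM c"


def time_query(
    run_query: Callable[[str], Iterable[Any]], query: str
) -> Tuple[List[Any], float]:
    """Run `query` through `run_query` and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = list(run_query(query))  # drain the iterator so every page is fetched
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    # Stand-in for a real container: pretends the service returned one count.
    fake_run = lambda q: iter([2_000_000])
    results, seconds = time_query(fake_run, COUNT_QUERY)
    print(results, f"{seconds:.6f}s")
```

With the azure-cosmos package, run_query could be something like lambda q: container.query_items(query=q, enable_cross_partition_query=True), called once against each of the two collections to compare elapsed times.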
Are you sure you're not
cheating? It's the same data,
the same 2 million? There's not
a cache, right?
It's not a cache. The internal
detail is that before, we had
to read all the documents, even
for aggregates. When we went to
count them, we loaded them.
A table scan, effectively.
Yes. I mean, it's not that we
weren't utilizing the index,
but we loaded the data, right?
Indeed. Now we don't need to.
Our indexes got much better, so
we don't have to load the
documents just to do
aggregates. This is a
fundamental engine improvement,
and everybody wins. And it's
less cost, which is important
for aggregate queries, because
it takes less compute. Less
cost and much faster.
So this is something that I'm
going to get for free? Did
someone wake up on Wednesday
and say, "Oh, it's Azure
Friday, my everything got
faster. What happened? How did
this get faster?"
We're rolling it out. At the
beginning you have to ask us to
onboard your account. We just
need to turn on a switch behind
the scenes. And once we feel
confident that it works, with
no edge cases, we'll roll it
out for everyone.
Everything gets faster and
cheaper. That's going to happen.
That's what we strive for.
That's good stuff, sir. Thank
you. We're really excited about
this. That finishes the arc of
the queries. And then, last but
not least, we have a few
updates around our developer
story. We talked about the new
SDKs that we released: a lot
more idiomatic, modern, a
better programming model, more
intuitive, shorter, less code.
V3, as we call it. We've done
this work for .NET, Java and
JavaScript, and today we're
announcing general availability
of it.
So this is idiomatic, meaning
it feels natural to the
language. The .NET SDK does not
feel like C++ anymore. The Java
SDK does not feel like .NET
anymore. The JavaScript SDK is
just good.
Yeah, this is good.
So it speaks the language that
you speak. It's going to feel
very natural. If you're a
JavaScript programmer and you
like working with JavaScript,
you're going to love working
with JavaScript on Cosmos. It
just feels natural, you'll be
more productive, and there's a
lot less code to write.
So that's generally available?
People can go and get those
SDKs right now?
Yep, that's
right.
That's a huge summer update. So:
Jupyter notebooks,
cosmos.azure.com with Active
Directory login, 100x
performance improvements, and
GA SDKs. 100x, two orders of
magnitude. You all are doing
some amazing stuff over there.
I continue to appreciate Cosmos
DB and everything that you all
do.
Fantastic. I am learning all
about the great summer
update to Azure Cosmos DB
today on Azure Friday.
