Coming up we take a look at SQL Server 2017
which brings the mission-critical performance security
and availability of SQL Server
to Linux and containers.
We're going to show
you how the SQL Server that you know
now runs on Linux and the new platform
abstraction layer
and how well that performs.
We'll also show you Active
Directory authentication on Linux
and the high availability options as well as
failover between Windows and Linux platforms.
And stay tuned too, as we'll
show you support for graph analysis
to identify new relationships in your 
data and more.
Microsoft Mechanics
I'm joined by Travis Wright from the
SQL engineering team, welcome.
Yeah, thanks for having me back.
So back on the Mechanics last November
you launched SQL Server 2017 in preview.
Now that the product is complete, what's new?
The main focus for SQL Server of 2017
is to bring SQL Server
to the Linux platform and containers.
At the same time we continue our cloud first development model.
So all the new features we've been
adding to Azure SQL database recently,
those will now show up in SQL Server.
Things like SQL Graph.
And since the preview in November we've
been adding a bunch of enterprise capabilities
to SQL Server on Linux.
That includes high availability,
security features including Active Directory
authentication,
and we've achieved performance parity between SQL Server on Windows and Linux.
So, I bet if you're watching this you're probably 
wondering did you have to actually
rewrite SQL Server in order to bring
 it to the Linux platform?
Yeah fortunately no, 
we've been actually trying to work on
optimizing SQL Server and more
 directly talk to the hardware
for a while now anyway.
So if you think back to prior to 2005,
SQL Server like other apps leverage
Windows for hardware resource management.
Then in SQL Server 2005 because we
wanted to have
better control over the hardware resources
to improve performance and scalability.
we brought the hardware resource management into
SQL Server user mode process
so that we could better control that.
And that's what we call the SQL OS layer.
And that's responsible for 
processor scheduling,
memory management, Network and disk i/o.
And at that point SQL Server
really no longer had a dependency on
Windows for Hardware resource management.
A few years ago Microsoft research
identified about 50 application binary
interfaces or ABI's
that all of the Windows API surface area sits on top of.
And it just narrows down to those 50 ABI's.
So to bring SQL Server to linux,
we took a small number of Windows Binaries
that SQL Server actually interacts with.
And we then brought that together with 
the SQL OS layer
and packaged that together and brought it to Linux.
And that what we call the SQL platform 
abstraction layer.
And then as part of the SQL platform abstraction layer,
we intercept those ABI's that are in Windows
and we remap them onto the ABI's of the host operating system we are running on.
And that's what we call a host extension.
And today we have one for Linux and
 we have one for Windows.
And the result is that regardless of which
operating system SQL Server is actually running on,
it thinks that it's running on top of Windows.
Right, another benefit here is that you 
didn't have to rewrite SQL
and there's not really any emulation running .
Yeah exactly it's the same code that we've
 always had and same high quality.
And the other benefit is that as
 we introduce new features
and as we fix bugs, that'll show up on Windows
and Linux at the same time.
An example of that is SQL graph
 which we'll see later.
That was actually developed by developers running SQL Server on Windows
but it just works on SQL Server on Linux.
And then we optimize the
code pass through the SQL PAL
so that we get the same great performance we've
always had with SQL server
whether it's running on Windows or Linux.
So can we see then how the two 
platforms compare in performance?
Yeah definitely, let's take a look.
So here I've got a couple of virtual 
machines running Azure.
One is running Red Hat Enterprise Linux and
 the other one is Windows.
They're ds-11 v2 VMs with two cores on
 14 gigs of memory.
Over here in my terminal
window I have a bench marking application
that we use to put some load on the
server and measure the performance.
So let's kick this off in both of these windows here.
The one on the left is targeted this
 VM running Windows.
And the other is the VM running RHEL.
So you can see that the connections 
start spinning up here.
And once all the connections are 
created then it'll start measuring
some transaction performance of both doing some inserts as well as some selects.
And here you can see the milliseconds
 of how long it takes to do a particular transaction.
And the rate of transactions per second.
You can see that across
SQL Server on Windows and Linux
here the performance is 
approximately the same.
Right, so let's just give it a couple more seconds here
And now we can see that the result here is 8,914 
transactions within that time frame.
And over here with SQL server on RHEL it's 9,229.
So within a couple of percentage points.
And you can run this test all day long.
It's kind of fun actually to do
a comparison between the two.
And sometimes the SQL Server on Windows
will win or sometimes SQL Server on Linux will win.
But the performance is
always within a couple percentage points.
So pretty fair race.
Then this is a basic performance demo.
Any other benchmarks that we have?
Yeah of course we want to do more sophisticated and and official benchmarking testing.
And so to do that
we run TPC benchmarks.
That's sort of the industry standard.
So in fact we actually just posted recently
a new world record for the one
terabyte data warehousing benchmark
with SQL Server running on Red Hat
Enterprise Linux.
And as you can see here we actually beat the
previous world record holder
which was SQL Server 2016 on Windows .
And so what you can kind of infer from this is
that SQL server 2017 is actually
faster than SQL server 2016.
And that's through some improvements that
we've made in the query processing engine.
So not a platform to platform thing,
more of a year of progress on SQL Server?
Right, so the demo shows you how SQL Server on Windows is comparable to SQL Server on Linux
in terms of performance.
And then this benchmark shows you how SQL Server
2017 is actually faster than SQL Server 2016.
So the SQL platform distraction layer 
really gets you a long way
in terms of where you need to be.
But I know there are some things that 
we like to take for granted
or sometimes take for granted windows such as Active Directory based authentication,
or clustering services.
And those aren't the same in Linux,
so how do you solve for those?
Yeah for Active Directory authentication,
once your SQL Server
is on a Linux host that is domain joined
you can connect to it just like you can
on any other SQL Server.
So let's take a look at that.
Here I am on a Windows PC and
I've got SQL Server management studio open.
And I will just connect to a
database engine here.
And here we'll just connect to RHEL1
which is a RHEL based instance of SQL Server.
And you can see the here that I'm connecting with my
Active Directory authentication credentials.
Just hit connect and I
connect to it just like I would any other SQL Server
that's on SQL Server on Windows.
How about clustering?
For clustering we leverage pacemaker
which is a cluster manager in Linux world.
And that's basically the same as Windows
Server cluster services.
So let's take a look at what that looks like here in
SQL Server management studio.
We have support for always-on availability
groups as well as
normal failover clustering with shared disk.
So here we'll open up the always-on 
availability group dashboard.
And we can see that we have an always-on 
availability group for this server
and there's three RHEL based
nodes in this always-on availability group.
And then over here I have a simple
application that just connects to the cluster name
for that cluster.
And just outputs the name of the cluster node that it's currently connected to.
So now to simulate SQL Server going
 down for some reason.
Let's just go over here and
stop the SQL server service.
And we'll give it a few seconds here.
That will eventually kill off the 
connections that are coming in here.
And now we can see
that it's trying to reconnect.
Okay, so now we can see that the connections are
starting to come back online.
You can see these are now connected to Rhel2
which was now taken over as the primary node within that availability group.
So out of personal curiosity,
this probably doesn't work, but
can I have an always-on availability group
between Windows and Linux host?
Yeah actually you can, we've
had that question quite a few times.
We don't recommend it for an HA scenario
because there's no cluster manager that
spans Windows and Linux yet.
But it's great for a couple different scenarios.
One is where you have a need for like an
OS level redundancy.
So imagine a scenario where you have a always-on
availability group that spans Windows and Linux.
And there are some zero day
vulnerability you have to patch on Windows
And so you would maybe failover
your entire work load to your Linux side
and patch your Windows and then bring it back.
Another scenario is migration.
Where you have a workload running on SQL
Server on Windows today
in an always-on availability group.
You add a few Linux nodes to that cluster,
replicate the databases over.
All this happens online right?
All your applications continue to work.
And then when you're ready you just failover.
And that's how you can migrate with basically just a few seconds of downtime.
And that would work if I'm going to Linux
 or back to Windows?
Yeah, either way.
We just seen a lot of great capabilities coming to Linux,
but are there any new capabilities that are coming to both Windows and Linux?
Yeah, there's really two main things.
The first is graph support which we've added.
And then also support for Python for our machine learning services.
Today people need to typically have a dedicated database for running a graph database.
Something like Neo4j.
What we've done with SQL Graph is we've actually
brought graph database capabilities
into SQL Server in SQL Server 2017.
And that helps you kind of get those inferences out of the many relationships
that you have in a graph type of data model.
So I'm going to jump back to my Mac here and show you what that looks like.
So here I'm going to fire up a container
running on my macbook.
And here I've got Visual Studio code open
and I'm going to connect to it using 
the new SQL Server extension.
We can see that we're now connected to the master database in my container.
So let's just create a quick database here that we're going to store some graph data in.
To create a node table in SQL Server
it's just like creating any other table 
in SQL Server
except that we have this as node syntax here.
So I've created some tables here to store some information about my Factory.
Now I want to create some edge tables.
And you can see the creating edge tables 
is also the same
as creating any other type of table in SQL Server
except we have this as edge syntax here.
So now let's insert some data into these tables.
And you'll notice just by looking at these queries
it looks just like inserting data
into any other SQL Server table.
Except that with these edge tables we're
inserting the node ID
for each end of the edge into the table.
We're just using T-SQL, here?
Yeah, just T-SQL.
The syntax is essentially the same except for a few new keywords that we've added to the language.
Now that we've got some data in there we can actually start to write some queries
which go and look at this graph data.
And you can see here that the graph query
syntax makes it much easier to write a query
as opposed to having to write a
big long set of join statements.
And you don't have to worry about what the
difference is between an inner and an outer join.
The syntax here is really just much easier to use.
So one of the benefits of this is that 
you get your graph data
right next to your traditional relational data.
And you can join between the two different data sets
as opposed to you know having your graph
data off in some separate database
and trying to figure out how to bring that
together with your relational DB.
The other advantage is that you get all the
power of SQL Server with your graph data.
So if you want to have some of the
advanced features for security
that SQL Server has you can do that.
Or, as we'll see here I can create a 
clustered columnstore index
on the machine table here.
And that improves the performance of 
your queries by 30-100  times
by using our new column store feature.
That is an in-memory technology that also compresses the data on your disk to save you a lot of space.
That's a huge performance boost.
But you also mentioned Python support 
is also available
now on SQL Server 2017
for building intelligent data models.
Right, so in SQL Server 2016 we 
introduced support for R.
And now in SQL Server 2017 we've added Python to our machine learning services.
And those are the two main languages the data scientists use to analyze data.
And the interesting thing that we've done here is we've actually put R and Python
in database so you can actually 
execute these R and Python
scripts where the data is at
instead of having to pull the data out 
and process it on a different server.
So in SQL Server 2017 we also added
the ability to offload these scripts to
GPUs for even faster performance.
And that's available with SQL Server on
Windows today
and we'll be adding it to SQL Server 
on Linux in the future.
A really great, fast tour of SQL Server 2017
including the new Linux capabilities.
But, if you're watching at home where
 can folks go to learn more?
There's so much more we didn't even
 have a chance to talk about today
like running SQL Server on Windows 
or Linux VMs and Azure.
So go check out more at the link below.
Thanks Travis, and of course keep
watching Microsoft Mechanics
for the latest tech updates across Microsoft.
We'll see you next time.
Microsoft Mechanics
www.microsoft.com/mechanics
