- Welcome to Azure Essentials.
In the next few minutes,
we'll explore the services in Azure
to help you analyze your data,
across both structured
and unstructured data.
Azure can help you
whether you typically work with
structured tabular data,
or if you're looking to reason over large
or complex unstructured big data
coming from devices,
services and applications
that require a more sophisticated
level of processing and scale
beyond traditional data warehousing.
Azure has a comprehensive
set of services to ingest,
store and analyze data of
almost all types and scales,
spanning table, file, streaming
and other data types.
The Azure platform provides tools
across the data analytics life-cycle.
This allows you to ingest data into Azure,
using robust services
for batch ingestion
or real-time ingestion
so that you can capture events
as they're being generated from
your devices and services.
Store structured or
unstructured data globally
on a virtually unlimited scale.
Train and prepare your
data and data stores
to derive insights, and create predictive
and prescriptive models on your data
using machine learning and
deep learning techniques.
Furthermore, you can
extend these capabilities
to real-time processing
of streaming or log data.
You can even leverage
artificial intelligence or AI
with machine learning
and cognitive services
for automated machine analysis.
And finally, you can serve
and publish this analyzed data
to an operational or analytical store
to help with visualizing as
part of reports and dashboards.
Your apps can also leverage
these data directly and securely
while meeting your performance needs.
So let's walk through these services
in a bit more detail.
The first step in data analysis
is connecting disparate
datasets from multiple sources
and ingesting them into Azure.
Your data might originate
in your data center,
in cloud services, or span both.
Now for batch ingestion of your data,
Azure Data Factory
is the primary service
that you'll want to use.
This is an ingestion, orchestration
and scheduling service.
And it determines what happens
when certain events occur
and which engines to use to analyze
and optimally process your data.
It allows you to create
sophisticated data pipelines
right from ingestion of the data
through to the processing, storing
and then making it
available for your end-users
and apps to tap into.
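
To make the idea of a data pipeline concrete, here is a minimal sketch in plain Python of stages chained in order, from ingestion through processing. It does not use the Azure Data Factory API; the stage names (`ingest`, `process`) and the record fields (`id`, `qty`, `price`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical stage type: takes a list of records, returns a new list.
Stage = Callable[[List[Dict]], List[Dict]]

def ingest(records: List[Dict]) -> List[Dict]:
    # Stand-in for batch ingestion: keep only records that carry an "id".
    return [r for r in records if "id" in r]

def process(records: List[Dict]) -> List[Dict]:
    # Stand-in for a processing step: add a derived "total" field.
    return [{**r, "total": r.get("qty", 0) * r.get("price", 0.0)} for r in records]

def run_pipeline(records: List[Dict], stages: List[Stage]) -> List[Dict]:
    # Run each stage in order, feeding its output to the next stage.
    for stage in stages:
        records = stage(records)
    return records

if __name__ == "__main__":
    raw = [{"id": 1, "qty": 2, "price": 3.0}, {"qty": 1}]
    print(run_pipeline(raw, [ingest, process]))
```

A real orchestration service would add scheduling, triggers, and retries on top of this ordering; the sketch only shows the chaining.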
There are other data movement
capabilities in Azure too.
If you've got a massive one-time upload,
you may wanna use the
Azure Import/Export service
to manage the bulk
loading of large data sets
into Azure Blob storage and Azure Files
by shipping drives to
an Azure data center.
For your structured data, the
Azure Database Migration Service
migrates data from
on-premises structured databases
directly into Azure, maintaining the same
relational structures
leveraged by your current apps.
Azure also has engines for
ingesting real-time data streams.
Now these engines are
capable of ingesting data
at a fast pace and catering
to your processing needs
down the line.
Azure Event Hubs enables
large-scale telemetry
and event ingestion with durable buffering
and low latency from millions
of devices and events.
Azure IoT Hub is a device-to-cloud
telemetry data service
to track and understand the state
of your devices and assets.
And if you've got custom
operations to perform
and you want to scale out
your ingestion engines
with custom logic,
Azure also supports the
open source Apache Kafka
in HDInsight as a
managed, high-throughput,
low-latency service for real-time data.
And of course, you can use the Azure CLI
or command line interface
to programmatically target and ingest
multiple data formats into Azure.
If you're a developer, APIs can be called
using the Azure Software
Development Kit, or SDK,
to bring in your data.
Now all the tools and
services I just described
can bring data into Azure
and as you plan how you ingest data,
you'll also plan for where and how
the data will be stored in Azure.
Azure Blob storage can
store massive datasets
irrespective of their
structure or the lack of it
and keep it ready for analysis,
including video, images,
scientific datasets and more.
And as a managed service,
you don't need to worry
about the knobs and dials;
it just takes care of itself.
Now if you've got particularly demanding
analytical throughput requirements,
or you have huge file sizes
that you need to be
optimized for analysis,
you want a specialized big data store.
Azure Data Lake Store
can serve that purpose.
It lets you analyze all your data
whether structured or unstructured,
with the very high throughput
generally desired by analytics applications.
It can store trillions of files
and a single file can be
larger than one petabyte in size.
Now for operational and transactional data
in structured or relational form,
you can use Azure SQL DB.
This works like SQL Server,
but as an Azure service.
So you don't need to worry about managing
or scaling your host infrastructure.
Of course, you can keep
existing database apps
and host them in Windows or
Linux-based virtual machines.
For analytical data that's
been aggregated over the years,
Azure SQL Data Warehouse
provides an elastic, petabyte-scale service
which lets you dynamically scale your data
either on-premises or in Azure.
Now for NoSQL capabilities,
if you're bringing in data
that's schema-agnostic,
Azure Cosmos DB is a turnkey,
globally distributed,
NoSQL DB service.
It allows you to use
key-value, graph, and document data
together with multiple consistency levels
to cater to your app requirements.
Whatever the need, Azure has
an optimal store for you.
Interestingly, all these
stores integrate seamlessly
with the analytics engines
as sources of data.
With your data now stored in Azure,
there are many analytics options
for training and preparing your data,
spanning super scalable
and involved approaches
to data engineering
to automated machine analytics
on serverless infrastructure.
I'll start with some of the open source
analytic capabilities.
Azure Databricks is an optimized
Apache Spark-based
analytics cluster service
offering the best of Spark
with collaborative notebooks
and enterprise features.
It integrates with Azure Active Directory
and we also give you the native connectors
to bring in other Azure data services.
Azure Databricks is your
hub of Spark-based analytics
whether it's batch, streaming,
or machine learning.
Also we've got HDInsight,
a managed cluster service
for a variety of open source
big data analytics workloads.
It helps you clean, curate, process
and transform your data
in addition to scaling your
machine learning workloads.
Using HDInsight, you can
create scale-out clusters
for Hadoop, Spark, Hive, HBase, Storm,
and Microsoft R Server
without the need to monitor and administer
the underlying infrastructure.
As a scale-out compute engine
similar to traditional SQL infrastructure,
Data Lake Analytics
lets you develop and run
large-scale, parallel data transformation
and processing programs in U-SQL
over petabytes of data
from your data lake.
You can even leverage the
familiarity and extensibility
of U-SQL to scale your
machine learning models
from R or Python,
to work against massive amounts of data.
Most importantly, it's a
serverless environment.
So you request and
leverage compute resources
on a per-query basis.
You don't have to worry about
maintaining large clusters
which makes scaling and
parallel execution easy.
Azure also has engines for
processing real-time data streams.
Now to analyze data logged in real time
from devices, sensors and more,
Azure Stream Analytics offers
a powerful event processing engine
that together with Event Hubs
allows you to ingest millions of events
and find patterns, detect anomalies,
power dashboards, or
automate event-driven actions
in real-time with the
simplicity and familiarity
of a SQL-like language to
process real-time streams.
Azure HDInsight and Azure Databricks
also allow you to leverage
streaming capabilities
within the scale-out processing engines,
like Structured Streaming in Spark.
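
As a rough illustration of the kind of real-time pattern finding described above, here is a minimal sketch in plain Python that groups events into fixed, non-overlapping time windows (often called tumbling windows), averages each window, and flags windows above a threshold. It is not the Stream Analytics query language or any Azure API; the function names, the `(timestamp, value)` event shape, and the threshold rule are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

def tumbling_window_avg(events: Iterable[Tuple[int, float]],
                        window_seconds: int) -> Dict[int, float]:
    # Map each window's start time to the average of the values inside it.
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for ts, value in events:
        # Integer division assigns every timestamp to exactly one window.
        window_start = (ts // window_seconds) * window_seconds
        sums[window_start] += value
        counts[window_start] += 1
    return {w: sums[w] / counts[w] for w in sorted(sums)}

def find_anomalies(averages: Dict[int, float], threshold: float) -> List[int]:
    # Hypothetical rule: a window is anomalous if its average exceeds the threshold.
    return [w for w, avg in averages.items() if avg > threshold]

if __name__ == "__main__":
    readings = [(0, 20.0), (5, 22.0), (12, 80.0), (18, 90.0)]
    averages = tumbling_window_avg(readings, window_seconds=10)
    print(averages)
    print(find_anomalies(averages, threshold=50.0))
```

A streaming engine would compute these windows incrementally as events arrive rather than over a finished list; the batch form here only shows the windowing logic.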
For more advanced analytics,
Azure Machine Learning and
Microsoft Machine Learning Server
provide you the infrastructure and tools
to analyze data, create
high-quality data models,
train and orchestrate machine learning
as you build intelligent
apps and services.
In addition to these tools,
scale-out cluster technologies
like Azure Databricks
also allow scalable machine
learning with Spark ML
and deep learning libraries.
Beyond this, we've also
built a number of first level
AI services called Cognitive Services
providing prebuilt intelligent services
for vision, speech, text,
understanding and interpreting.
Finally, once you've been able to analyze
and derive insights from this data
you'd wanna serve this
enriched data to your users.
Within Azure, the best destination
for all this analyzed data
is Azure SQL Data Warehouse
where you can now combine new insights
with historical trends
and drive a targeted conversation
by maintaining one version of data
for your organization.
Azure SQL Data Warehouse not only supports
seamless connectivity to the
analytics tools and services,
it also integrates well with
business intelligence tools,
for example, Azure Analysis
Services and Power BI,
which provide powerful
options to find and share
further data insights.
If the analyzed data contains insights
valuable to your end consumers,
these can be populated
into the operational stores
like Azure SQL DB and Azure Cosmos DB
so that web and app experiences
can be augmented by those insights.
You can even pipe data
directly to your apps
with Azure platform tools for developers,
including Visual Studio, Azure
Machine Learning Workbench,
or custom serverless apps and services
using Azure Functions.
With Azure, you can ensure
that data's consumed
only by intended users and groups,
securely authenticated
by Azure Active Directory,
while network performance
SLAs and privacy requirements
are met using Azure ExpressRoute.
You can even hold the keys to your data
once it's in the cloud with
Azure key management services.
So that was an overview of
the key services in Azure
that comprise the data
analytics life-cycle.
If you're interested in data visualization
and machine learning,
these topics are covered in more detail
in separate overviews on Azure Essentials.
Of course, we're constantly
adding new topics on Essentials,
so please keep checking back for more,
and you can continue learning
with our hands-on learning
series on the link shown.
Thanks for watching.
