Hello and welcome to KnowledgeHut.
In this video, we will talk about the difference
between Data Science and Machine learning.
Let us begin by understanding the origin
of Data Science.
Earlier, businesses and other institutions
that dealt with data were able to store most
of it in Excel sheets.
Simple business intelligence tools were capable
of analyzing and processing this data.
This was possible because the amount of data
was relatively small.
But with the passage of time, the amount of
data available to be analyzed kept increasing.
Domo, Inc., a computer software company,
predicts that by 2020, 1.7 MB of data will
be created every second for every person on
earth.
This is the scale of data which will be available
in the future.
Most of it will be either semi-structured
or unstructured data.
To process data of this magnitude,
we need more sophisticated and advanced tools
and techniques.
This is where Data Science comes into the
picture.
Data Science dives deep into data at a granular
level to mine and understand complex behaviors
and trends.
It can bring out hidden insights which can
help entities to make smarter business decisions.
Netflix mines data on the viewing patterns
of its customers to understand what drives
user interest. Based on the findings, it produces
original series.
P&G uses time series models to understand
future demand more clearly, which helps it plan
production levels more effectively.
Now let us understand what machine learning
is.
The idea behind machine learning is that you
teach machines by feeding them data and letting
them learn on their own without any human
intervention.
The process of learning begins with observations
or data, such as examples, direct experience,
or instruction, in order to look for patterns
in data and make better decisions in the future
based on the examples that we provide.
The primary aim here is to allow computers
to learn automatically, without human intervention
or assistance, and adjust their actions accordingly.
In order to avoid confusion, let us first see
the fields of Data Science.
Data science covers a wide spectrum of domains
and machine learning is one of them.
Apart from machine learning, Artificial Intelligence
and deep learning are also major domains under
Data Science.
In fact, deep learning is a subset of Machine
learning.
So machine learning, Deep Learning, and Artificial
Intelligence are all used in data science
for analysis of data and extraction of useful
information from it.
Now you may ask how Machine Learning is used
in Data Science.
To answer that question, let us look at the Data
Science Lifecycle and the stage at which machine
learning is used.
Suppose you want to create a recommendation
system for your e-commerce website.
This system recommends products to the customers
on the basis of their shopping patterns.
For building such a recommendation system,
you may use data related to customers'
browsing history, previous purchases,
reviews, ratings, profile details, card details,
etc.
During the development process, you will go
through the different stages of the Data Science
Lifecycle.
You will begin with the Business Requirement stage,
where you will understand the problem that
you are trying to solve.
In our case, we are trying to increase sales
with the help of our recommendation system.
Then you will reach the Data Acquisition stage
where you will identify different sources
from which the data will be acquired for your
recommendation system to work.
User ratings, comments, and cart history are
some examples.
Then you will reach the Data Processing or
Data Cleaning stage.
In this stage, the raw data will be transformed
into the desired format so that it becomes
possible for you to perform operations on
it.
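To make the cleaning step a little more concrete,
here is a minimal sketch in Python. The sample
records, the field names, and the clean_ratings
function are all assumptions made up for this
illustration; a real project would have its own
sources and rules.

```python
# Hypothetical raw rating records, as they might arrive from different sources.
raw_ratings = [
    {"user": " Alice ", "product": "p1", "rating": "5"},
    {"user": "bob", "product": "P2", "rating": "four"},   # rating is not a number
    {"user": "Carol", "product": "", "rating": "3"},      # product is missing
    {"user": "alice", "product": "P1", "rating": "5"},    # duplicate once normalized
    {"user": "Dave", "product": "P3", "rating": "4.0"},
]


def clean_ratings(records):
    """Normalize text fields, convert ratings to floats, drop unusable rows."""
    cleaned = []
    seen = set()
    for record in records:
        user = record.get("user", "").strip().lower()
        product = record.get("product", "").strip().upper()
        try:
            rating = float(record.get("rating", ""))
        except (TypeError, ValueError):
            continue  # rating cannot be read as a number, so skip the row
        if not user or not product:
            continue  # a required field is missing
        key = (user, product)
        if key in seen:
            continue  # keep only the first rating per user and product
        seen.add(key)
        cleaned.append({"user": user, "product": product, "rating": rating})
    return cleaned


cleaned_ratings = clean_ratings(raw_ratings)
for row in cleaned_ratings:
    print(row)
```

Only the rows for Alice and Dave survive here,
because the others have a non-numeric rating, a
missing product, or repeat an earlier entry.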
Then comes the Data Exploration stage, where
a data analyst uses visual exploration to
understand what is in a dataset and the characteristics
of the data. These characteristics can include
the size or amount of data, its completeness and
correctness, and possible relationships among
data elements.
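As a rough illustration of checking size and
completeness, here is a small Python sketch. It
only computes numbers, whereas exploration in
practice usually relies on charts as well. The
sample records and the summarize function are
assumptions invented for this example.

```python
# Hypothetical records; None marks a value that is missing.
records = [
    {"user": "alice", "product": "P1", "rating": 5.0, "review": "Great"},
    {"user": "bob", "product": "P2", "rating": 3.0, "review": None},
    {"user": "carol", "product": "P1", "rating": None, "review": "Okay"},
    {"user": "dave", "product": "P3", "rating": 4.0, "review": None},
]


def summarize(rows):
    """Report the row count and the share of non-missing values per column."""
    columns = sorted({name for row in rows for name in row})
    completeness = {}
    for name in columns:
        present = sum(1 for row in rows if row.get(name) is not None)
        completeness[name] = present / len(rows)
    return {"rows": len(rows), "completeness": completeness}


summary = summarize(records)
print("rows:", summary["rows"])
for name, share in summary["completeness"].items():
    print(f"{name}: {share:.0%} complete")
```

A column with low completeness, such as the
review column in this sample, is a signal that it
may need more cleaning or a different source.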
Then the fifth stage is where you incorporate
machine learning in data science.
This stage is called Data modeling.
Now let us see how machine learning is implemented
in the Data Modeling stage.
First, the data gathered in the previous stages
is imported into the process.
This data should be in a proper structure.
Tabular and CSV formats are commonly preferred.
After this, the data is further cleaned in
order to get rid of any inconsistencies.
Then, the data model is built.
Here the data is split into two sets, one for
training and the other for testing.
The model is built using the training dataset,
and various machine learning algorithms may be
applied.
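The split itself can be sketched in a few lines
of Python. This is only an illustration: the
train_test_split function here is written by hand
for this example, and the 25 percent test share
and the seed value are arbitrary assumptions.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and split it into training and testing lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    test_size = int(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


# Hypothetical dataset: 20 numbered examples standing in for real records.
dataset = list(range(20))
train_set, test_set = train_test_split(dataset)
print(len(train_set), "training examples")
print(len(test_set), "testing examples")
```

Shuffling before splitting helps keep both sets
representative, and no example ends up in both.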
In the next stage, the model is trained
using the training dataset.
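To give a feel for what training means, here is a
minimal Python sketch of one very simple approach,
a nearest-centroid classifier, which learns the
average feature values for each outcome. The
algorithm choice, the features (page views and
past purchases), and the labels are assumptions
picked for illustration, not the method a real
recommendation system would necessarily use.

```python
from collections import defaultdict


def train_centroids(examples):
    """Learn one average feature vector (centroid) per label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


# Hypothetical training data: (page views, past purchases in category) -> outcome.
training_set = [
    ((1, 0), "skipped"),
    ((3, 0), "skipped"),
    ((8, 2), "bought"),
    ((10, 4), "bought"),
]
model = train_centroids(training_set)
print(model)
```

Here the trained model is nothing more than one
average point per label, learned from the examples
rather than written by hand.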
After the model is trained, it is evaluated
using the testing dataset.
At this stage, the model is fed new data points
and it must predict the outcome by running
the new data points on the Machine learning
model that was built earlier.
After the model is evaluated using the testing
data, its accuracy is calculated.
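Accuracy here simply means the share of test
examples the model predicts correctly. The Python
sketch below shows that calculation for a small
stand-in model that picks the label with the
closest average point. The model values, the test
examples, and the function names are assumptions
made up for this illustration.

```python
def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def accuracy(centroids, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(
        1 for features, label in examples if predict(centroids, features) == label
    )
    return correct / len(examples)


# Hypothetical trained model and held-out testing data.
model = {"skipped": (2.0, 0.0), "bought": (9.0, 3.0)}
testing_set = [
    ((2, 1), "skipped"),
    ((9, 3), "bought"),
    ((7, 1), "skipped"),  # a hard case that sits closer to "bought"
    ((4, 0), "skipped"),
]
test_accuracy = accuracy(model, testing_set)
print(f"accuracy: {test_accuracy:.2f}")
```

The model gets three of the four test examples
right, so its accuracy on this tiny sample is 0.75.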
The accuracy is then improved using various
methods.
So this was the role of the machine learning
process in the Data Science Lifecycle.
After the machine learning stage, the final
model is deployed onto a production environment
for final user acceptance.
So we hope that now you understand how Data
Science and machine learning go hand in hand.
Don’t forget to subscribe to our channel
for more such informational videos.
