Good morning
Good Afternoon
Good evening
from whichever location you are watching questpond's youtube channel
Today we will be learning the basics of Data Science course,
Machine Learning, Artificial Intelligence, Statistics, Python, R
and so on in flat 1 hour.
Guaranteed
you will know the fundamentals of Data Science, Machine Learning, AI
You will also know how to install Python and R on a computer, and we will have executed some statistical formulas
using R & Python.
The format of this 1 hour of teaching has chapters and labs.
Chapters have theories and fundamentals while labs have
practicals.
There are 18 chapters, and among them we have 4 labs.
We have dedicated the first 30 minutes
for fundamentals and theories.
The remaining 30 minutes
are for practicals.
Till Lab 12 we will be discussing the various jargon and vocabulary used in the Data Science and AI industry.
Then we will be executing
labs using Python & R programming.
In all of these 4 labs
we will try to execute
the linear regression statistical formula
with excel,
with Python
with R
We will demonstrate linear regression
using all
three of these tools.
With this, by the end of the tutorial we will be ready with the necessary things to start with Data Science.
Before we start with anything let us start with Chapter 1
the meaning of Artificial Intelligence and definition of Artificial Intelligence.
Data Science, Machine Learning, Deep Learning, Data Scientist, Business Intelligence, Data Analyst
and a lot of these keywords or buzzwords
have emerged from one single initiative,
i.e. Artificial Intelligence.
All of these things belong to the single umbrella of Artificial Intelligence.
We will first define
what exactly Artificial Intelligence is.
Artificial Intelligence is an area of study
with
one main goal i.e.
creating intelligent
computer machines
which can think
work
and react like Humans.
If we look around, humans have long fantasized about machines acting like humans, and a lot of movies have been made around this idea.
We can also see many applications created with the same goal,
for example Siri, Alexa, Cortana, Google help and so on.
Humans have tried various ways to implement Artificial Intelligence.
There are developers who have created complex rules engine
people have created
knowledge base, knowledge graphs and so on.
Even a simple If Else condition which is shown on the screen
is an attempt  to achieve the goal of Artificial Intelligence.
Can we achieve
the goal of Artificial Intelligence using simple programming? For example, there is a
simple program which takes in a sentiment
and depending on the sentiment
the machine reacts back.
If we put in a sentiment which is good
it will react back "That's nice".
If we put in the sentiment bad
it will react back accordingly.
This is a simple C# program which is a simple if else condition.
If we put good here, it gives "That's nice".
If we put bad
it reacts accordingly.
Tomorrow if we want to add a new sentiment called Worst
we need to make some code changes: we go here and add the extra lines of code,
then the code knows
how to react to the sentiment worst.
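The if else idea can be sketched as below. This is only a minimal sketch in Python rather than the C# program shown on screen; the function name react and every reply other than "That's nice" are assumptions made for illustration.

```python
def react(sentiment):
    """Hard-coded rules: every new sentiment needs a code change."""
    sentiment = sentiment.strip().lower()
    if sentiment == "good":
        return "That's nice"
    if sentiment == "bad":
        return "Sorry to hear that"  # assumed reply, not taken from the video
    # To support "worst", a developer has to come here and add another branch.
    return "I do not understand"  # assumed fallback reply


print(react("good"))
print(react("worst"))
```

The second call falls through to the fallback, which is the limitation being discussed: the program only knows what a developer typed into it.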
Can we achieve Artificial Intelligence by using a simple if else condition,
or by creating some kind of complex rule engine, or by having some kind of knowledge graph and so on?
The goal of the Artificial Intelligence
is to make machines think like humans.
One of the biggest things about humans is self learning and maturing.
This is the
biggest thing about humans.
Humans learn from their parents, teachers, surroundings
and whatever knowledge they get from all of these they make a better decision.
To achieve Artificial Intelligence
code should have the ability
to learn from something.
Rather than developers changing the code every time,
we want the code
to learn by itself.
Machine Learning is the process where we train our code algorithm
by using training data. We provide training data and the code
becomes better and better in terms of judgement and in terms of decision making
by learning from that training data.
in Machine learning
we have
self learning algorithms.
Summarizing
we have a big umbrella
the big goal , the big fantasy of humans of
Artificial Intelligence.
To achieve the same
we apply Machine Learning which is
algorithms getting trained by training data.
We can also say that
not every piece of code we write for achieving AI
is necessarily Machine Learning code.
We have AI, and to implement AI we can use If Else conditions,
we can write complex logic or a complex rules engine,
or we can have some kind of big
knowledge graph
database for it.
At the end of the day Machine Learning
has a very specific quality of self learning.
Learning from training data.
There we do not do any kind of code changes.
As flashed on the screen, the big umbrella is AI,
to implement that there are various ways of doing it
and Machine Learning is the most
prominent way of achieving AI.
Let us spend 2-3 minutes understanding
the Machine Learning process in depth.
We have named this Machine Learning process
the TAMA process. TAMA is an acronym of the different phases in Machine Learning.
The letters are T, A, M and A: Training data, Algorithm, Model and Assure.
In step 1 of Machine Learning
we organize the training data
which will train an algorithm.
In Step 2
we take the training data and run it over the Algorithm. In Step 3
the output of running the training data over the algorithm is
the Model.
In Step 4 we assure
that this Model is
as per the specifications.
So
TAMA is a process wherein first we have the training Data,
second we take the training data and run it over the algorithm,
third the output we get is the model, and fourth we assure this Model is as per our requirement, up to the mark:
whatever inputs we give it, it is
actually giving us proper outputs.
In case this model is not right
we go back to
step 1 and iterate.
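The four steps can be sketched in a few lines of Python. This is a rough sketch under stated assumptions: the training readings are the two quoted later in this tutorial, the "algorithm" is simply a straight line through two points, and the check reading and the tolerance are hypothetical values picked for illustration.

```python
# T: training data as (number of people, weight lifted in kg)
training_data = [(3, 63), (10, 203)]

# Hypothetical reading kept aside to check the model; not from the video.
check_data = [(5, 103)]


def train(data):
    """A: the algorithm, here a straight line through two points (assumed choice)."""
    (x1, y1), (x2, y2) = data
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


def assure(model, data, tolerance=1.0):
    """A: assure the model reproduces known outputs within a tolerance."""
    slope, intercept = model
    return all(abs(slope * x + intercept - y) <= tolerance for x, y in data)


# M: the model is the output of running the training data over the algorithm
model = train(training_data)

if assure(model, check_data):
    print("Model accepted:", model)
else:
    print("Model rejected, go back to step 1 with better training data")
```

If the assure step fails, nothing in the code is edited; we go back and improve the training data, which is the iteration described above.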
In order to setup this
TAMA process
to setup this whole Machine Learning
infrastructure or Eco system
we need knowledge from multiple fields, i.e. knowledge which is multidisciplinary.
Let us add a column here: what
skills
do we need?
In order to organize training data we need knowledge of RDBMS
and knowledge of ETL: how to extract data, transform it and load it.
We will need
proper IT knowledge of understanding data, loading data,
massaging data and so on.
The second step is Algorithm
over here we should have
some understanding of Maths and
statistics.
Step 4 - Once the model is created we need to test the model.
We need to check the model is right or wrong.
For this we need proper domain knowledge
for which we are creating this Machine Learning process.
When we want to implement
a proper Machine Learning infrastructure or an eco system
we need multidisciplinary knowledge, IT knowledge
Maths and statistics knowledge
domain knowledge and so on.
This is
the
area of Data Science.
Data Science
is a field which focuses on extracting information from data.
We have a sample data set.
Real data will be of huge size, with millions of records,
but because we are learning we will concentrate on this simple sample data to understand the fundamentals of Data Science first, and then we will look at a larger record set.
This sample data
is a recorded reading of
how many people can lift
how much weight?
If it is 3 people
they can lift 63 kg, 10 people can lift 203 kg and so on.
We need to figure out the relationship between no. of people
and weight lifted
If we know the number of people
we can estimate how much weight can be lifted
and if we know the weight
we can estimate how many people we need to hire.
To figure out the relationship between
weight and no. of people
we will use some Maths,
or to be specific
we will use
statistical maths,
and to be more specific
we will use
regression analysis, and to be exactly specific
we will use Linear Regression.
What Statistics, regression and Linear Regression are
we will see in later classes, which have dedicated chapters for maths.
Do not get discouraged by how much maths there is to learn. In the later labs
we will learn the necessary maths which needs to be
known for Data Science.
For now note that Regression is a part of statistical maths
which helps us to determine
the relationship between two sets of data.
We have
these
two columns
Regression analysis or the Linear Regression will help us to figure out
what is the
relationship between
both of these columns.
Linear Regression says
if we have two variables
and these variables are scalar and continuous numeric in nature,
i.e. they hold numbers,
and if there is a linear relationship between these two sets of data,
that relationship can be expressed by
the Linear
Equation
Y = MX + C
This is the Y which we want to figure out
and this is the X
Linear Regression says
if we have two variables which are scalar, numeric and continuous
then this equation
will help us
to come out with the relationship.
In this equation M stands for the slope
and C stands for the Intercept.
We want to figure out
the Y
To figure out the Y
we give the X
and calculate what the M & C will be.
This M
is the slope. In order to calculate the slope we will use the SLOPE formula
which is given in excel.
To calculate it manually, we can see on the screen
how the slope is calculated.
We will ease our task by using the excel formulas.
These are the known Y values.
The slope formula takes two inputs,
the known Y and the known X.
That is the slope calculated.
To calculate the intercept we use the INTERCEPT formula and again give the known Y and known X.
On the screen we are flashing the calculation and if we wish to do it manually
take up a calculator and try to see
if the values are matching
Y is
the
slope M multiplied by
X, plus
the intercept.
We have come out with a formula which says
Y = 20X + 3
If we put 3 here
it is showing
the weight as 63 which is right.
If we put 10 it is showing 203.
The values match
the data which was given,
so this formula looks right.
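For those who want to check the numbers by hand, here is a minimal Python sketch of the same calculation using the standard least-squares formulas for slope and intercept. It uses only the two readings quoted above (3 people lift 63 kg, 10 people lift 203 kg); the function names predict_weight and people_needed are assumptions made for illustration.

```python
people = [3, 10]    # known X: number of people
weight = [63, 203]  # known Y: weight lifted in kg

mean_x = sum(people) / len(people)
mean_y = sum(weight) / len(weight)

# slope M = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(people, weight))
sxx = sum((x - mean_x) ** 2 for x in people)
slope = sxy / sxx

# intercept C = mean_y - M * mean_x
intercept = mean_y - slope * mean_x


def predict_weight(n_people):
    """Y = M * X + C: estimate the weight a group can lift."""
    return slope * n_people + intercept


def people_needed(kg):
    """The same equation solved for X: estimate how many people to hire."""
    return (kg - intercept) / slope


print("slope:", slope, "intercept:", intercept)
print("weight for 3 people:", predict_weight(3))
print("people for 203 kg:", people_needed(203))
```

The same equation answers both questions from the example: give the number of people to estimate the weight, or give the weight to estimate the number of people.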
Let us try to visualize
the
steps we have taken
in the last 3-4 minutes.
We had the historical data of number of people working vs weight lifted.
We used this historical data, and to figure out the relationship in the data
we used maths:
statistical maths and linear regression.
To calculate the Linear Regression we used our IT skills, here Excel formulas: to calculate the slope
and to calculate the intercept
we used Excel formulas.
We also used our domain expertise to ensure
that the output
is right.
We used
general domain expertise to check that,
when we give the number of people, the weight lifted comes out right.
This is termed as Data Science.
Data Science is a multidisciplinary field,
which means
it involves IT skills like
simple excel formulas or complex Python or R programs.
It involves
statistical maths
and domain expertise.
Data Science is a multidisciplinary field which focuses on extracting
information from data. This is the data and this is the information we have extracted.
In this multidisciplinary field
we need IT knowledge
we need
statistical maths
and we need
domain expertise.
There is one more important concept
which we need to understand, and that is the difference between
Supervised Learning and Unsupervised Learning.
Both of these are
part of Machine Learning.
As the name says, in Supervised learning
we know a lot of things upfront.
We know the inputs and the outputs,
and we also know what kind of function connects
the inputs and the outputs.
The only thing we do in Supervised learning is
try to make the predictions of that function better.
In this example we knew
these two columns very well.
These two columns are connected
by way of Linear Regression.
We knew the inputs and what kind of output we want,
and the only thing we were doing here was
trying to make
the slope and intercept better.
We were trying to make
these two things better,
and that helps us to make our function
better in terms of prediction.
In Supervised Learning
we know the inputs we know the outputs
also know what kind of
functions can connect the inputs and the outputs.
Only thing what we do is
we try to make this function better.
Now suppose we have a data set which is something like this
and we try to see what we can do with it.
In unsupervised learning we only know the input,
we have no idea what the output can be,
and we try to hunt a pattern.
In Supervised Learning
we know the function and we try to make that function better,
while in case of Unsupervised learning
we only know the input and we try to
hunt a pattern.
From the Algorithm perspective, in Supervised learning we have Algorithms like Linear Regression and Classification, and in Unsupervised learning we have Algorithms like Clustering, Association, K-Means and so on.
This excel sheet can be downloaded from  the link given in the video tutorial.
In this excel sheet we have Practice1 and this has
Linear Regression so this is Supervised Learning.
In the Practice2 sheet we apply Clustering.
The Practice2 sheet
is an example of Unsupervised learning.
At this moment we are using excel
to demonstrate the
ability of Machine Learning
and to demonstrate some of the
statistical
part here.
When it comes to large data sets
excel is not really capable of handling them.
At that time we will definitely need to use Python,
R and so on.
In the further coming lectures we will cover Python as well as R.
First try to get the fundamentals.
If we do not understand vocabulary like
Supervised Learning, Unsupervised Learning,
Data Mining, Deep Learning, Machine Learning and so on,
if we do not understand the vocabulary part of it,
it can become very confusing when we start R & Python.
As we move ahead
we will be covering
all these things using Python and R.
This is the unsupervised learning data set, and
the whole goal of unsupervised learning is
to hunt out a pattern.
Before that we should know whether there is any relationship between
these columns.
Is there a relationship between Age and
Shopping Amount, or
between Age and Weight? Is there any kind of relationship?
Once we know that there is a relationship between some columns
we can use only those columns to hunt the pattern.
At this moment,
by eye, we can figure out
there is a relationship between the Age and the Amount.
But if we have millions of records
it is difficult
for a human to
know if there is any connection between two columns.
There is something called the Correlation coefficient.
It is again a statistical
equation from maths
which tells
how closely two columns or two data sets are related. Give it Array1
and Array2.
In the same way apply the Correlation function for the other pairs of columns.
We can make this more dynamic by using
some programming language.
Here we are doing it in a hard coded way;
in Python we can read all the columns and do it more dynamically.
Now for Age and Weight we have got the correlation coefficient as 0.35,
Age and Shopping is 0.9 and Weight and Shopping is 0.3.
The Correlation coefficient says
the nearer the value is to 1 (or to -1 for an inverse relationship)
the tighter the relationship.
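To show what "more dynamic" could look like, here is a small Python sketch that computes the Pearson correlation coefficient for every pair of columns in a loop. The column values below are hypothetical and are not the numbers from the excel sheet, so the printed coefficients will differ from the ones quoted above; the function name correlation is also an assumption.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical sample columns, for illustration only
columns = {
    "Age": [20, 25, 30, 35, 40, 45],
    "Weight": [70, 55, 80, 60, 75, 65],
    "Shopping": [30, 60, 90, 130, 170, 210],
}

# Compare every pair of columns instead of hard coding each pair
names = list(columns)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        r = correlation(columns[names[i]], columns[names[j]])
        print(names[i], "vs", names[j], "->", round(r, 2))
```

Adding a new column to the dictionary is enough for it to be compared with all the others; no extra formula has to be typed per pair.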
From here we can make out that Age and Shopping Amount
have a very
deep connection.
We should be trying to hunt a pattern
between the age and the shopping amount.
The other two values are low, around 0.3.
This one is
quite high.
These two columns have the
highest chance of
showing a pattern.
At this moment the Weight column is excluded
from the pattern hunting.
We should only concentrate on these two columns.
We now want to apply clustering, i.e. use a clustering algorithm, in excel.
There is no straightforward way of doing it. Excel does not have any algorithm formula saying cluster and find out.
In R and Python there are
definitely better ways of doing it.
We will be using the scatter plot
and will try to
see the patterns visually.
Data Science is an art.
It is not that
we have to use only R or only Python
or only statistical maths; we can also look at things visually,
understanding
how the data is and
how we can extract information from the data.
Data Science is an art
which takes a lot of tool sets.
It says we have to understand Maths and Statistical modeling,
but at the same time
we can use unorthodox ways of doing it.
Sometimes data is so weird
that we will probably end up with an unorthodox way of detecting patterns and
extracting information from the data.
We are using the scatter plot to detect the pattern.
Select Data --> Add
the X value
and Y value.
There is a
small group of
dots. Select these dots and make them more visible.
We get rid of these grid lines so we can see the clustering more clearly.
A small cluster forms.
We can get rid of the gaps out here.
We have removed the areas where there is no data.
We can clearly see there is a cluster.
From all this hunting
for patterns we have come out with categories.
We have one category which is specifically
of customers who are around the age of 20
and they shop between 0 to 50.
This is category1
What we are doing here manually
you will be  doing through a program.
The other category is like 35 to 40 age and
35 to 40 is Category2
We have 40 to 50 age they shop between
200 to 220. This is Category3.
This is an example of unsupervised learning.
We had the data
and looked for clusters in it.
We did not actually run any clustering algorithm;
we did it visually for now.
When we do Python we will see how we can
run a clustering algorithm through a function.
There are three clusters which we detected.
Now we have a better understanding of the data.
We can name this category1 as
Normal customers because they buy less amount.
The category2 are like Gold Customers and the category3 are Premium Customers.
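What was done by eye on the scatter plot can be sketched in code with a plain k-means clustering loop. This is a minimal sketch, not the method used in the video: the customer readings are hypothetical, the three starting centroids are picked by hand, and age and amount are used unscaled for simplicity.

```python
def squared_distance(p, q):
    """Squared straight-line distance between two (age, amount) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, start_centroids, max_iter=100):
    """Assign each point to its nearest centroid, move each centroid to the
    mean of its points, and repeat until the assignments stop changing."""
    centroids = list(start_centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[k] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels, centroids


# Hypothetical (age, shopping amount) readings, for illustration only
customers = [
    (20, 10), (21, 40), (19, 30),
    (36, 120), (38, 110), (39, 130),
    (45, 205), (48, 215), (42, 210),
]

# Hand-picked starting centroids: one customer from each visible group
labels, centroids = kmeans(customers, [customers[0], customers[3], customers[6]])

for customer, label in zip(customers, labels):
    print(customer, "-> category", label + 1)
```

The algorithm only receives the inputs; the names Normal, Gold and Premium are still something we attach afterwards using domain knowledge.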
In unsupervised learning
we tried to hunt out a pattern.
In supervised learning we tried to make formula or function much better
so we can do better prediction.
We will discuss one more confusing vocabulary term,
Data Mining.
Mining in English means to dig for something valuable.
In the same way
Data Mining in AI means to hunt a pattern in a huge data set.
Data Mining and unsupervised learning are often used almost as synonyms.
When someone says
this is unsupervised learning, they usually also mean that this is data mining,
and data mining usually means unsupervised learning.
One more vocabulary
which is discussed a lot is
Deep learning
If we want a machine to really work and think like humans
then it should think like the human brain.
Its technical architecture
should also replicate
the human brain architecture.
If we look at the human brain, for the way information flows throughout the body
it uses something called
neurons.
Neurons detect what is going on around us and inside us
and they decide how the body should act
depending on the input.
The human brain has about 86 billion neurons
arranged in layers of networks,
every network doing some specialized task
and passing the output of that network's task
to the next network
and finally telling the body how to act.
This is termed
the neural network.
Deep learning
tries to mimic the thought process of the human neural network.
It has multiple layers of thought process
and every layer
focuses on specific task.
It passes the output of that layer
as an input to the next layer,
finally arriving at a decision.
For instance, suppose
we have created deep learning
logic for
detecting fraudulent bank transactions.
We can have a first layer which looks at the history of the account holder,
whether the account holder has some
previous fraud history.
The output of this layer
will go to the input of the next layer.
The next layer
is dedicated to checking for absurdness, which means:
is the transaction coming from a far away geo location,
are the timings weird,
is the transaction amount weird and so on.
In this layer we check for absurdness,
and its output will be passed to the next layer
and so on,
until we finally conclude
whether this transaction is fraud or not.
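The idea of one layer feeding the next can be sketched as a tiny forward pass. This is only an illustration under loud assumptions: the weights and biases are hypothetical and hand-picked, whereas a real deep learning model learns them from training data and its layers are not assigned tasks by hand. The names layer and fraud_score are made up for this sketch.

```python
import math


def sigmoid(z):
    """Squash any number into the range 0 to 1."""
    return 1 / (1 + math.exp(-z))


def layer(inputs, weights, biases):
    """One layer: every neuron takes a weighted sum of all inputs plus a bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


# Hypothetical hand-picked numbers; a trained network would learn these.
HIDDEN_WEIGHTS = [[2.0, 0.5, 0.5, 0.5], [0.5, 2.0, 2.0, 2.0]]
HIDDEN_BIASES = [-1.0, -3.0]
OUTPUT_WEIGHTS = [[3.0, 3.0]]
OUTPUT_BIASES = [-3.0]


def fraud_score(features):
    """The output of the hidden layer becomes the input of the output layer."""
    hidden = layer(features, HIDDEN_WEIGHTS, HIDDEN_BIASES)
    return layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]


# features: [previous fraud history, far away location, odd timing, odd amount]
normal = fraud_score([0, 0, 0, 0])
suspicious = fraud_score([1, 1, 1, 1])
print("normal transaction score:", round(normal, 3))
print("suspicious transaction score:", round(suspicious, 3))
```

A score closer to 1 means the transaction looks more like fraud; where to draw the line between fraud and not fraud is a separate decision.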
Deep learning has a neural network
and it is one of the forms of machine learning.
We can also say
deep learning is  a subset of Machine learning.
We will try to summarize what we have learnt in the past 30 minutes.
Here is the 30000 foot view of the overall things we have discussed in the past 30 minutes.
At the top we have
the important mission, the main goal:
AI, where we want machines to act like humans.
All of these things - Data Science, Machine Learning,
Supervised learning, unsupervised learning, data mining - have
started with that one goal in mind
where we want machines to think like humans.
AI can be implemented by simple if else conditions, a knowledge graph and various coding techniques.
Machine learning stands
at the top because in Machine learning
we do not believe in code changes; we want the code to learn from the data.
In machine learning
we have
the training data, which
goes and trains an algorithm.
From there we get something called a Model.
This Model we test by using the Test data, and if we find that the model is not appropriate
we again
rework the training data, maybe getting better training data,
and again we
execute this cycle.
This whole cycle is termed
the TAMA process. We have training data,
then we have the algorithm,
then we have the model and then we assure
that this Model is right.
Again there are
two subcategories in Machine learning one is  Supervised learning
in supervised learning
we have the training data
we know the inputs
we know the algorithm, we know what kind of expected output we want
The whole goal in supervised learning is
to ensure that
this function or algorithm becomes better in terms of prediction.
We had the equation of Y=MX+C
in supervised learning we were trying to make the M & the C i.e. the
slope and the intercept values much better.
This is termed as supervised learning
where we know the inputs
the outputs
and the function.
Unsupervised learning or Data mining is where
we try to hunt a pattern.
Once we figure out the pattern
we can give it as a feedback to a supervised learning
or we can make algorithm better in supervised learning.
Unsupervised learning is all about hunting a pattern
and supervised learning is all about making your functions
much better in terms of prediction.
Then we have the other side Data Science
Data Science is all about extracting
information from data.
It's a multidisciplinary field wherein we need to have IT knowledge,
database knowledge, SQL,
programming knowledge like Python and R,
excel knowledge,
domain knowledge and statistical maths knowledge.
It's a multidisciplinary field where we take all of these things and extract information from data.
This information
will be fed into the machine learning ecosystem
to make machine learning better.
From this information
we can come out with the equation
or come out with a formula
and that formula can become a part of an Algorithm.
From this information
we can probably hunt a pattern,
and that pattern can become an input to
unsupervised learning.
Data Science
tries to extract information from the data, and that can become a part of Machine learning.
This is a 30000 foot view of how
AI, ML, Supervised learning, Unsupervised learning and
Data Science all fit together.
The next question is:
can we implement Machine Learning, Artificial Intelligence and Data Science using excel?
The answer is NO.
We can learn and understand the concepts of Machine Learning using excel;
in our tutorials we will use excel a lot, but only for understanding purposes.
Production level work
is not feasible with excel, because an excel worksheet is limited to about
a million rows (1,048,576).
Data in Data Science
and Machine learning
is of huge size, like millions or billions of records.
We need a proper database
and a proper programming language to process such huge data.
That's where
Python comes to the rescue.
The next question is why Python?
Why not C++, Java, C# and so on?
There are so many good languages out there.
They have been around for years and they are backed by big organizations,
so why Python?
If we see from a pure programming language aspect, all these languages are good and equivalent.
Famous programming languages like C++, Java, C# and Python
are all great,
or else they would not rank among the top languages.
They all have equal standing.
When we talk about Artificial Intelligence, Data Science, Machine Learning we need some extra capabilities.
This is not like developing web application or desktop application it is different.
When we say Machine Learning
we need
ready made algorithmic functions; we don't want to write linear regression from scratch.
We want all the statistical
functions to be built in and time tested, without errors.
We need a way to handle large data sets
and to display graphs and patterns.
More than programming we need a lot of other features which
are needed for implementing good machine learning.
All these features we have termed the LAMP features.
L stands for handling large data sets,
A stands for ready made algorithms,
M stands for manipulating large data sets
and P stands for displaying graphs and patterns.
Python has lots of
frameworks and libraries
We have SciPy, Pandas
There are so many
ready made libraries out there
which makes Python a great language for Data Science.
If we talk about C#, Java and C++, they do have some maths functions built in, but only basic ones like logs, square root and so on.
They do not ship with algorithmic functions like
time series
or linear regression.
In Python all of that is readily available through its libraries.
That makes Python
the language of Data Science, the language of AI and Machine Learning.
The first thing we will install is the Python framework.
To install it
go to python.org --> go to downloads.
The latest version at the time of recording is 3.7.2.
My operating system is Windows so I am installing the Windows version;
in case it is Mac then download accordingly,
and in case it is Linux or another platform do accordingly.
If we click on downloads out here
it again takes to a much larger screen where we can say click on windows, click on Linux
If we click on Windows it shows the latest release at the top
after clicking on 3.7.2 it takes us to this page and gives lot of options
There is mac 64 bit installer
mac 32 bit installer
Windows 64 bit installer
Windows 32 bit installer
Depending on
what kind of operating system we have, should install the appropriate installer.
This is a Windows version
and the machine is
64 bit.
Once we install the Python framework
we should be able to see
this thing out here,
the Python desktop app.
It opens up something like this.
Over here we can type Python commands.
For example, print("This is my first python code")
should print that text.
In this small command window
we can test whether Python is installed and working.
The first step is to
install the Python framework and ensure it is installed properly.
When we talk about enterprise projects, big projects, we cannot code using an interpreter window like this.
We really need
an IDE with an editor and Intellisense.
For that we have
a famous tool which is specifically for Python and which is
widely used by the Python community,
which is PyCharm.
PyCharm
is created by JetBrains.
In PyCharm we have two downloads: one is the
professional edition, for which we need to pay later on,
and the other one is the community edition,
which is free and open source.
Will download the community edition
once we download it we get something like this
Search for Jetbrains PyCharm Community Edition.
First we need to install the Python framework
and test that if the Python framework is running by this terminal.
once we are done with that the next thing we need to
install PyCharm
from Jetbrains
This PyCharm can be downloaded from jetbrains.com or google PyCharm download
will get the first link which points towards the download of PyCharm.
We have created some basic projects,
and the next thing is that the fonts are not really looking great.
To make the fonts bigger
go to Settings -->
Appearance.
To code anything
using PyCharm first create a project
in directory give the name as
Pythonlab1
and hit Create.
Will open this project in current window.
It is creating an environment, it is bringing all the necessary
references what we need to create and execute the Python  project
It has created
the Python project right out here.
Now we need to add some code here.
add a python file
Right click on the Project --> Add --> New---> Python File.
This is lab 1
All Python code files have the extension .py
We will reiterate the statement we made when we started.
In this first one and a half hour
we are not really trying to go in depth into Python, algorithms,
R or SciPy.
We are trying to make sure that after this one and a half hour of video we are acquainted with the ecosystem of AI, understand the vocabulary and the differences between ML, AI and DS, and understand what Python, SciPy and R are,
which tools are necessary and from where to download them.
We have dedicated labs to learn Python.
In the code part,
if we type print we see the intellisense right out there.
This is my first python code.
The small warning says it wants a blank line at the end of the file; that is a Python style convention.
Press enter and the warning goes off.
To execute this code click on Run
Down below it says
Which file you want to run. We have just one file so will run that file.
We have small command window showing the output. This is the first Python code.
In the same way we can declare a variable here, x = 0; that is how we declare it in Python.
This is the first Python code.
We will increment the value.
Again there is a small warning here: it wants whitespace here and here.
Here we will say plus x, to display x as well.
This x is numeric, and the way Python works we cannot concatenate a number to a string directly.
It complains while concatenating, saying this x is numeric,
please convert it to a string. For converting to a string we have the str function.
Use the str function and run this from here or else run from the right click menu.
We can see "This is my first Python code" and also see the variable value.
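Put together, the steps of this small lab look roughly like the sketch below. The exact text typed in the video is not fully visible in the transcript, so the message wording and the variable name message are assumptions.

```python
print("This is my first Python code")

x = 0      # assigning a value is what declares the variable
x = x + 1  # increment the value

# "..." + x would fail with a TypeError because x is numeric,
# so we convert it to a string with str() before concatenating.
message = "This is my first Python code, x = " + str(x)
print(message)
```

Note the spaces around = and +; they are what the small whitespace warnings in the editor ask for.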
We will try to implement the linear regression using Python.
Can we find out the slope and the intercept using Python?
As said in the previous part of the video, Python is a
language
and this language has a lot of libraries.
To do scientific calculations
and scientific algorithms
there is a library called SciPy.
SciPy is a library which has all those things like linear regression, statistical formulas and so on.
The first thing we need is to include
SciPy in the project.
Go to File --->
Settings and expand the project ---> Project Interpreter.
The Project interpreter helps us to reference packages.
Click Add, search for
SciPy
and click Install package.
Once installation finishes it will show the installation as done.
This takes quite some time; SciPy is a very big framework and package,
so it will take some time to download.
In the project, if we look at the libraries,
we have SciPy. SciPy uses NumPy and other things as well.
First we need to reference
SciPy.
To reference SciPy in a project use this import statement: from scipy import stats
It imports the stats module into the project under the name stats.
This is the name by which we will refer to the SciPy statistics functions. The values of the no. of people and weight lifted
we have put into arrays.
Over here,
in the next statements, we have put the values into arrays.
These provide the x
and y values.
Here is a warning saying
this is an unused import statement, in other words we have not yet used
stats.
Now we need to
take this stats,
x & y
and
try to execute the linear regression.
stats has a function called linregress.
This will help us to execute the linear regression.
Here we execute the linear regression; the first value is the x array and the next value is the y array.
Whatever is the output, we put it into these variables.
We have created 5 variables:
one in which we will get the slope, then the intercept, r value, p value and standard error.
We will print the slope and the intercept.
This is the goal.
In Python we do not have to say var x or var slope; the moment we say slope and assign a value
the variable is declared and created.
It says this is numeric and cannot be concatenated. We need to use the str function.
Later on we will be looking into all the syntax; for now we will execute it to see the output and see how Python and SciPy work together.
Again we will run lab1 and the output says
the slope is 20.0
and the intercept is 3.0
From the formula we had calculated the slope as 20 and the intercept as 3.
The output is matching
the values
which we derived in excel.
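The lab described above could look roughly like this sketch. It assumes SciPy is installed and uses stats.linregress from scipy. The readings 3 -> 63 and 10 -> 203 come from the excel example; the readings for 5 and 7 people are hypothetical values placed on the same line so there are more than two points.

```python
from scipy import stats

# Number of people (x) and weight lifted in kg (y).
# The readings for 5 and 7 people are hypothetical, added for illustration.
people = [3, 5, 7, 10]
weight = [63, 103, 143, 203]

# linregress takes the x values first and the y values second and
# returns the slope, intercept, r value, p value and standard error.
slope, intercept, r_value, p_value, std_err = stats.linregress(people, weight)

# Numbers must be converted with str() before concatenating with text.
print("Slope is " + str(slope))
print("Intercept is " + str(intercept))
```

Up to floating point rounding, the slope and intercept should agree with the Y = 20X + 3 formula worked out in excel.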
This was a very simple example of how to use Python, SciPy and
implement data science.
One more programming language which has gained momentum is R.
R is a very specific statistical programming language created by statisticians for statisticians.
It is not a general programming language. It never started out saying we will be creating desktop applications and web applications.
It is not a general programming language like Python, C++, Java and C#.
We can create web apps and desktop apps using R,
but there are other languages like Python, C# and Java which can do that work better.
We will do the same demo of linear regression using R.
As we did for Python, where we installed the framework and then an editor, for R we will first install the framework and then on top of it use an editor or studio.
Google R download and go to the first link and download R for Windows.
Once we download it
we will get a small R GUI; this is not an editor but a terminal where we can execute R syntax.
In the coming videos we will have separate sessions for R; here we just want to make sure we are able to understand the linear regression formula and the linear regression code.
To create a variable in R, type str and then the assignment sign <- followed by the value;
the sign says: to this str assign the value test.
That's how we create a variable.
To print this variable say print(str)
For enterprise projects we need a studio and an editor, so download RStudio on the computer.
Go to the RStudio website and download the free RStudio Desktop, open source license.
Once it is installed we will get
this icon.
We can write R code using R Studio
Create a small R script
and write  the code linear regression using R.
We take the same linear regression problem which we did with excel and with Python, and now do the same demo using R.
We have written 4 lines of code; this will help us to
get the slope and the intercept
using R.
First we have defined two variables, i.e.
single dimension arrays, x & y.
This is how we define an array in R:
c is a function to which we pass the values and it creates a vector, a single dimension array.
We have created
x, the array of no. of people, and second y, which is the weight lifted.
Then we create a variable called relation using linear
regression modeling. lm is a function which invokes
linear regression.
This y will be calculated on the basis of x.
The small tilde sign in y ~ x indicates that y will be calculated on the basis of x.
R is a typical statistician's language.
Programmers would pass y, x as arguments,
whereas statisticians look at
the left and right side of the equation.
They look at dependent and independent variables.
y will be calculated from x, and the tilde sign expresses that relation.
All of these codes are available as download on the website
Go to R script we had created
and execute it.
Select this --> Run
Down below
it has displayed
3 and 20 which is matching
to excel values.
This was a small demonstration of
how to use R & R Studio.
That brings us to the end of this video training.
We have covered all the 18 chapters which we discussed and promised when we started this training.
Even
completing this 1 hour of training is not a small task, so we are gifting an e-book of Data Science interview questions: 150 pages with almost 100 of the most asked questions on Python
from the Data Science aspect.
To get this ebook
go to the link of data science training institute in Mumbai Saki Naka, India and put a feedback of this 1 hour of training on this link.
After putting feedback email us at questpond@questpond.com to get the ebook download.
We are going to flash a revision screen
wherein, in one-liners, we will summarize whatever we have learned in this 1 hour of video.
Thank you very much
