Welcome to this introductory
course on Data Science.
Data science, in simple terms,
is using data to draw an inference
or predict an outcome.
Such information can help us
make better decisions.
For instance, say you want to
choose your next vacation
destination.
You would like to find a perfect
place that fits your budget,
your activity or interest,
recommended trip duration,
language, food and cultural
preferences, security and the political
situation, and other factors.
Considering all these factors,
let us say there is a tool
which can predict your next dream
vacation location.
Can you guess
how that tool might work?
A simplistic one would have
all these data points for say
a thousand locations across
the globe.
Based on your inputs,
it will come up with a cumulative
score, which we will call the “Vacation
Preference Score”, or VPS for short.
It will then recommend locations
whose scores are close to your VPS.
Sounds simple enough.
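To make the idea concrete, here is a minimal Python sketch of the
recommendation step. It assumes every location already has a single
numeric score; the scores, the function name recommend and the top_n
parameter are all made up for illustration and are not part of any
real tool.

```python
# Hypothetical, precomputed scores for a handful of locations (made-up numbers).
location_scores = {
    "Bora Bora": 82.0,
    "Zermatt": 74.5,
    "Miami": 68.0,
    "Timbuktu": 35.0,
}


def recommend(user_vps, scores, top_n=2):
    """Return the top_n location names whose score is nearest to user_vps."""
    # Sort location names by the absolute gap between their score and the user's VPS.
    ranked = sorted(scores, key=lambda name: abs(scores[name] - user_vps))
    return ranked[:top_n]


recommendations = recommend(70.0, location_scores)
print(recommendations)
```

A real tool would compare many factors rather than one number,
but the "find the closest match" idea stays the same.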
The complexity lies in collecting
relevant data for all these locations,
cleaning it, quantifying it,
and calculating VPS.
A location can be cheap
or expensive based on its average
one-night hotel rate.
This
data can be downloaded from a hotel
booking website.
The political
situation can be gauged from news
websites using Natural Language
Processing tools.
The recommended
trip duration can be found on popular
vacation websites.
You might find
that, for some locations, you have
no data on the political situation
or recommended trip duration.
These
fields might need to be filled in
with a “default” value.
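One possible way to fill such gaps is sketched below in Python.
It assumes missing values are stored as None and, unless you pass
your own default, uses the median of the known values as the
"default". That choice, the sample durations and the function name
fill_missing are illustrative assumptions only.

```python
from statistics import median

# Hypothetical recommended trip durations in days; None marks missing data.
trip_days = {"Bora Bora": 7, "Zermatt": 5, "Miami": None, "Timbuktu": 4}


def fill_missing(values, default=None):
    """Return a copy of values with every None replaced by a default.

    If no default is given, the median of the known values is used
    (this assumes at least one value is known).
    """
    if default is None:
        known = [v for v in values.values() if v is not None]
        default = median(known)
    return {name: (default if v is None else v) for name, v in values.items()}


filled = fill_missing(trip_days)
print(filled)
```

Whether a median, a mean, or a fixed value is the right default
depends on the field, so treat this as a starting point.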
Once you have all the data at hand,
you need to come up with a method
to calculate VPS.
A simple average of
all scores might not work.
You might need to standardise
your numbers.
You might need a
weighted average.
You might need
a regression analysis and so on.
There are plenty of data analysis
techniques which you can work with.
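As a rough illustration of two of these techniques, the Python sketch
below standardises two features into z-scores and then combines them
with a weighted average. The feature values, the weights 0.7 and 0.3,
and the function names are hypothetical; this is one simple way to
combine factors, not a prescribed VPS formula.

```python
from statistics import mean, pstdev

# Hypothetical raw features for three locations (made-up numbers).
hotel_rate = {"Bora Bora": 400.0, "Zermatt": 250.0, "Miami": 100.0}  # per night
safety = {"Bora Bora": 9.0, "Zermatt": 8.0, "Miami": 7.0}  # rating out of 10


def standardise(values):
    """Convert raw values to z-scores: (x - mean) / standard deviation.

    Assumes the values are not all identical (otherwise the deviation is zero).
    """
    numbers = list(values.values())
    mu = mean(numbers)
    sigma = pstdev(numbers)
    return {name: (v - mu) / sigma for name, v in values.items()}


def weighted_average(features, weights):
    """Combine several per-location feature dicts into one score per location."""
    total_weight = sum(weights)
    names = features[0].keys()
    return {
        name: sum(w * f[name] for f, w in zip(features, weights)) / total_weight
        for name in names
    }


z_rate = standardise(hotel_rate)
z_safety = standardise(safety)
scores = weighted_average([z_rate, z_safety], weights=[0.7, 0.3])
print(scores)
```

Standardising first puts hotel rates and safety ratings on the same
scale, so the weights decide how much each factor counts rather than
the size of the raw numbers.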
What matters is how relevant
your list of recommendations is.
Did your model predict Timbuktu
when your actual choice was
Bora Bora?
Or did you recommend
that your friend go to Zermatt
when she actually wanted to go to
Miami?
It is quite important to test
the accuracy of your analysis by
working with different data points.
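One simple way to put a number on this is a "hit rate": the share
of test cases where the person's actual choice appears in the
recommended list. The Python sketch below uses made-up test cases,
and hit_rate is just an illustrative name; other accuracy measures
may suit your problem better.

```python
# Hypothetical test cases: what was recommended vs. what each person actually chose.
test_cases = [
    {"recommended": ["Miami", "Zermatt"], "actual": "Miami"},
    {"recommended": ["Timbuktu", "Zermatt"], "actual": "Bora Bora"},
    {"recommended": ["Bora Bora", "Miami"], "actual": "Bora Bora"},
    {"recommended": ["Zermatt", "Timbuktu"], "actual": "Miami"},
]


def hit_rate(cases):
    """Fraction of cases whose actual choice is in the recommended list."""
    hits = sum(1 for case in cases if case["actual"] in case["recommended"])
    return hits / len(cases)


accuracy = hit_rate(test_cases)
print(accuracy)
```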
With the advent of Data Science,
almost everyone relies on data
to make better decisions.
Whether choosing an insurance plan,
finding a restaurant, selecting an
internet plan, or creating a marketing
campaign, decision makers now
rely on data.
Hence, every domain
requires data scientists and analysts.
In this course, you will learn to
play with data from the field of sports.
We all love to predict the winning team.
Now let us use data to help us
make informed predictions.
We will work on the English Premier
League and help you come up with your
own prediction of a winning team!
We hope you enjoy this course!
