Hey guys, it's Navjyotsinh Jadeja,
and welcome to today's lecture. Today we are going to talk about an introduction to the data warehouse.
It is part of the data mining tutorial series,
and the data warehouse is a very important concept for understanding how data mining works.
So let us see the definition in detail.
This definition is taken from the book Building the Data Warehouse by W. H. Inmon.
It is a very simple yet very effective
definition, which is why I've taken it. It is the same one I learned when I was doing my engineering.
So what is a data warehouse? A data warehouse is a subject-oriented,
integrated, non-volatile, time-variant collection of data
that supports management decisions and forms the base
of the DSS, which is the Decision Support System.
Now, what is subject-oriented? Subject-oriented means
it is organized around the specific subject we are looking at.
So if you are in e-commerce and looking at buying behavior,
then that is the subject. Integrated, because the data comes from multiple sources.
It might come from spreadsheets.
It might come from account sheets, databases, or even hard-copy data
that is stored in
paper-bound records. Non-volatile, because once the data is loaded it is not changed or deleted, so it stays for a very long time.
So it is not a volatile form of data. Time-variant, because the data comes from different
points in time. When I say time-variant,
I mean that every piece of data belongs to a particular point along the timeline,
and because the data warehouse stores data
for a very long period of time,
it holds data from many different time periods. And it is a collection of multiple types of data.
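
To make the non-volatile and time-variant ideas concrete, here is a minimal sketch, not a definitive implementation. It assumes a hypothetical `price_history` table in an in-memory SQLite database and a hypothetical helper `load_snapshot`; neither comes from the lecture. Each load only appends dated rows and never updates or deletes existing ones, so the history stays available.

```python
import sqlite3

def load_snapshot(conn, snapshot_date, prices):
    """Append one dated snapshot; existing rows are never updated or deleted."""
    conn.executemany(
        "INSERT INTO price_history (product, price, snapshot_date) VALUES (?, ?, ?)",
        [(product, price, snapshot_date) for product, price in prices.items()],
    )
    conn.commit()

# Hypothetical warehouse table, kept in memory for the example
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE price_history (product TEXT, price REAL, snapshot_date TEXT)"
)

# Two loads made at different points in time (illustrative values)
load_snapshot(conn, "2023-01-01", {"laptop": 900.0})
load_snapshot(conn, "2024-01-01", {"laptop": 850.0})

# Both time periods are still there, because nothing was overwritten
history = conn.execute(
    "SELECT snapshot_date, price FROM price_history "
    "WHERE product = ? ORDER BY snapshot_date",
    ("laptop",),
).fetchall()
print(history)  # [('2023-01-01', 900.0), ('2024-01-01', 850.0)]
```

A day-to-day transactional table would typically overwrite the old price with the new one; here the old row is kept, which is what lets you compare across time.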
So let us see what exactly this means. It means that we are talking
about a central location where the different databases that are part of the organization
are consolidated into one form
and stored. It is not the same as the database for your day-to-day transactions.
So I would like to clarify this for students as well as anyone else trying to understand it.
A data warehouse is not the database that is used for day-to-day transactions.
That is known as OLTP, which is online transaction processing.
A data warehouse is not that; it is used for OLAP, which is online analytical processing.
So these are two different kinds of systems,
and they are kept separate. The warehouse is stored in a different, central location, and again,
it is not accessed by everyone.
Only a limited set of people get to access it.
And a very important thing I need to mention is that it is not updated every day.
It is updated periodically, with
large amounts of data at a time. This diagram helps
you understand what that means.
As you can see here, we have the operational data,
which is our day-to-day data. It is fed into the system, where extraction, transformation,
and loading (ETL) are done into the data warehouse.
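
The extract, transform, load flow just described can be sketched roughly as follows. This is only an illustration under stated assumptions: the `operational_rows` records, the `sales_fact` table, and the `extract`, `transform`, and `load` helper names are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical operational (day-to-day) records with inconsistent formatting
operational_rows = [
    {"order_id": 1, "customer": " alice ", "amount": "120.50", "date": "2024-03-01"},
    {"order_id": 2, "customer": "BOB", "amount": "80", "date": "2024-03-01"},
]

def extract(rows):
    """Extract: read the records out of the operational source."""
    return list(rows)

def transform(rows):
    """Transform: clean up names and convert amounts to numbers."""
    return [
        (r["order_id"], r["customer"].strip().title(), float(r["amount"]), r["date"])
        for r in rows
    ]

def load(conn, rows):
    """Load: write the cleaned records into the warehouse table."""
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)", rows)
    conn.commit()

# In-memory SQLite database standing in for the data warehouse
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales_fact "
    "(order_id INTEGER, customer TEXT, amount REAL, order_date TEXT)"
)

load(warehouse, transform(extract(operational_rows)))

loaded = warehouse.execute(
    "SELECT customer, amount FROM sales_fact ORDER BY order_id"
).fetchall()
print(loaded)  # [('Alice', 120.5), ('Bob', 80.0)]
```

Keeping the three steps as separate functions mirrors the three boxes in an ETL diagram; a real pipeline would read from actual source systems rather than a Python list.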
Then OLAP, which is online analytical processing, is performed on the data warehouse,
and the results of that analysis go to the business end users,
in the form of business intelligence
or in some other form they can use
for decision-making. And why is a data warehouse needed?
Because data comes from multiple sources
and in multiple formats. We need a data warehouse
because otherwise that data cannot be used directly
for visualization or fed into a data mining system like WEKA
or any other system. So we need a data warehouse, where integration happens and processing happens,
and then the data is fed on for visualization.
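
As a rough sketch of why this integration step matters, the example below brings two sources in different formats into one common shape and then runs a simple analytical roll-up. The CSV and JSON contents, the field names, and the helpers `from_csv` and `from_json` are all assumptions made for illustration; none of them come from the lecture.

```python
import csv
import io
import json
from collections import defaultdict

# Two hypothetical sources: same kind of information, different formats and field names
csv_source = "region,amount\nnorth,100\nsouth,50\n"
json_source = '[{"area": "north", "total": 25.5}]'

def from_csv(text):
    """Map CSV rows onto the common {region, amount} shape."""
    return [
        {"region": row["region"], "amount": float(row["amount"])}
        for row in csv.DictReader(io.StringIO(text))
    ]

def from_json(text):
    """Map JSON items (which use different field names) onto the same shape."""
    return [
        {"region": item["area"], "amount": float(item["total"])}
        for item in json.loads(text)
    ]

# Integration: every record now looks the same, whatever its origin
integrated = from_csv(csv_source) + from_json(json_source)

# Processing: a simple roll-up of amounts per region
totals = defaultdict(float)
for record in integrated:
    totals[record["region"]] += record["amount"]

print(dict(totals))  # {'north': 125.5, 'south': 50.0}
```

Only after the records share one format does the roll-up become a few lines of code, and the same integrated data could then be handed to a visualization or data mining tool.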
So that is it for the data warehouse. Keep watching the other videos
for other concepts. I hope you liked our effort,
and if you did, please like, share,
and subscribe to our channel. You can also refer to edtechnology.in
for the same resources. Thank you.
