This is our first lecture of the semester. To give you an overall idea of what data mining is: recently the term data mining has been changing to what is called predictive analytics, so you will hear the two terms used interchangeably; predictive analytics means exactly data mining. So how can we use data mining in the business domain? Historically, data mining was born in the computer science discipline, and it was mainly used to discover patterns in data, such as fraud detection, or discovering patterns in DNA sample data to find which gene, for example, causes a given disease, and in many other fields. However, over the last couple of years, no longer than three years I would say, data mining became part of business, because businesses started to understand that they can take advantage of the knowledge they collect from their customers to make their organization more profitable.

So if you look at the following picture here, imagine those are customers on the Amazon website. As all of us know, Amazon sells products; it started with books, but now there are many different types of products, including movies and music, you name it. In order to build a business intelligence technology that can produce some type of profile for each customer, Amazon looks at the behavior of its customers while they are browsing the website. At the same time, it looks at the historical information of each individual customer: how often they purchase from Amazon, how much they spend on the Amazon website, and how long they have been users of the website. Besides that, they collect data about what type of credit card the customer uses, and also what kind of communication the customer prefers and which type of email address it is: is it an organizational or educational email, one that ends with .org or .edu, or a generic type of email such as Gmail or Yahoo, or an email that ends with a business domain such as .com? So all this type of information
Amazon is already collecting, and each one of those users has a unique number, and that number represents the profile of that user. For instance, I purchase more books from Amazon than my students do, so for Amazon I am a customer who is highly likely to spend more money on their website and visit it more often. So when they want to send a promotion, if they notice that I haven't been purchasing for a period of time, the promotion they send me is going to be different from the one sent to a similar customer, or maybe to a student, who also hasn't purchased recently from Amazon. The amount of purchases that student makes on Amazon is not as large as the amount of purchases I make. So the promotion I would receive from Amazon, to encourage me to come back and purchase more books, or maybe to try a new type of product they incorporated recently, as you know, music and Alexa, is not going to be the same as the one they send to a student who is not as active as me. Therefore each one of us has this number, and that number represents some type of knowledge about the profile of that user on their website: browsing, purchasing, communicating, maybe becoming a seller on their website, you name it.

So the whole idea of data mining, or predictive analytics, in the business domain is that the organizational structure, or the business model, has changed in recent years for all business organizations, in a way that they are taking advantage of the available data about their customers. It doesn't matter what kind of data: transactional data, behavioral data, profiling data. They are taking this data and trying to use it as a strategic asset, to build some type of collective business experience gathered from the customers, in order to learn from this data.
Accordingly, based on this data, based on this information about each customer, they want to treat each customer individually, and that is what is called the concept of personalization. Eventually we are going to end up one day where, when you use your computer to browse the Internet, all the sites you see will be totally different from the ones I see on my computer, and you can already see that happening. When you log into Facebook, for example, or Instagram, or any website, let's say Barnes & Noble or Amazon, look at the commercials and the recommendations you get. On Amazon, for example, it tells you: the person who purchased this book, similar to the one you purchased recently, also purchased this other book. So they give you recommendations for other products. The same happens to me; however, the recommendations, the personalized selection of books they recommend to me, are different from the ones recommended to you, because of my history on the Amazon website: what type of books I purchased, what type of products I purchased. Okay, so this is the whole idea:
eventually that is going to apply to everything on the web. It's very expensive to do; only big companies are doing this type of activity now. Facebook, for example: you go to Facebook and you see all those commercials on the side of the page, with pictures of products related to whatever you did on the internet over the last couple of days, and those are different purchase recommendations from the ones I see when I go to my Facebook account. So the goal of those businesses in the future is to give this personalized experience to the customer, so that they enjoy their browsing, for example. And that applies not only to the web; it applies to in-store businesses too. Take Best Buy: if you do your purchases in store, and you go to Best Buy and purchase some new products, you will notice a couple of days later that the coupons that come to your home address are somehow related to the products you purchased. The same if you purchase food from Walmart or Kroger, you name it: at checkout you get promotions that are somehow related to the products you purchased in the past, plus some new products you have never purchased. They try to tempt you to purchase a new brand that they think you would highly likely purchase if you knew they carried it, and to come back next time with the discount they offer you.

So the whole idea is that, using predictive analytics algorithms, we are going to play the role of an analyst in a business domain and try to learn from the data. We are going to try to learn and collect all this experience, all this knowledge, based on the sales records,
customer profiles. When I say profiles, for example: when you filled out an application at Best Buy, you listed your salary there. If you didn't list a salary, at least they asked whether you own a car, so you listed that. This profile that you filled in is going to dictate what kind of promotions you get in the future. Not only that: they compare the profile you filled in with the credit card agency, so they know whether what you said is true, and they give you a credit card, a Best Buy credit card. The amount you have on your Best Buy card is different from the one I have, based on my profile.

Okay, so here is what we are going to learn. You have data, and you are working as a business analyst in this domain; say it's Best Buy and you became the manager. You are trying to discover strategic insight and business intelligence, in a way that helps your business make a good decision regarding the buyers. Those decisions cannot be made manually, because you have hundreds, thousands, sometimes tens of thousands of customers. You need a tool, a software program, that can help you discover the patterns in your customers and give you business rules that tell you: okay, to this customer you can send this type of product with, for example, a discount of 10%; however, to that other customer, based on the profile I see here, you should send, for example, this product that costs $5,000, maybe a TV with 3D features, maybe a curved TV, you name it. We predict that if you send that person this promotion of 10%, that person is highly likely to come and purchase it from our store. So they know everything
about what we do in the store and on the web, and they keep tracking all our behaviors. Accordingly, the promotions you get over time are dictated by this type of program using data mining, that is, predictive analytics software that helps you build models of your customers, so you can sell them more products and not lose them. This is a big issue: not losing them to competing companies. If you are the manager of Best Buy, and there is a customer who comes to your store regularly and hasn't purchased a TV in the last five years, for example, you don't want that customer to go and buy his or her TV from a competitor like Circuit City, or at the mall, or online from Amazon. You want to take advantage of that. Knowing this information about your customer will help you give a very good discount specifically to that customer, and you know they are going to take it and purchase from you.

Okay, so the whole idea is that you have multiple types of data you can use in your organization, and you want to gain wisdom from this data by creating models. We have a software package we are going to use called RapidMiner. This software is going to model the profiles of customers, and it is going to give me decisions and business rules that help me gain a competitive advantage over other companies.

So what type of data do you have? The customer profile, for example: where that customer lives, what kind of salary he or she has, so you know which demographic category that person falls in, the exact address, the telephone number, and all the information about that person. This information was given when the person applied for a Best Buy card, for example. If you are a manager in a bank, that information will be about a customer who uses your bank. So you have all this information about that customer: whether he is married or not,
whether he has kids or not, all this information. Besides that, there is the customer behavior, which is what you collect from the activities the customer does when he comes to your organization. If you work for Best Buy, you are talking about a user who comes to the store, for example, and purchases equipment. If the user is an online user, you can also have access to their online activities through what is called the customer contact logs; you can see this from the browsing behavior. So if I went to the Best Buy website, logged into my account, and started looking at laptops, for example, without purchasing, just looking at prices for different types of laptops, and then decided no, I'm not going to purchase, then in two or three days what I would notice is that either I get this hook, a discount promotion on similar laptops coming from Best Buy, or I get an email. So that's a type of tracking, even without purchasing.

All businesses are doing this. If you're not doing it, you are going to end up bankrupt, because if you don't track what your customers are doing, their behavior and activities, you are losing those customers to other companies. So the idea is that you take advantage of this data and create some type of models, and those models help you predict what that person is going to do next. And here is where your goal lies: you know that person is highly likely to buy a TV, because he hasn't bought a TV in the last five years, and also because he was browsing TVs on the Best Buy website. That is how you have this information; you even know exactly which brands he was browsing. So you take advantage of this information and you send a discount to their address or by email.
Okay, so one thing you have to understand is that creating a model is not sufficient. After you create a model, your role is to apply this model to the data so you can predict what is going to happen. Sometimes you use what is called scoring a customer. It means that you know what similar customers, with browsing activities like this one's, ended up doing in the past; you have this information in your data. So you apply the same concept, and that is what is called scoring: you take the information about customers who haven't bought yet, for example a TV, and compare it with data about past customers who bought TVs and had a similar profile, which means a similar job or salary, a similar location, similar activities on the web. Then you make the recommendation and send the response, which in this case may be a promotion to that person. Okay.
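The scoring idea described here can be sketched in a few lines of Python. This is only an illustration under assumptions: the customer records, the field names, the `min_shared` similarity threshold, and the 0.5 promotion cutoff are all hypothetical, and a real tool such as RapidMiner would use a trained model rather than this simple matching.

```python
# Hypothetical past customers: profile attributes plus what they ended up doing.
past_customers = [
    {"income": "high", "region": "north", "browsed_tvs": True, "bought_tv": True},
    {"income": "high", "region": "north", "browsed_tvs": True, "bought_tv": True},
    {"income": "high", "region": "south", "browsed_tvs": True, "bought_tv": False},
    {"income": "medium", "region": "north", "browsed_tvs": False, "bought_tv": False},
    {"income": "low", "region": "south", "browsed_tvs": False, "bought_tv": False},
]

PROFILE_FIELDS = ("income", "region", "browsed_tvs")


def similarity(a, b):
    """Count how many profile fields two customers share."""
    return sum(a[f] == b[f] for f in PROFILE_FIELDS)


def score(customer, history, min_shared=2):
    """Fraction of sufficiently similar past customers who bought a TV."""
    similar = [p for p in history if similarity(customer, p) >= min_shared]
    if not similar:
        return 0.0
    return sum(p["bought_tv"] for p in similar) / len(similar)


# A customer who has browsed TVs but not bought one yet.
new_customer = {"income": "high", "region": "north", "browsed_tvs": True}
tv_score = score(new_customer, past_customers)

# Illustrative cutoff: send the promotion when the score is at least 0.5.
send_promotion = tv_score >= 0.5
```

Three of the past customers count as similar here and two of them bought a TV, so the new customer scores about 0.67 and would receive the promotion.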
Accordingly, when you say that you applied a model to a customer, you scored that person with the model, and you came up with a decision to send a promotion, this is what is also called the business logic. You are creating some type of business rule, and that business rule is actionable. It could be something such as: mail a solicitation, or suggest a cross-sell option. Or, if you are worried about retention of that person, you might just upgrade his plan to a better one without making him pay extra. That usually happens in telecommunications. You see it with customers on internet contracts, like Spectrum here in Bowling Green. Spectrum notices that some customers are not watching their movies and the TV shows they subscribe to, like Showtime, etc., and haven't been active for a period of time. They figure this customer may be about to leave them and sign another contract, let's say with AT&T and DIRECTV. So what they do is send this upgrade: we upgraded your internet speed from 100 to 200 gig with no increase in payment. The customer will be happy, and they will retain that customer. That kind of business rule, or business logic, you come up with is based on the predictive model you built.
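As a rough sketch of what "business logic" on top of model scores might look like, the function below maps two scores to one of the actions mentioned here. The function name, the score names, and every threshold are assumptions made for illustration; they do not come from any particular company's rules.

```python
def choose_action(purchase_score, churn_score):
    """Map model scores (0 to 1) to one actionable business rule.

    Thresholds are illustrative assumptions, not tuned values.
    """
    if churn_score >= 0.7:
        # Retention first: upgrade the plan at no extra charge.
        return "free plan upgrade"
    if purchase_score >= 0.6:
        return "mail promotion"
    if purchase_score >= 0.3:
        return "suggest cross-sell"
    return "no action"


# An inactive subscriber with a high predicted risk of leaving.
action = choose_action(purchase_score=0.2, churn_score=0.9)
```

The ordering of the checks is itself a design choice: here retention wins over selling, on the assumption that losing a customer costs more than one missed promotion.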
Another example, just to show you how the model is built. We are going to go through all of this during the semester with real cases, so for now I'll just explain what you are doing. A simple example: let us assume that each row here is one customer, and each column is one attribute, or you can call it a variable; we have dependent and independent variables. Say we are trying to predict what a person is going to purchase based on what the person purchased in the past and some profile information. Our independent variables are the number of purchases, the last purchase, the gender, and the income. The dependent variable, the one we are trying to find, depends on all the rest: based on the model we are going to build, we want to know what that person's next purchase is highly likely to be, given the previous purchases. This is a simple example; we are going to see more examples that really make sense, but the picture is the following. When the model notices that a person purchased shoes in the last purchase, is male, and has high income, then highly likely, if you send the promotion for gloves, he is going to purchase them. What else? Here, for example, we have a female who purchased gloves last time and whose income is medium; the model will say she is highly likely to purchase a hat. Okay, etc. And look at the last one, which doesn't make sense: what is the relationship between gloves and a piano? Yes, you are going to get some rules that don't make any sense to you, and we are going to learn how to deal with cases like this.

Okay, this type of information is what you call the training data, and this type of data is called a flat table, because we only have one data table. We don't have, like in Access, ten tables related to each other by some key elements. The table is simple: this one table has all the information about each customer. And this is what we are trying to predict; this is our dependent variable, and it depends on all the rest. So the decision for gloves depends on the attribute called last purchase, which in our case, in the first row, was shoes. The gloves decision also depends on the gender being male and on the income being high. Okay.
Now, what I would like to remind you of always is that the best business model for giving recommendations to customers is a model that can distinguish one customer from another. Each of us should be considered a snowflake; no one looks like another. For example, the biggest company in the world that knows how to treat its customers individually, with a personalized experience, is Amazon, because they started a long time ago. They were among the first in this domain, which is called collaborative filtering techniques. Part of my PhD dissertation was actually in that area. Amazon is capable of doing this because they have plenty of historical data collected over time, and they took advantage of this information to keep track of their customers and provide them with the best and most individual experience.

Okay, so we are going to look at one of those models during the semester, called the decision tree model, here for cross-sell. What does that mean? It's a very easy way to create business rules. When you create the model, the model uses a specific measurement to decide which attribute, among those three we saw here, is the most important for deciding what the customer is going to buy next. Different software uses different measurements: for example, some software uses a specific log-based type of measurement, and RapidMiner uses a different type of measurement to decide the purity of the split. You can use either the purity or the noise in the data. We are going to talk about this when we cover decision trees in general; as I said, this is just an overview showing you the meaning of data mining.

So you are going to use data mining techniques, which are also called machine learning techniques or algorithms, to help you. You give the data to the machine learning algorithm, in this case the decision tree, and the decision tree looks into the data and finds that the best way to know what the person is going to do next, or what the person is going to purchase next, is in this case to start by asking the question: did he buy shoes or not? If the person bought shoes and was female, highly likely the next purchase is either a hat or gloves. Now look at this: if the person's last purchase was shoes and the person had high income, female or not, highly likely that person will purchase shoes or a piano, which have no relationship, but that's what the model found for us. Okay.
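To make the "measurement that decides which attribute is most important" concrete, here is a minimal sketch using entropy-based information gain, which is one common purity measure for decision trees. It is an assumption that this matches what any given tool uses by default, and the six-row flat table below is made up for illustration; it is not the table from the slide.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ("last_purchase", "gender", "income")

# Hypothetical flat table: one row per customer, one column per attribute.
rows = [
    {"last_purchase": "shoes", "gender": "male", "income": "high", "next_purchase": "gloves"},
    {"last_purchase": "shoes", "gender": "female", "income": "high", "next_purchase": "gloves"},
    {"last_purchase": "shoes", "gender": "male", "income": "medium", "next_purchase": "gloves"},
    {"last_purchase": "gloves", "gender": "female", "income": "medium", "next_purchase": "hat"},
    {"last_purchase": "gloves", "gender": "male", "income": "medium", "next_purchase": "hat"},
    {"last_purchase": "gloves", "gender": "female", "income": "high", "next_purchase": "hat"},
]


def entropy(labels):
    """Impurity of a list of labels: 0 when pure, higher when mixed."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target="next_purchase"):
    """How much splitting on `attribute` reduces the impurity of `target`."""
    parent = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent - remainder


gains = {a: information_gain(rows, a) for a in ATTRIBUTES}

# The attribute with the highest gain becomes the first question in the tree.
best = max(gains, key=gains.get)
```

In this made-up table the last purchase separates the next purchases perfectly, so the tree would start by asking about the last purchase, in the same spirit as the "did he buy shoes or not" question.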
This is one of the most important graphs that you will see toward the end of the semester, the last months of the semester, from October on, and you can use it to understand the outcome of your project. So imagine the following: you are the analyst in a company. Let us make it simpler: imagine that you are the analyst for Best Buy, and Best Buy decided to create a promotion campaign, let's say a coupon campaign, to attract customers during Black Friday. Thanksgiving is coming, so let's say that's the business problem you are solving here. The CEO of your company gave you this data about customer behavior last year and tells you: here is what those customers have been doing over the last year, and I want you to help me make a decision. I want to send coupons only to the most important customers, only 10% of the customers in our store. So imagine Best Buy has, let's say, 10,000 customers. You only have to find 10%, a thousand customers, from those customers for whom you have profile and behavior information and activities over the last year. Your goal is to find the top 10% of those customers: the ones who are the most important budget-wise, who are going to spend the most money, and who are really going to respond to your campaign, which is a promotion campaign. You are sending, let's say, a 10% coupon, and only to those. You don't want to send a 10% coupon to all your customers. If you send a 10% coupon to 10,000 customers you are going to lose money, because some of them are not going to purchase big products. If you send 10% off for something small and simple, you are not taking advantage of the whole idea of Black Friday: people purchase big, expensive products that you can make a profit on.

This is what is called the ROC function, and we are going to see it during the semester. If you want to understand what is going on in this function, it is the following. As you can see, the x-axis is the percent of customers contacted: how many of all your customers do you contact? You want only 10%, so those are the 10% of customers you want to contact. Okay.
And now the y-axis: this is the profit, the profit you made last year from those customers. You order the customers in descending order of importance. That means the customers from zero to twenty-five percent are the most important customers compared to the rest. So they are ordered in descending order of importance. Now we know where those customers are, and you want ten percent; that means I want the customers in this area here. I want to find them, and I want to know, based on the ranking from my predictive model, how much profit the company is going to make according to this model. As you can see, the red line on this graph is what is called the random sample. If you do not apply any algorithm from the tools we have in RapidMiner, and you just flip a coin to pick among your customers, look what is going to happen. You have to remember what you are sending: you are sending discounts, so you are giving up profit; you are spending money to win your customer's purchase. Technically, based on the graph here in front of us, if you send those promotions to all your customers, the hundred percent, the ten thousand customers, you can lose almost six hundred thousand dollars by not using a predictive model to help you pick the most important 10 percent of customers. Okay.
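A chart of this kind can be sketched as follows: rank customers by model score, accumulate the profit as you contact more of them, and compare against what random picking would give on average. The ten customers, their scores, and their net profits (purchase margin minus coupon cost, so some are negative) are hypothetical numbers chosen only to show the shape of the curve.

```python
# (model_score, net_profit_if_contacted) per customer; all values hypothetical.
customers = [
    (0.95, 400), (0.90, 350), (0.80, 300), (0.70, 150), (0.60, 50),
    (0.50, -20), (0.40, -40), (0.30, -60), (0.20, -80), (0.10, -100),
]


def cumulative_profit(customers):
    """Profit after contacting the top 1, 2, ... customers by model score."""
    ranked = sorted(customers, key=lambda c: c[0], reverse=True)
    curve, running = [], 0
    for _, profit in ranked:
        running += profit
        curve.append(running)
    return curve


model_curve = cumulative_profit(customers)
n = len(customers)
total = model_curve[-1]

# Random picking earns, on average, a proportional share of the total profit.
random_curve = [total * (i + 1) / n for i in range(n)]

# Number of customers to contact where the model curve peaks.
best_k = max(range(n), key=lambda i: model_curve[i]) + 1
```

With these numbers the model curve rises, peaks when the top five customers are contacted, and then falls as coupons go to people who would cost more than they bring in, while the random line climbs slowly and only meets the model curve at 100 percent.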
That's the whole secret of the importance of building a predictive model to discover knowledge from your customer data: finding the ones who are going to purchase based on the promotion you send, and not only purchase, but make big purchases that you benefit the most from. So let us go back here. You went ahead and used the data mining algorithms we are going to talk about, and you built a model, maybe a decision tree model, and based on that model you came up with this chart. So this is the predictive model output and not the random sampling output. Okay.
So imagine each dot on this graph is about five percent of the customers. This is the first five percent, the most important five percent of your customers. If you send only to five percent of your customers, in our case five hundred customers, since ten percent is a thousand, so if you pick the most important 500 customers that visit your store, you are going to make a profit of $100,000. That's what the model is saying. If you send those promotions to the top ten percent, the most important customers, you are going to end up with a profit of $200,000, double the amount. Now if you send to the top 15 percent of your customers, you are going to profit about two hundred eighty thousand. Not only that: if you send to the top 20% of your customers, you are going to profit maybe $340,000; that's the closest to this value on the axis. And look what happens after 20%: you can see the profit now goes down. If you send to 25%, you were better off sending only to 20%. If you send to 30%, you again get less profit; 35%, less profit again. If you send to 50 percent of your customers, over here, you end up with just about one third of the profit. We said we have 10,000 customers, so if you decided to send coupons to 5,000 customers whose profiles you have in the store, you would get one third of the profit you get if you send only to the top 20%.

That's the secret of predictive modeling: just knowing who those customers are is the big deal. Who are the customers that, if you send them this specific promotion during that specific time of the year, based on the historical information you know about them, will bring you a profit? However, if you send those promotions to customers you picked randomly, you are losing all the time, because you don't know who is important and who is not; you are picking those customers at random. So that's the idea of the whole concept of applying data mining algorithms, machine learning algorithms, to business data: to know how to use this information to discover the most important rules and the most important customers, based on the rules you are applying, to make a bigger profit.

Okay, we are going to go into what is called the lift chart, a very similar concept, at the end of the semester, so don't bother understanding that chart now; only focus on this one. If you understood from my video the meaning of this chart, that's all I want you to take away from the first class of the semester. Now, when we say business rules, what do we mean by business rules? Okay.
So you might have a new customer, and you don't know what kind of activities or what kind of purchases that customer is going to make. Here is one example of a business rule, for new customers who come to the website, let's say the Best Buy website, from organic search results. What is the meaning of organic search? It means they went to, maybe, the Google search engine and were searching for a product, or they went to Yahoo or Bing, you name it; they didn't go directly to your company's website. The business rule says the following: those customers who were searching for a product on Google or a similar search engine, who buy more than $150 on their first transaction on your website, that's what the data is telling you, who are male, and who have an email address that ends with .net, are three times as likely to be return customers. So there you go. This is what your predictive model is doing: it is giving you golden information. This is a golden rule for you now. Look at those transactions: who are those customers? They are new customers; we don't have them in our repository, and we don't have transactions with them. They just used the Google search engine to look for, let's say, a TV or a laptop, and they landed on our website, the Best Buy website. They don't even have a Best Buy credit card. They ended up purchasing for the first time and paying more than 150 dollars, and their email address has the extension .net, and on top of that they are highly likely to return. So this is one of the types of rules a predictive model is going to give you from your data.

Now that we have an overall idea of what we can do with data mining, which gives you a sense of the type of project you can do at the end of the semester, or the exercises you can do through the semester if you are an online student, let me give you some ideas about the meaning of business analytics.
Business analytics can be divided into three different domains. The first is descriptive analytics, and that's the part covered by your introduction to data analytics, BDAN 310. What you did there is some type of statistics: you summarized the information in your data, you created histograms, you looked at frequency tables, you came up with some hypothesis testing, decided whether the hypothesis was true or not, and decided how to carry on with your next step with the data. So it's more of a descriptive summary of the data you were dealing with. This course, the data mining course, falls under what is called predictive analytics. Predictive analytics has many different names: classification, regression, time series forecasting, you name it. That's the type of work you are going to do by the end of the semester. We are also going to touch on what is called prescriptive analytics. That's the role of a business person who got the information from the predictive modeling, and whose job now is to apply it, enhance it, optimize it, and make it part of the business model. So during the data mining course you are going to see the predictive model output. Since you may not work in an organization that can use this information, you won't be able to apply it to a real-life scenario; but if you do, and you have real data from the business or organization you are working in right now, you might consider using data from your organization, applying predictive modeling, creating a model, and then going back to your boss and telling him: I have this model; let us apply it automatically to our data, so in the future I don't have to run this algorithm again. All I have to do is see how the model scores new customers, and then you can tweak it over and over again.

Okay, so if you look at those three types of analytical models, descriptive, predictive, and prescriptive, each one of them has different types of questions that it can answer.
also it has different type of techniques
and technologies that you can use to
apply to this type of analytical domain
so for example in the descriptive
analytics standard reports that you can
run maybe from access or you can create
a dashboards like we did last semester
from using jump or from excel as a
summary of the data and what kind of
questions you can you can ask on this
type of data how am i doing how my
business is doing why is it happening
why the profit this year maybe is half
than last year let me look at the data
let me let summarize it visualize it and
see what's going on which type of
customers you are losing so what L who
is involved in it who is involved in the
process that maybe we are losing data
losing customers so maybe you have a
branch is your business has multiple
branches maybe your your main branch is
doing great but branches in different
locations or not or some specific
location is not so you have to look and
discover and find it and this make a
decision business decision to close it
for example now in the predictive
analytics domain here we
go we are using data mining or text
mining or forecasting and you have some
type of upper-level statistical analysis
what you are doing here is asking what else is
most likely to happen you are predicting
you are forecasting how else will it
happen okay how long will it continue to
happen so you might be able to survive
the lower profit that you are making
this year compared to the previous year
but how long can you survive it okay
are you gonna be capable of paying off
the amount of money you put into
production into acquiring
and getting the product to your store
you know so you have to ask that
question
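To make the difference between the descriptive question (which branch is losing profit) and the predictive question (how long can it continue) concrete, here is a minimal sketch in Python. The branch names and profit figures are made up for illustration, and the forecast is a deliberately naive straight-line projection, not a real forecasting method.

```python
# Hypothetical yearly profit per branch (made-up numbers for illustration).
profit_by_branch = {
    "main":   {2022: 500_000, 2023: 520_000},
    "north":  {2022: 200_000, 2023: 100_000},
    "online": {2022: 150_000, 2023: 180_000},
}


def year_over_year_change(profits, last_year, this_year):
    """Descriptive step: percent change in profit for every branch."""
    return {
        branch: (years[this_year] - years[last_year]) / years[last_year] * 100
        for branch, years in profits.items()
    }


def years_until_no_profit(profit_now, yearly_change):
    """Predictive step (naive): if profit keeps changing by the same amount
    every year, count the years until it reaches zero.
    Returns None when profit is not shrinking."""
    if yearly_change >= 0:
        return None
    years = 0
    while profit_now > 0:
        profit_now += yearly_change
        years += 1
    return years


changes = year_over_year_change(profit_by_branch, 2022, 2023)
declining = [branch for branch, pct in changes.items() if pct < 0]

north = profit_by_branch["north"]
horizon = years_until_no_profit(north[2023], north[2023] - north[2022])

print("percent change by branch:", changes)
print("declining branches:", declining)
print("years until the north branch makes no profit:", horizon)
```

The first function only summarizes what already happened, the second one extrapolates, which is the step where the data mining and forecasting techniques from this course would replace the straight line.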
finally the prescriptive analytics here
is where as I said you
optimize and you make a decision how can
the best be realized what all is
involved in making this happen so you're
gonna ask all those questions what is the
best that can happen okay so you are
trying to optimize and get much better
profit so for instance in this graph I
showed you here if your boss said go and
give me 10% of
our customers that we can send a promotion to
so we can take advantage of Black
Friday you might come back and tell your
boss actually if you only consider 10%
we're gonna have 200,000 profit
however I would advise you to send the
promotion to 20% of those customers
because in that case you will have at
least like one third more if
it's 200 now it is almost 300 plus
you can have even more I would
say more than 50% more profit
than sending to only 10%
yeah so at this point it's about
50% more so that's it so
when you are trying to do
that what you are trying to do in this
domain over here you are trying to make
some heuristic decision based on what
the data is telling you to
convince your boss that we are better
off we will make more profit if we
optimize our model for sending promotions
instead of sending to 10% of our
customers let us say
for example to 20% or 25% of our
customers based on the model you have
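The argument for mailing 20% instead of 10% can be sketched as a small calculation: rank customers by the model's score, mail the top fraction, and compare the expected profit at each cut-off. Everything below is an assumption for illustration (the scores, the value of a response and the cost of one mailing are invented, and `expected_profit` is a hypothetical helper, not something from the course software).

```python
def expected_profit(scores, fraction, value_per_response, cost_per_mail):
    """Mail the top `fraction` of customers ranked by model score.

    Expected profit = sum of (response probability * value of a response)
    over the mailed customers, minus the cost of every mailing sent.
    """
    ranked = sorted(scores, reverse=True)
    n_mailed = round(len(ranked) * fraction)
    expected_revenue = sum(p * value_per_response for p in ranked[:n_mailed])
    return expected_revenue - n_mailed * cost_per_mail


# Hypothetical model output: predicted response probability per customer.
scores = [0.9, 0.8, 0.6, 0.5, 0.3, 0.2, 0.1, 0.05, 0.05, 0.02]

fractions = (0.1, 0.2, 0.5, 1.0)
profits = {f: expected_profit(scores, f, value_per_response=100, cost_per_mail=20)
           for f in fractions}
best_fraction = max(profits, key=profits.get)

for f in fractions:
    print(f"mail top {f:.0%}: expected profit {profits[f]:.0f}")
print("best cut-off:", best_fraction)
```

With these made-up numbers, going from 10% to 20% raises the expected profit, but mailing everybody lowers it again, which is exactly why the cut-off is something you optimize rather than guess.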
okay so the whole idea is you are trying
in data mining to get as much data as
you want and take advantage
of it to summarize it so it becomes
information for you accordingly you're
gonna make some type of relevant and
actionable decision with this data and
that is what is called the knowledge you
get from the data the knowledge you get
from the information you have by
summarizing it and creating some type of
business rule so that you can get wisdom
about what your customers are doing or
what they are going to do okay any data
mining project in any organization in
the world
involves the following types of domains or
disciplines and as you can see data
mining is a multidisciplinary
approach to knowledge discovery we
want to discover information and
knowledge about our customers using
statistics using artificial intelligence
I'm going to talk about those using
machine learning and pattern recognition
using information visualization such as
dashboards for example also how we got
the data in the first place is by
using database management and data
warehousing where we have information
this data could be local like Best Buy
has data on their local customers but
also they have data about the regional
customers also they have data about
their website customers all these types of
different management systems help us
to dig into this data and what kind of
information systems do they use we don't
know maybe they are using Oracle maybe
they are using SQL maybe they are
using MySQL behind the scenes and
therefore this involves
what is called the management
science discipline and the information
systems discipline that take care of that
part of the data warehousing ok so here is a
bigger picture of what data mining on its
own can do in general so you have
three different types of domains of data
mining you have the prediction here
where you're gonna predict what the
customer is gonna buy next okay and you use
many different types of algorithms
there's something else there are
association rules another domain for
example the most popular association
rule which is called the market basket
analysis rule it was very
popular in data mining when data
mining started to appear in the business
world like 20 years ago but in the last
three four five years it became very
easy for every business even small
businesses to apply it so the very
old saying is that a customer who goes on
Sunday night a male customer let me
refine it a male customer who goes to
Walmart on Sunday night and purchases diapers
and milk highly likely will purchase
beer as well so you can guess the
scenario so they found this pattern in
their data at Walmart that on Sunday night
guys are going the wives they are
having a new baby
the wife is sending the husband to Walmart
to buy diapers and milk you know before
the start of the weekdays and of
course the guy is really overwhelmed
with all these baby issues and
crying so here we go he got a beer with
him and went home
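A pattern like "diapers and milk, then beer" is usually scored with three standard measures: support, confidence and lift. The sketch below computes them on a handful of invented baskets, so the numbers say nothing about any real store; it only shows what "highly likely will purchase beer as well" means as arithmetic.

```python
# Hypothetical shopping baskets, one set of items per customer visit.
baskets = [
    {"diapers", "milk", "beer"},
    {"diapers", "milk", "beer"},
    {"diapers", "milk"},
    {"milk", "bread"},
    {"beer", "chips"},
    {"diapers", "milk", "beer", "chips"},
    {"bread"},
    {"milk"},
]


def support(itemset, baskets):
    """Share of all baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def rule_metrics(antecedent, consequent, baskets):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    support_both = support(antecedent | consequent, baskets)
    confidence = support_both / support(antecedent, baskets)
    lift = confidence / support(consequent, baskets)
    return support_both, confidence, lift


sup, conf, lift = rule_metrics({"diapers", "milk"}, {"beer"}, baskets)
print(f"support={sup:.3f} confidence={conf:.2f} lift={lift:.2f}")
```

A lift above 1 means beer shows up more often in diaper-and-milk baskets than in baskets overall, which is the kind of rule a business would act on.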
so Walmart took advantage of that
concept that they found this pattern
in the data which is called association
rules also it's called market basket
analysis and what they did they started
to put the beer so far away from the
diapers and the milk so the male going
on Sunday night to buy those items until
he goes to buy the beer
he's gonna get tempted with all the
different products walking a far far
distance to another aisle where the beer
is located in this way he can
be attracted by other products to
purchase that was the way they were profiting
from learning about this association
rule or association business rule last
one is what is called clustering
clustering is another type of data
mining that falls under what is called
unsupervised learning and we will
touch base on that clustering is a way
where you want to segment people
together so we know the concept of
clustering in biology and it came from
biology actually so you heard about
the clusters
of ants they cluster together and
the honeybees they cluster they create
clusters based on similarity between them so we're
going to use those concepts and apply them
to business data to find profiles of
people that are similar to each
other this is what Amazon is doing so
when Amazon gives you a purchase
recommendation it says those people who
purchased what you purchased in the past
they also purchased those new products
that you've never thought of purchasing
or you've never seen before
but now what Amazon is doing is looking
at your profile and looking at who are
those clusters of people similar to your
profile and saying hello here we go
let's give that person similar products
that those people purchased besides
the ones he did already and that's the
concept of clustering also there is an
application of clustering on outliers
and we talked about it
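As a rough picture of how similar customers get grouped, here is a minimal k-means sketch in plain Python. K-means is one common clustering algorithm; the lecture does not say which one Amazon uses, so treat the choice, the two features (orders per year, average spend) and the customer numbers as assumptions for illustration.

```python
import math


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]


def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points to the nearest
    centroid and moving each centroid to the mean of its members."""
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster where it was
        centroids = updated
    return centroids, assign(points, centroids)


# Hypothetical customers: (orders per year, average spend per order).
customers = [(2, 20), (3, 25), (1, 15), (20, 200), (22, 210), (18, 190)]

# Start from two of the customers as initial centroids (a simple,
# deterministic choice; real tools usually pick starting points randomly).
centroids, labels = kmeans(customers, [customers[0], customers[3]])
print("cluster centers:", centroids)
print("cluster of each customer:", labels)
```

Once every customer has a cluster label, a recommendation like "people similar to you also bought" amounts to looking at what the other members of the same cluster purchased.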
now I just wanted to make you aware that
there are many different software packages in
the area of data mining many
complicated and many simplistic types
of programs or programming languages
however every year there is a big poll
that the most intelligent business
organizations run and they do
voting across all the businesses in the
world and actually look at people's
opinions about the products or
software they use to apply data mining
applications of course you want
applications that are sophisticated
that are capable of offering new
algorithms that have a
user-friendly interface not so heavy on
programming and most importantly
accurate you want an accurate product so
for those that you use and you notice that
they are giving you some business rules
that do not make sense at all that's why
the ranking of those would go low so
look at the order of those products
number one and that was the poll of last
year
number one is RapidMiner this is the
software we're going to use in this
course as I mentioned in the
introduction of this unit this
software costs $10,000 per year per user
for a non-educational entity since you are
a student and I am faculty we both
have access to that software for free
for as much as you want to use
it for
as many years as you want to use it as long
as you're using your educational email
to subscribe to the software I would say
you have to renew the license every year
because also they come up with a
newer version however as you can see
here this is the highest use now as you
can see 35% of the users say
that they are using RapidMiner the
most we can see R you can see Excel we
can see JMP over here and many more so
you are lucky that you can use the most
used and acknowledged software it was developed
in Germany and they have the
headquarters here in Boston it's really good
software also it has a certificate
an online certificate so maybe by the end
of the year try to apply for this
certificate online and add it to your
resume as a skill and it's really a very
valuable skill so the idea of building a
complete solution business solution goes
into many different stages so before
even applying any data mining algorithms
as you can see in the stage over here
there is what is called the knowledge
discovery in databases process KDD so
the abbreviation came from
the first letters and so the KDD process all
businesses are aware of that process so
whenever you have data of your customers
of your business and you want to discover
knowledge from it it has to go through
the following path of analysis you
start with the raw data that means data
that is not related to each other it just has
information maybe in an Excel file then you
gather them together you select them
maybe you use some SQL or Oracle
or MySQL you name it to put it
in a format that you can
utilize in you know software that can
allow you to prepare the data the way
you want it now of course you have to do
data cleaning because a lot of the time
the information you get about your
customers over the web might have some
information that is missing some
information entered wrong a lot of
people answer surveys or fill in forms with
mistakes so you have to take care of all
this otherwise you cannot run the data
mining algorithm then you go into what
is called pre-processing the data with
the data cleaning and filtering then you
go to transforming the data in which
you convert it into flat tables that
you can utilize for running the
algorithm on it ok and that
doesn't have to be with RapidMiner
it could be with R it could be
with Python you name it but the
process all this process goes exactly
the same no matter what software you're
gonna use it stays the same and finally
you're trying to find the patterns and
knowledge from this data and here is where
you are able to get this knowledge and
to use it to benefit your business now I
should mention here that there are many
different types of schema
to apply data mining ok or the knowledge
discovery in your data we're gonna see
next class something called the CRISP-DM
model but the same idea applies as the one
mentioned in this graph and I felt I
have to put it here because you might
consider taking a class in SAS in the
econ department so SAS also is
software that you can run data mining
algorithms and machine learning algorithms on
so if you end up working with SAS
the knowledge discovery that you see
here goes into what is called the
SEMMA architecture and the schema of
SEMMA is just the following you
start first with a sample of
your data okay and then you try to
explore that data visualize the data and
this is maybe in SAS Enterprise Guide or
SAS Enterprise Miner after that you try
to modify transform the data to make
it in a way that you can utilize it for
example maybe you have gender
some data entries say female the
complete word some data entries say male the
complete word some data entries say M so you
want to be consistent so you're gonna
transform all this data whatever says
female is gonna be F and whatever is
male is gonna be M so now you end up with
all the data having only those two options
F or M so this is the type of
transforming I'm talking about and there
are many different types binning for
example is transforming then you're
gonna create the model and then you're
gonna assess the model you have to
assess the model and you start again it's a
cycle as you can see and the model we're
gonna use in this course
is called the CRISP-DM model similar to this
again it's a continuous process you don't
stop you assess you refine the model then
you might get another sample and you
repeat okay
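The cleaning and transforming steps described above (missing or mistyped entries, making gender consistently F or M, binning) can be sketched in a few lines of Python. The rows, the age groups and the helper names `clean_gender` and `bin_age` are all invented for illustration; a tool like RapidMiner or SAS does the same thing through its own operators.

```python
# Every spelling we are willing to accept, mapped to one consistent code.
GENDER_CODES = {"female": "F", "f": "F", "male": "M", "m": "M"}


def clean_gender(value):
    """Return 'F' or 'M'; anything unrecognized becomes None (missing)."""
    if value is None:
        return None
    return GENDER_CODES.get(value.strip().lower())


def bin_age(age):
    """Binning: replace an exact age with an age group."""
    if age is None:
        return "unknown"
    if age < 35:
        return "under 35"
    if age < 55:
        return "35 to 54"
    return "55 plus"


# Hypothetical raw survey rows with inconsistent and missing entries.
raw_rows = [
    {"gender": "Female", "age": 23},
    {"gender": "M", "age": 41},
    {"gender": "male ", "age": None},
    {"gender": "F", "age": 67},
    {"gender": "", "age": 30},
]

# Transform: one flat table with consistent categories.
clean_rows = [
    {"gender": clean_gender(row["gender"]), "age_group": bin_age(row["age"])}
    for row in raw_rows
]

# Clean: keep only the rows where gender could be recovered.
usable_rows = [row for row in clean_rows if row["gender"] is not None]

for row in clean_rows:
    print(row)
print("usable rows:", len(usable_rows), "of", len(clean_rows))
```

Whether to drop a row with a missing value or to fill it in is a decision for each project; the sketch simply drops it to keep the example short.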
so data in data mining in general can
be divided into two types of data and
that's the beauty of data mining
historically in statistics you only used
numbers or categorical types of data
to analyze data however with data
mining you can use Twitter data which is
text you can use Facebook forums which
are text you can use whatever you want so
that's what is called unstructured data
such as images audio video okay text
HTML XML so this is a new domain now
that emerged that's why businesses are more
and more involved in data mining because
information now is coming not in the
regular statistical form
we are used to having numerical and
categorical to run SPSS for example
or to run Stata if you're using Stata
or R or to run Python in the new era
now you are using unstructured data
textual data images videos etc and R
and Python now are capable
both programming languages are capable
of handling also unstructured data ok
so here is a summary of what you
really do in any predictive model you start
with data you pre-process your data you
clean you filter you decide what you're
gonna use from your data then you take
2/3 of this data you train on it to create
a model and one third you hide from the
predictive model and you call it
testing data you apply that data on the
model and you try to see how well this
model that you created from the training
data did then in the future when
new data comes that doesn't have the outcome how
likely that person is
gonna purchase for example gloves or
hats etc or a piano you don't know what
that person's gonna do but you have a
model and in the past you noticed in that model
that those people who
have higher salaries and purchased for
example clothes they ended up purchasing
for example a piano ok so that's the whole
idea and again this is just a simple
intro we're gonna learn how to do all
these now when you build your model this
is again another ROC curve like I showed
you before about the profit you might
end up having more than one model so as
you can see here we have model A we have
model B we have model C and this
is random sampling so you have to decide
which one to pick and we're gonna go and
understand why for example A is better
than B okay so just remember always
the higher the curve is the better the
model is as you can get more out of deciding
which customers you're gonna take
advantage of at the beginning and with
the highest profit that's the whole idea
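The two-thirds / one-third idea and the choice between competing models can be sketched as follows. The data are invented, the "models" are the simplest possible ones (a single threshold on one variable, sometimes called a decision stump), and the split takes every third row as test data so the example is reproducible; in practice the rows would be shuffled first. None of this is the course's own procedure, just an illustration of the workflow.

```python
# Hypothetical customers: (salary in thousands, past purchases, bought piano).
rows = [
    (30, 5, 0), (85, 1, 1), (40, 7, 0),
    (90, 6, 1), (35, 2, 0), (70, 8, 1),
    (95, 3, 1), (45, 9, 0), (80, 4, 1),
    (50, 6, 0), (75, 2, 1), (38, 1, 0),
]

# Hold out one third as testing data the model never sees during training.
test = [row for i, row in enumerate(rows) if i % 3 == 2]
train = [row for i, row in enumerate(rows) if i % 3 != 2]


def accuracy(data, feature, threshold):
    """Share of rows where 'feature >= threshold' matches the real outcome."""
    hits = sum((row[feature] >= threshold) == bool(row[-1]) for row in data)
    return hits / len(data)


def fit_threshold(train_rows, feature):
    """Training: pick the threshold that scores best on the training rows."""
    candidates = sorted({row[feature] for row in train_rows})
    return max(candidates, key=lambda t: accuracy(train_rows, feature, t))


# Model A uses salary (column 0), model B uses past purchases (column 1).
models = {"A (salary)": 0, "B (past purchases)": 1}
test_accuracy = {}
for name, feature in models.items():
    threshold = fit_threshold(train, feature)
    test_accuracy[name] = accuracy(test, feature, threshold)
    print(f"model {name}: threshold {threshold}, test accuracy {test_accuracy[name]:.2f}")

best_model = max(test_accuracy, key=test_accuracy.get)
print("pick model", best_model)
```

The comparison is made on the hidden third only, because a model can look perfect on the data it was trained on and still do poorly on new customers.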
okay so in general that's the concept
if you wanna find and
determine an optimal solution of the process
based on some type of metric or a value
of K so how many times you're gonna
repeat that process so you have
historical data you cut it between training
and validation or testing as we said for the
time being consider validation like testing so
you do that and you find the output
you have a model you keep optimizing
until the model gets better and better
you keep repeating iterating iterating
and then finally when you reach a point
you say I'm happy I'm satisfied with the
accuracy and the level of precision my
model is producing then you get new data
and you apply it to the model and the model
will tell you what is highly likely
that customer is gonna do okay now in
general that's the same idea for how you're
gonna do it applied for example to a
regression model
yeah it's the same idea you have the
data you can assess the data first to
see how good the data is either by
looking at the correlation matrix or you're
gonna create some scatter plots between
the variables in that data then
you transform this data and
you're gonna try to fit a model on it
yeah and then you're gonna assess again
and try to deploy it this is
the type of deployment I was telling you
about that if you work for Best Buy
you do all this prediction but
finally you have to apply the model on
real customers new customers coming
to the store for instance to give you an
example let's say you don't have a Best
Buy card you walked into Best Buy and
decided
you want to open a Best Buy
credit line
you went there they told you okay
just fill out this form for us and come back after
ten minutes so what they are really
doing is taking your information plugging
in the information you entered your
location how much your salary is do you own
a car or not do you own your own house etc
and then this information is considered
like here you know this is your new data
and they have a
model that is related to plenty of other
customers similar or not to your profile
okay and based on what happened to those
customers in the past the model is
saying that those customers
or let's say 80% of those customers that
have a similar profile to yours
they defaulted so let's say your profile is
you don't own a car you don't own a
house and you don't have a salary you're
just a student and you're asking for a
$10,000 credit line or 5,000 of course
they enter all this data and what's
gonna happen is the program will tell
them we looked at the data historically
we gave credit lines to this type of people
and 80% of the time they defaulted they
didn't pay back what they owe us so
they will come up with a
rejection they're gonna reject
your request or sometimes what happens is
they would say yeah we'll accept your
request but we're only gonna give you a
credit line of $500 because they have
information historical data about
customers similar to your profile with a
credit line between 500 and maybe 1500 and
the number of defaults was maybe 10% only
so they will take the risk and they will
give you 500 okay because 90% of
them didn't default so that's a
good outcome so besides that this is
how we're gonna look at the regression
model they want to know how much money
they're gonna give you here so here it's
not a decision only to give you or not
give you a
credit line here the decision is to give you
one but with a different amount okay
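The credit line story can be mimicked with a simple lookup of historical default rates, sketched below. This is not a regression model and not how any particular retailer scores applications; it only shows the logic of "customers with a profile like yours defaulted 80% of the time at $10,000 but only 10% of the time at $500". The history, the two profile fields and the 20% risk limit are all made up.

```python
# Hypothetical history: (owns_home, has_salary, credit_line, defaulted).
history = (
    [(False, False, 10_000, True)] * 4
    + [(False, False, 10_000, False)] * 1
    + [(False, False, 500, True)] * 1
    + [(False, False, 500, False)] * 9
    + [(True, True, 10_000, True)] * 1
    + [(True, True, 10_000, False)] * 9
)


def default_rate(history, profile, credit_line):
    """Share of past customers with this profile and credit line who
    defaulted. Returns None when there is no comparable history."""
    outcomes = [
        defaulted
        for owns_home, has_salary, line, defaulted in history
        if (owns_home, has_salary) == profile and line == credit_line
    ]
    return sum(outcomes) / len(outcomes) if outcomes else None


def largest_acceptable_line(history, profile, max_default_rate=0.2):
    """Offer the largest credit line whose historical default rate for
    similar customers stays within the risk limit; 0 means reject."""
    lines = sorted({line for _, _, line, _ in history}, reverse=True)
    for line in lines:
        rate = default_rate(history, profile, line)
        if rate is not None and rate <= max_default_rate:
            return line
    return 0


student = (False, False)      # no house, no salary
homeowner = (True, True)

print("student at 10,000:", default_rate(history, student, 10_000))
print("student at 500:", default_rate(history, student, 500))
print("offer to student:", largest_acceptable_line(history, student))
print("offer to homeowner:", largest_acceptable_line(history, homeowner))
```

A regression model plays the same role as the lookup table here, except that it can estimate a suitable amount for profiles and amounts that never appeared in the history exactly.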
now finally this is the new domain that
started to appear in data mining which
is called text analytics that involves
plenty of documents
Facebook your browsing behavior on
Facebook whatever you wrote there is all
text also for example search engines
whatever you search on the Google search
engine can be used tags whatever
pictures you tagged on your Facebook all
this can be utilized also to profile your
behavior and what kind of data mining
algorithms or we would say text mining
algorithms do we use related to this area of
research it's very similar to data
mining you use machine learning and many
different disciplines are involved
statistics artificial intelligence computer
science management science and many more
data-related disciplines okay so simply
put this is what you're really doing
you're trying to get information no
matter what kind of information about
your customer maybe structured data
like databases like Excel Access from
Oracle from SQL or it could be
unstructured data as we saw maybe from
the web blogs maybe your Twitter
information etc so besides that you have
some tools and techniques you can use
for extracting knowledge from the text
now and you need some domain
expertise those people who get
information from Twitter they know
how Twitter functions so they are
experts they know if you retweet it's
different than when you write a new
tweet okay that's what it means all this
information the software information
privacy issues linguistic information
about the language maybe you're using a
different language than English all
this is gonna give you context-specific
knowledge about your customers who are
tweeting
about your product or writing some
comments on the Facebook page of your company
okay so in general it is getting more
and more complex as you can see you
might have document collections text
collections established by your
organization you also
have completely
unstructured data you're gonna convert
it into structured data to extract
knowledge and then be able to convert
text into what is called knowledge as
regular data that we are familiar with
so you're gonna convert any type of text
data from Twitter or Facebook you name
it into data tables that have information
about those texts and this is how it
looks so if you have a text mining
algorithm that text mining algorithm for
example looks at document one mind you
document one was the first tweet you
sent today in that first tweet you
mentioned the words investment risk
one time okay you did not mention the
words project management or software
engineering
you mentioned development one time and
that's it okay
now maybe another customer from the
other type of purchases purchased
something and then tweeted something
about our products that product is
related to project management so in his
tweet he mentioned the words project
management one time and that's it
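Turning tweets into a table of term counts can be sketched in a few lines. The tweet texts below are invented to match the counts described for the first two documents, and the term list is fixed by hand; real text mining tools add steps such as tokenizing and stemming that are left out here.

```python
# The terms we want one column for (chosen by hand for this illustration).
TERMS = ["investment risk", "project management", "software engineering", "development"]

# Hypothetical tweets standing in for document 1 and document 2.
tweets = {
    "document 1": "Weighing the investment risk before we fund more development",
    "document 2": "Their project management tool kept our launch on schedule",
}


def term_counts(text, terms):
    """Count how many times each term appears in the text (case-insensitive)."""
    lowered = text.lower()
    return [lowered.count(term) for term in terms]


# One row per document, one column per term: a regular table again.
matrix = {doc: term_counts(text, TERMS) for doc, text in tweets.items()}

print("document".ljust(12), *[t.ljust(22) for t in TERMS])
for doc, counts in matrix.items():
    print(doc.ljust(12), *[str(c).ljust(22) for c in counts])
```

Once the text is in this shape, every column is just a number per document, so the usual statistics and data mining algorithms can be run on it.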
another customer maybe tweeted something
related to our software engineering
domain and he mentioned the words
software engineering three times in
that tweet and he mentioned the
word SAP one time so this is the
idea you are converting terms that are used
in tweets in our case documents
into a regular table that we are familiar
with that we could run any
statistics on if I put this data in JMP
JMP can show me for example
five times the words software
engineering were
being used today in tweets the words for
example investment risk were only mentioned
twice okay etc so here are some examples
just so you can imagine we have Excel
data that has information about some
specific journals row number one is
the header we have the ID of that
journal we have which year that journal
has this article in it and then we have
the name of the journal and then the
abstract what the person wrote in that
article so this is text data our goal is
to maybe mine the data sorry to mine the
data in the text and understand what is
the most used topic during the years
between 1999 and 2005 for example just
to do topic extraction for example so
here is the big wide range of sources in
big data and we're gonna hear a lot the
words big data big data started to
appear as a term in business and
in the market a couple of years ago not
such a long time ago because now we
have resources and platforms that can
handle big data and we're going to get
into the specifics of
big data you cannot say you
have big data unless you have something
called volume you have a volume a huge
amount of data and also you have a variety
of types of data not only numerical and
categorical you have video audio text
structured and unstructured that's the
meaning and also the velocity of the
data so whenever you're
going to talk about big data
you have to mention those three types of
information the volume of the data it's
big the variety there are different
types and formats of the data
and the velocity of the data so you can
see here where each one of those falls
for example if
the volume is high and the variety and
velocity are high these are the wide
range of sources associated with
big data for example you will see
sensors and not only simple sensors but more
for example the Nest the Ring for
example cameras now in every corner of
your house that can sense movement
and send it to your phone automatically all of
this became possible because of
the revolution of big data and the
community doing some type of data
mining saving and filtering and managing
all this type of information as you can
see the medium area is dealing with
video audio tweets blogs etc whereas the
old-fashioned way is business processes
where you save data locally or on the
cloud you name it
so this is the last slide in this video
we can talk about how you can leverage
big data and analytics for political
campaigns this is just an example and
actually the first time ever big
data was utilized with data mining was
during Obama's first term so actually he
hired a company a data mining company
from Chicago they were the ones targeting
all the tweets all the Facebook posts all
the information over the web all the
data repositories related to voters
in the past all that information to
be able to detect and know exactly which
houses they needed to go and visit in
the last week to be guaranteed that
those houses would vote for him it
worked so if you google a little bit
about the Obama campaign
first term and the words data mining you
get into a lot of algorithms being
published over the web about how it
happened so input data sources what kind
of sources they used so
the census data population-specific
age race sex income and location in
the US they also used data from the
election databases okay the historical
data the party affiliation previous
election outcomes trends and distribution
of all this market research they looked at
the polls at the trends and the
movement over time before even you know
considering any movement in the election
social media they collected data from
Facebook Twitter LinkedIn news groups
blogs you name it web pages of course
and many many other types of data sources
all this with the idea to create a
data mining platform that utilizes big
data and uses analytic types of
processing machine learning algorithms to
predict the outcomes and trends okay
to identify associations between events
and outcomes to assess and measure the
sentiment of people on topics so this is
how he decided which topics to
mention and which topics not to
mention during his speeches
based on the location of his speeches
profiling clustering those groups of
people based on similarity of behavior
and accordingly finding those patterns
and targeting those voters with messages
that overlap with what they have been
doing over the internet and knowing
exactly what they wanna hear so they used
data mining web mining text mining
multimedia mining algorithms the output
which was the goal raise money
contributions increase the number of
volunteers organize movements create a
sense of
urgency mobilize voters to get out and vote
and definitely anything that can help
the political campaign to succeed during
the process and besides that the lessons
learned from the first term were utilized
again over and over for the second term
so it continues to influence as we say
data mining is not one time it's a
continuous process
that's it for the overview next classes are
going to be much easier we just gave you
here an overall idea of what you can do
with data mining it's really a very
dangerous weapon to use and if you're
good at it
you are guaranteed to have a very good
appealing and interesting job okay so
we'll see you in the next video
