Before we get into data mining and what it is, let's look at where it's actually used in the business context.
We'll start by thinking about on-ground behavior. Data mining starts when you collect data on
customers or users or someone else's behavior. Now, it doesn't have to be people - it can also be transactions or anything else.
One application is deciding whether to give out a loan.
Banks and financial institutions
use either business rules or more advanced data mining rules to help them detect the people
who are more likely to actually repay the loan.
And of course we have direct marketing, which is
deciding who to send an offer to. Marketers are probably the earliest adopters of data mining methods in the business world,
and the reason they use it in direct marketing is, for example, to detect who are the people most likely
to respond to our offer and accept it.
Since sending out mailers can be expensive, and since, if you're thinking longer-term,
even an email that is not expensive to send might be irritating to your customers,
you definitely want to have useful rules to detect the people most likely to accept the offer.
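One way to picture that trade-off is the small sketch below. It is only an illustration under stated assumptions, not a rule any particular marketer is known to use: the response probabilities, the profit figure, the mailing cost, and the `should_mail` helper are all made up for the example. The idea is simply to mail a customer only when the expected profit from the offer is larger than the cost of sending it.

```python
def should_mail(response_probability, profit_if_accepted, cost_per_mailer):
    """Return True when the expected profit of mailing exceeds its cost.

    All three inputs are hypothetical; in practice the response
    probability would come from a model built on past campaign data.
    """
    return response_probability * profit_if_accepted > cost_per_mailer


# Made-up response probabilities for three customers.
customers = {"A": 0.20, "B": 0.02, "C": 0.08}

# Assumed figures: 40.0 profit per accepted offer, 2.0 cost per mailer.
targets = [name for name, p in customers.items() if should_mail(p, 40.0, 2.0)]
print(targets)
```

With these made-up numbers, customer B is skipped because an expected profit of 0.8 does not cover the 2.0 mailing cost.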
How about personalized coupons?
When you get to the cashier, after you pay they see what you bought and you get a personalized offer.
Now, in some places you can actually scan your loyalty card somewhere in the store itself and
get a personalized coupon before you even reach the cash register.
Other examples:
Accenture, the large consulting firm, tells the story of using predictive analytics for detecting fraudulent claims.
And finally, predictive pricing:
thinking about who would be willing to pay what, and when,
based on what they did thus far, is another application of data mining that you'll see in practice.
There are also some more controversial uses of data mining by businesses.
Credit card companies, of course, use your data to determine, for instance,
what level of credit to give you. They also use it for fraud detection:
when they see some transactions that seem to be strange, they might alert you or even stop the card.
But American Express took data mining one step further - and let me read this:
"The company has been looking at home prices in your area,
the type of mortgage lender you're using, and whether small business card customers work in an industry under siege.
It has also been looking at how you spend your money,
searching for patterns or similarities to other customers who have trouble paying their bills."
So now
American Express is using your behavior and
comparing it to other behaviors in your vicinity in order to detect
people who are not going to pay their bill, and that caused a lot of controversy.
An even more recent controversial use of data mining was by Target, the mega store,
and what Target did was actually even more outrageous.
Their data mining techniques were able to figure out whether a customer was pregnant or not,
and not only that, they could detect what stage of pregnancy the customer was in.
Based on this they actually sent personalized offers.
Now, not all customers want
businesses - or maybe anyone else - to know that they're pregnant.
Now let's look at the online world, in the sense that we're looking at digitized footprints that are now more and more available to companies.
How do they use data mining to take advantage of those data?
So you've all used either amazon.com or, if you're in India, probably flipkart.com. When you buy on
websites like this, there is something called
"Recommendations" on the side: "Customers Who Bought This Also Bought" - and they give you a list.
Depending on your own profile, you're probably getting a different set of recommendations from someone else.
How does the software know what to offer you?
Well, they're using data mining in the background to look at your behaviors and compare them to others.
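To make "comparing your behavior to others" a bit more concrete, here is a minimal sketch of one simple way an "also bought" list could be built: count which items most often appear in the same order as the item you're looking at. This is an assumption for illustration only. The orders are made-up data, `also_bought` is a hypothetical helper, and nothing here is claimed to be how Amazon or Flipkart actually does it.

```python
from collections import Counter

# Hypothetical purchase history: one set of items per order (made-up data).
orders = [
    {"camera", "memory card", "tripod"},
    {"camera", "memory card"},
    {"camera", "camera bag"},
    {"novel", "bookmark"},
]


def also_bought(orders, item, top_n=3):
    """Rank the items that most often share an order with `item`."""
    counts = Counter()
    for basket in orders:
        if item in basket:
            # Count every other item bought together with `item`.
            counts.update(basket - {item})
    return [other for other, _ in counts.most_common(top_n)]


print(also_bought(orders, "camera"))
```

In this toy data the memory card comes first for the camera, because it shares two orders with it while the tripod and the camera bag share only one each. A personalized version would also weigh in your own past purchases, which this sketch leaves out.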
Even if you're just filling in an online form for an insurance claim or for some other service,
you might be getting a personalized response and the response might be automated.
How is that happening?
Once again, they're collecting your data and comparing that to others' in order to give you a personalized offer.
Personalization is really the key to data mining. As opposed to the older statistical analysis, which looked at overall trends,
personalization means that we're looking at the individual transaction or person or user level, and giving them what they probably
require or need or want.
We have things like Pandora,
which offers us personalized radio stations where they build a particular set of songs based on your preferences.
There are also websites like Songkick that actually give you alerts about concerts that you're most likely to prefer in your area.
Facebook makes very large use of your personal data. For example,
by keeping track of what users like around the web, Facebook can show people ads that will be the most interesting to them and
generate more revenue.
Google Ads is obviously the major personalizer in the ad space, so using your personal traces is
very useful in trying to customize the ads to show to a particular user.
Google even does that when you're reading your Gmail! You might be seeing personalized ads on the right hand side
that are also related to the content of what you're reading in your email.
Netflix and other entertainment services also
try to personalize. They have
recommendation systems similar to the ones that we saw for Amazon and for Flipkart. When you watch movies and then rate them,
Netflix uses your ratings, which indicate your preferences, to come and offer you - "here are some other movies that you might enjoy".
And finally, we've moved to the era where even our most intimate
transactions are
moving online, and online dating sites use data mining methods to try and
make the matches that are most likely to succeed.
Now, not only businesses use data mining.
Governments and other types of organizations have also realized that they do have lots of information on users.
Governments: Obama, a very salient user of
digitized data, has been using data to try to get reelected.
Other examples:
the IRS, which collects taxes,
reports that it actually uses data mining as well to determine who really warrants an audit.
In the United States, everyone files taxes, but only a sample of people are audited by the IRS.
Obviously, the IRS has
certain types of people or businesses that it would most like to catch if they made fraudulent claims.
Another example is the police.
In England, the police are using predictive analytics to head off crime.
What they're trying to do is predict a crime
before it actually occurs. So using data mining and patterns about previous crimes,
they're trying to detect whether a crime is going to occur and then perhaps put more policemen or policewomen in that place
or prepare accordingly. Data mining is also used against crime in Florida: what they do is try to identify at-risk
youth by looking at data on youth.
In particular, what they're doing - let me read from down here:
"Department of Juvenile Justice
is touting a new system of predictive analytics that would steer at-risk juveniles to
specific treatment programs designed to keep them from becoming adult criminals."
So these are all uses of data mining for a proactive rather than detective mode of operation.
And of course once we've moved to digital, we should also be talking about mobile data that provide a lot of information.
And there's a company called Xtract, which has been working with Nokia.
"Xtract, which provides social intelligence-powered predictive analytics technologies and solutions, is
teaming up with telecom services enabler, Nokia Siemens Networks,
to deliver a predictive analytic solution to mobile operators globally"
What are they offering? Take a look at the next paragraph. "The companies say the solution will offer advanced capabilities
for analyzing and segmenting customers - helping operators to reduce churn and target their marketing campaigns more effectively."
So this company is now helping the telecoms use not only the data in their databases from calls
but perhaps additional data that are coming from the mobile phones, such as location data.
If we're talking about location data, then of course Foursquare is another example that collects specific
location data when people reach a certain location and "check in".
When people think about business analytics, they often think about reducing costs, making more money,
and so on. Of course,
we saw in the government examples that some of them have nothing to do with money, such as predicting crime.
But let me give you just a few other examples
that have nothing to do with money so that you really expand your brain and not only think about "let's make more money"
or "let's reduce costs".
So for example, now that there's a lot more online learning - such as this course - there's a lot more information about how
people learn online.
"The University of Illinois Springfield will participate in the predictive analytics reporting framework to help improve learning outcomes
for online students."
So what they're doing is they're logging the responses and the
evaluations and the small quizzes that students take online to see what methodologies work, and what don't work.
Another example is a medical one: researchers from Children's Hospital, Boston
sifted through an anonymized database of more than half a million
electronic patient records to develop a mathematical model that shows a decent ability to detect cases of domestic abuse.
So they're using data that are already collected on patients who come to the emergency room - and based on that,
they're trying to detect if a person is likely to be a victim of domestic abuse. This is a very different
application of data mining yet it contributes a lot to society.
I showed you earlier on
a map of the world from the website of a conference on predictive analytics that's happening worldwide.
There's another conference called Predictive Analytics World
and many many more. What I'm trying to show you is that more and more organizations - from businesses to
governments, from smaller companies to large companies - are all beginning to see that there might be value for them in
deploying data mining solutions.
So, in order to be able to use the data that you're going to see in your organization,
you want to know what
applications other companies are using, and you also want to know how to think about data mining in a
larger and more creative way so that you can tailor solutions or come up with new solutions
for your particular organization.
So did you figure out what data mining is thus far? Let's see. "We have a gigantic database full of customer behavior information."
"Excellent. We can use non-linear math and data mining technology to optimize our retail channels!" "If that's the same thing as spam,
we're having a good meeting here."
