- [Interviewer] Hello, and
welcome to Experian's Data Talk,
a time when we get together
each week to talk about
big data and analytics.
At Experian, we believe
that big data is good,
good for our economy, good for consumers,
and good for society.
In today's episode, we
get a chance to talk with
Meta Brown about her book,
"Data Mining for Dummies."
Meta Brown has been helping
organizations use practical
data analysis to solve
everyday business problems.
Meta is a hands-on data
miner who's tackled projects
with up to $900 million at stake.
She holds a master's degree
in Nuclear Engineering from
the Massachusetts Institute of Technology
and a Bachelor of Science degree
in Mathematics from Rutgers University.
You can learn more about
her work by going to
metabrown.com. You can
also follow her on Twitter
at MetaBrown312, that's
@ sign M-E-T-A-B-R-O-W-N-3-1-2.
Meta, thank you so much
for joining our data talk.
- My pleasure.
- [Interviewer] I thought it would be good
if we could maybe start off.
Can you tell us a little bit
about yourself, your work,
and how you got down this
path of becoming a data miner?
- Sure.
I'm a statistician. I studied
classical statistics and engineering.
In the 1990s, my employer
wanted to get into the
data mining business so
I became a data miner,
and I started demonstrating
data mining methods,
teaching classes, writing
training materials.
And for that same reason,
I also began, at that time,
to use and to teach text analytics.
Over the years, I became so
well known for my writing,
my teaching, my speaking,
that I started to coach
other data analysts in
technical communication.
And now, in fact, most
of my clients seek me out
for help improving the
communication between their
technical people, the data analysts,
and managers and customers
who are nontechnical or,
often, differently technical:
they have a different specialty
and don't speak the same language.
- [Interviewer] Meta, when
people ask you about data mining,
how do you define data mining?
- Defining data mining is
actually pretty simple.
Data mining is statistics for cheaters.
The object of data mining
is to empower business people
to be able to discover
useful information from data
independently without having
to lean on others to do that.
The key to that is
providing them with simple ways
to use what are really under the hood
still statistical methods,
but they're using them
without leaning a lot on theory,
or leaning at all on theory.
They're just trying things.
And so the trade-off is that,
while you find information
quickly, you don't have any
theories to back you up,
so you need to use field
testing to balance that out.
- [Interviewer] Meta, I'm
curious: how much
data is needed for data mining?
- Well, a lot of people have
the idea that data mining
and the now popular term
data science, are all about
having a very large quantity
of data, that it has
to be whatever they think of
as big data at that moment.
And that's not necessarily
the case. Data mining is
certainly a term that's
associated with large quantities
of data, and that's not by accident.
It was a really big issue
when data mining tools came
on the scene that the
quantities of data that people
were dealing with were
a lot for the computing
equipment they had at the time to handle.
Aside from the speed issue
and the complexity of
dealing with a lot of
data, the creators of the
data mining software
wanted very much to save
business users as many steps as possible.
So they wanted to eliminate
the sampling step that is
traditionally used when you
have large quantities of data.
But what you really need is
data that's relevant to your
particular problem and that has
the level of detail that you
actually require for the
information that you need.
So let me give an example from
a client that I worked with.
I was approached by a gentleman
who had very, very large
quantities of data and it was
important to him that he
have a solution that was
technically able to handle
that large quantity. But when
I looked at what he was doing,
the end product that he
wanted was pie charts,
and it really wasn't
necessary for him to use every
bit of data that he had to do that.
On the other hand, there were
other things that he could
do to take advantage of
that large volume and get
value from the detail, and
that happens when you're
interested particularly in
reaching out to individuals.
So if you have perhaps a
million or more individuals
who you might want to market
to, and you want to give
each of them personalized
marketing, you can imagine
how the volume really adds up.
So there can be value
to volume for certain
kinds of applications, but
no one should think, "I don't
have a huge amount of data,
data mining's not for me."
It really doesn't work that way.
- [Interviewer] Meta, can
you talk about when you
go and consult at a company
on its data mining and
data analytics? What are
the types of teams or
professionals you usually work with?
- Well, I would say the place
that data mining begins
most often is with the
marketing team, and that's
because, well, I would
say across organizations
the team most likely to
have some awareness of
analytics is marketers. Almost
all people who've reached
a high-level position in
marketing have had some
analytics training, even if
it was just a statistics
class while they worked on
their MBA or a master's in
marketing, whatever it may
be. And throughout their
careers, marketers get a lot
of exposure through things
like testing in direct
marketing, and they may be aware
of a wide variety of other techniques.
Now awareness does vary
from person to person and
(speech distorted)
One marketer told a colleague
of mine, "If I wanted to
deal with math, I wouldn't
have gone into marketing."
(laughs)
So it's not always that
team where things begin.
It can often be engineering
or other production teams
and manufacturers,
consultants of various types,
so it's not the same everywhere.
- [Interviewer] I'm curious,
when you begin to work with
different marketers or
different business units,
are there any common challenges you find
that companies are
struggling with, with data?
- Yes.
There are a lot of common challenges.
One thing is that you have
to begin to get some trust
for working with data,
so if no one's prepared
to use the data in decision making,
that's a big challenge right there,
and in fact it's a very common challenge.
So the best place to start,
for any organization,
is where you have a
matchup of someone who has
some influence, even if it's
just a small little corner
of the organization in
a very low risk project,
someone who's willing to try
taking advantage of analytics,
and then someone who may be
the same person or may be
different people, who are
prepared to do some work
and actually produce some useful numbers.
Now that's not always so
simple because once we have
someone who's willing to use
the information, then we have
to get the data, and getting
your hands on relevant data,
and getting permission
to use it, is not simple.
Now many data miners or
data analysts of any type
will try to work around
proper sources. People resist
working constructively with
IT, and that's a very bad idea
because, for one thing, it's easy
to think that you know what
the data you have is and be mistaken.
It's also a risk that (speech distorted)
or contractual obligations that you have,
so you really need to
get over any discomfort
that you have in working
cooperatively with your
IT teams, your security
teams, to make sure
that you're handling data properly.
Compared to those things,
actually doing the
analysis is some of the easiest stuff.
- [Interviewer] Wow, so a
lot of it has to do with just
working within the teams,
working with the right types
of people, getting people out of their
silos and bringing them together.
(speech distorted)
- Teams is everything. This
is not unique to data analysis,
but it's certainly absolutely
necessary for data analysis.
You have to get the
keepers of the data working
cooperatively with the
analysts of the data,
working cooperatively
with anybody who has to be
involved in making a decision
based on that information.
And that's a minimum. You
may also need to bring in
additional people for special
subject matter expertise.
Maybe you need folks who are
in contact with the customers
who can describe experiences
with them, maybe you even
need to talk directly to customers.
So if you're not willing
to work nicely with others,
you're not going to get
value from your data mining.
- [Interviewer] You know, your
book "Data Mining for Dummies"
is just an awesome handbook
for anybody who wants to
learn more about data mining,
I'm thinking especially
for somebody who's outside
the field, somebody who
maybe is in marketing, maybe
they didn't take statistics
but could really just gain
a lot of knowledge about
data mining just to help
them in their own profession.
Can you talk a little bit
about your latest book,
"Data Mining for Dummies",
and who you've targeted,
like, who is your target
reader for this book?
- Well, the book is written
for the business community.
A business person who
wants to know what's what
on an unfamiliar topic goes
looking for a Dummies book,
and the object is to give them
what they're expecting there:
very introductory
material, not assuming things.
So for example, we don't
assume that you can program,
and don't assume that you've had
even a class in statistics,
though it wouldn't hurt,
and really begin from,
"Okay, here's what data
miners do, here's why
data mining's useful, and this
is what the process is like."
And really all of that you want a sense of
before you start talking about
this or that data mining
technique or algorithm.
A lot of people wanna start
from that point and it can be
very confusing, and not only
hard to understand but really
is not the best way to get your work done.
So you really want to
start from these really
basic fundamentals, even as fundamental
as where does the data come from.
- [Interviewer] In chapter 4
of your book, you write about
the laws of data mining, you
write about the importance of
business goals, data
preparation, modeling patterns,
predictions, is there a
certain law that you cover
that you think gets overlooked the most?
- Well, first let me
explain what the nine laws
of data mining are. These
are principles of data mining
practice. I did not create
them. They were first
created by Tom Khabaza,
who's really a pioneering
data miner. He's the
founding chairman of the
Society of Data Miners,
which is a relatively
new organization. It's
only been around for
a little more than a year.
He was also the
technical editor on the
book, so I made sure that
he approved of the way that
I wrote about his laws.
(laughs)
Now, the fact is that each
of the nine laws is often
overlooked, even the first
law of data mining, which is,
"Business objectives are the
origin of every data mining
solution." I think it's
a very fundamental idea
and one that I think is easy
for most people to grasp
when it's presented to
them. It's the very heart
of the subject of data mining,
but it's often neglected.
And data mining, it's not
for finding information
that's just interesting, it's
about finding information
that helps solve specific
business problems.
So that's why I devoted a
chapter to explaining the
nine laws, and just so everyone
understands what's at stake,
that chapter also includes
a real life example of a
business that spent over
a million dollars on
data mining and wasn't able
to use the final result.
And it's actually a very public story,
so the business is named.
I want everybody to have
that cautionary tale of
just because you invest in data mining
doesn't mean you get good results.
But doing what you need
to do to get good results
is, in fact, not at all magic.
It's a straightforward, methodical process
that anyone can use.
- [Interviewer] Was there any
part of your book, was there
a favorite chapter that
you really enjoyed writing?
- I enjoyed writing chapter
nine, which is "Making New Data."
That's one of four
chapters that are devoted
to understanding the
major sources of data.
Many data miners only
consider using the data
that is easily available to them.
So that might be, depending
on what your job is
and what your skills
are, people tend to go to
their comfort zones,
so that might mean just using
the data that's already
in house, or just using
government sources, or just
using some other source,
and whatever it was, it
wasn't created to address
a specific business problem
that you need to solve.
Sometimes people are choosing
the data based more on,
"Well, I'm comfortable
with using this particular
interface, or I'm
comfortable with web APIs,"
and so that's where they go. But
what's often more worthwhile
is to obtain new data
that addresses your own specific problem,
and using that data provides
a unique competitive
advantage because only you have it.
So that chapter addresses
things like survey research,
loyalty programs, those frequent
flyer and frequent buyer
cards that you have,
and experiments, and you
can conduct experiments.
- [Interviewer] Meta, do you
have any advice for those
that are interested in learning more about
data mining but feel intimidated?
- Of course. Read "Data
Mining for Dummies."
I wrote it for you.
- [Interviewer] And last,
Meta, where could everyone
learn more about you and
the work that you do?
- Well, you can visit my website.
It's www.metabrown,
that's M-E-T-A-B-R-O-W-N.com.
I have links there to
dozens of articles on
analytics and communication.
There's contact information
for me, and I would be
very happy to hear from
anybody that would like
to touch base with me.
- [Interviewer] Well,
Meta, thank you so much
for being our guest in today's Data Talk.
I want to encourage everyone to check out
Meta's website by going to metabrown.com,
and make sure you're following her
on Twitter at MetaBrown312.
I'll make sure to have links
to both her Twitter account
and her website in the about
section of this YouTube video.
I'll also make sure to include
the resources and links
to her book on our Experian
blog and the short URL there
is ex.pn/metabrown and that
will be just a resource page
where this video will be embedded
as well as links to her resources.
I also want to let you know
that we host a data talk
tweet chat every single
Thursday at 5 p.m. Eastern and
we would love for you to come
join our next conversation.
You can find out about
upcoming data talks by going
to experian.com/datatalk.
I want to thank you all again
for joining our data talk
today and we look forward to
talking with you next week.
Take care.
