- Hello and welcome to
Experian's Weekly Data Talk,
a show where we talk
to data science leaders
from around the world.
Today we're actually talking about
a very, very important topic.
In fact Kristen, this
is one of those topics
that we get in our data science community
on Facebook all the time.
People are wondering how do I get started
working in data science
and it's probably the
most prevalent question,
the most asked question,
so I am super excited to
have you as our guest.
Folks, we're talking to Kristen Kehrer,
she's a Senior Data Scientist
at Constant Contact.
She got her BS in
mathematics from Dartmouth
and then she went on to
earn her Master of Science
in Statistics.
She has been working in
the field for a long time.
Very active on LinkedIn,
answering people's questions.
It's an honor to have you
Kristen as our guest today,
thank you so much.
- Oh thank you so much,
I'm so excited to be here.
- And also what's real cool
is our Data Talk series
finally got approved on iTunes
so if you're interested
in subscribing on iTunes,
you can now go to ex.pn/datatalk
and you can find the podcast
on iTunes, Stitcher,
SoundCloud, Spotify, etc. It's all there.
So that's very, very exciting.
So before we get started
Kristen can you kind of
share with us your journey that led you
to begin working as a data scientist?
- Yeah, so I mean it's
probably a long journey
because I finished my
bachelor's degree in 2004
which was a little before
the term was coined,
data scientist, and I didn't know that
I was going to go into data science.
I knew that I didn't want to be a teacher.
I tried a couple of different things
and I found myself in a role
that didn't have a ton
of job opportunity for me
in terms of growth.
And so we're you know
still not at the point
where data science is really a term
but I had seen that people were successful
who had a background in statistics.
I saw a lot of job
opportunities for statisticians
and people who could build models
and so I decided to go back
and get my Master's Degree in statistics
and this was like a very
interesting time too
because this is like 2007, 2008.
The housing bubble is about to burst,
I'm at the time working, doing
like financing things
for a real estate company
and so job security wasn't looking so good
and it just seemed like
the right time to go
and hide in academia.
(laughs)
So I got my Master's Degree from WPI
and the faculty there was very good
about helping me to get
a job after I finished
and so I originally had
thought that I wanted to be
a community college teacher or something.
But this job came around
that my professors suggested to me.
They said hey do you want to go do
econometrics time-series
analysis for NSTAR?
And you know, sure, that sounds cool.
So I was building you know neural nets
to forecast hourly electric load
and that was used for capacity planning
during heat waves, like it was imperative
that they were able to use
the output of that model
to determine how they were
going to allot capacity
and how we were going to
keep everybody's lights on.
And I built a lot of time
series analysis models
using ARIMA predominantly
and that has been so valuable
throughout my career.
Any job that I've worked at afterwards
has wanted to know how trends are
trending over time and
how can we forecast that
and sort of what insights
can we gain from that?
So I highly suggest anybody
who has the opportunity,
to learn time series analysis
but from there I realized that there was
a whole world of analytics
and that I was taking part
in just one small piece.
Like I was building cool models
but it was one small piece
and so I moved to the
broader analytics area
where at this time, I don't know,
maybe data science was a term
but I referred to it as advanced analytics
and so I was using a little bit of coding
and along the way I picked up SQL
and was building models
to drive business value
and I've worked in a
couple different industries
and now I'm at Constant
Contact doing data science.
(laughs)
- So at what point, like
you were doing data science
before the term was kind of coined
like the sexiest job, right?
At what point did you decide like
this is the career field for me?
Like was there a certain
project you're working on
or something that you were doing
that you're like yeah
this is something that
I'm really enjoying,
I'm not going to pursue academia
anymore to be a professor.
I want to stay in business.
- Yeah, so I'm incredibly
passionate about mathematics
and I absolutely love statistics
and I don't know if
it's because I'm a woman
but I've always sort of felt
like I had something to prove,
like I'm not gonna lie.
In that first job where I
was building neural nets
and I also did like a mathematical proof
to show why our choice of T values
greater than one in our model
would reduce the forecasting error
and therefore made it
a more efficient model
because we had to make these
submissions to the DPU.
And I've always felt in the roles
that I've been in highly
valued, I felt respected,
I felt that other people
value my intelligence,
and that was something
that was important to me
was that I was able to use my brain,
think about problems in a
way that a lot of times,
especially now more recently in my career
is really out of the box.
So all of those things are
really attractive to me.
The fact that I have to really think
that it's a skillset that not everyone has
and that not everyone
can both build models
and easily distill that
information for the business
and really advocate for ways
that we could be thinking about things
that could optimize a
process or add value.
- So today's topic is very, very important
to our data science community
because there's so many people
who are just graduating from college
or just finishing up
some certificate courses
in something around data science
and one of the questions we get a lot is
what are the most valuable skills to learn
to help prepare you for a
job working in data science?
- Yeah, so I don't think
we talked about it enough
but SQL is you know.
If you have finished a degree
in a quantitative field
or you've taken some MOOCs
and you haven't yet learned
SQL like you need it.
It is a non-starter
because the majority of
the models that you're
gonna build and the analysis
that you're gonna do is gonna be on data
that's probably living
in a data warehouse.
And you may be working
with big data technologies
but even as an example, Hive,
is also a structured query language
so it's applicable there too.
And in the real world we
don't just get datasets
that are handed to us that are clean.
In the real world we are
joining across different tables
in a database to structure the data
the way that we need it to be for analysis
and so if you don't have
that ability to get in
and self-serve and get that data yourself,
you're gonna be really hindered.
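As a minimal sketch of the kind of join described here, assuming two hypothetical tables (customers and orders) that are not from the conversation, Python's built-in sqlite3 module can create the tables in memory and join them into one analysis-ready result:

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, made-up tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, segment TEXT);
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            amount REAL
        );
        INSERT INTO customers VALUES
            (1, 'small business'), (2, 'nonprofit'), (3, 'small business');
        INSERT INTO orders VALUES
            (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
        """
    )
    return conn


def spend_per_customer(conn):
    """Join customers to orders and return one row per customer."""
    query = """
        SELECT c.customer_id,
               c.segment,
               COUNT(o.order_id)            AS n_orders,
               COALESCE(SUM(o.amount), 0.0) AS total_spend
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.customer_id
        GROUP BY c.customer_id, c.segment
        ORDER BY c.customer_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in spend_per_customer(build_demo_db()):
        print(row)
```

The LEFT JOIN keeps customers who have no orders, which is often what a modeling dataset needs; an inner join would silently drop them.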
- Yeah, I don't think I've heard a lot of,
I mean SQL has been mentioned
in previous broadcasts
but a lot of times people
are focused on talking about
programming languages like R and Python
and I'm really glad
you're talking about SQL
as being one of the most
important languages to learn
to help you with their structured data.
- Yep, I mean because of course
I'm using R and Python every day
because I have a problem
where I can't make a choice.
All my models were built in R
up until probably six months ago
and then I started
making the move to Python
and so now I've been doing
a lot of coding in Python
but I still leverage R
every once in a while
because I just can't
completely let go of it.
(laughs)
And sometimes I use rpy2 too
to call R through Python
but I think for being
able to walk into a job
when it is your day one like
what are you gonna be doing?
They're gonna say here's our database
and this is what you're gonna be using
to get the data to make your models.
So I think as much as we like
to talk about cool technology
and we certainly can do a lot
of cool things, like, SQL, man.
SQL. (laughs)
- All right, you heard it from Kristen.
SQL, man, I'm gonna quote you on that.
(laughing)
It's all about SQL, man.
(laughs)
No, that's good.
I think that's really valuable
advice for our community
'cause like you said day one,
what are you gonna be doing?
Here's our datasets, you gotta
start playing around with it.
You need to know SQL
and I don't think that is talked about a lot
and that's something that
we don't talk a lot about
here on Data Talk so thank
you so much for sharing that.
Recruiters get really
inundated with applications
and I'm on LinkedIn
and pretty active there
so whenever there's jobs,
like even here at Experian
for data science roles,
I'll get a lot of messages on LinkedIn.
Hey, can you share my
profile with a recruiter
and I'll talk to
recruiters and they'll say
they get like 300 resumes in a
week for a data science role.
What advice would you have
for somebody who is brand new,
just starting out,
they don't have any work
experience to speak of,
how can they stand out
in this pile of resumes
or LinkedIn profiles that
recruiters are looking at?
What can help them?
- Yeah, so I think there's definitely ways
that you can leverage LinkedIn in terms of
I don't think people think enough about,
like you can go and
connect with a recruiter
and comment on his stuff
and not just when he or she
has a job posting available.
Like proactively go and
engage in conversation
and add value for that person
so that when you do go
and you send him or her
a message on LinkedIn,
they know who you are.
You're not just another random person
who's messaging him.
You're the girl that has
been talking in his comments
and he's interacted with you.
And then the way that you
position yourself on LinkedIn
when you're reaching out to these people
is you really want to make sure
that you're not talking about what it is
that you're looking for.
You want to talk about how you believe
that you fit that role.
So if you reach out and you say...
I'm trying to think of how I say it
because on my last job hunt
I absolutely was leveraging LinkedIn
but you know you reach
out and you say hey,
I've noticed this position.
I'm a person who is
skilled in Python, SQL,
and mention some different
types of modeling
that you've done and it can
be from your coursework,
you don't have to say
it's from your coursework
but you just want to make
sure that you get a response
and that you get noticed, right?
So we just want to position
ourselves differently.
Showing that you can because if you can
write some Python and you can write some R
and you can do some SQL and
you can do some modeling,
you don't need to say like
oh I learned this in school.
You can just say hey, I'm so-and-so.
I saw this position open
and I'd love the opportunity to speak with
the correct person.
I'm looking to get my
resume in the right hands
and I have experience with
Python, SQL, data analysis,
and building models.
And I hope you'll forward my resume
to the appropriate person.
But I think that...
it's not going to work all the time
but I think that there's some things
that you can be aware of and
really think strategically
when you are going to
reach out to these people
and if you can try and
build a relationship
before reaching out that's all the better.
- I've seen some different
comments and blogs,
people talking about the value of
posting your portfolio on GitHub
but I've also read some
blog articles about
GitHub is pointless for
putting your portfolio.
Kind of curious about your thoughts on
where to put your portfolio
work as a data scientist.
- Yeah, so I mean I have
an enterprise GitHub
so if you were to check my personal GitHub
I don't even have my projects there.
However, I do have a blog
and so for anyone who's looking to
really highlight their
skills and abilities,
if you want to go ahead
and do it, start a blog.
It is incredible. So each
time you write an article,
post it on LinkedIn, post it on Twitter.
Get people's eyes on your blog.
Don't just post it and leave it there
but share that article on LinkedIn.
Not only because GitHub is for code, right?
And you know some comments and a readme
but demonstrating that you were able to
do these technical things
and that you were able to talk to them
in a way that other people can read
is a highly valued skill.
I only started my blog in March
and it has opened up a ton
of opportunities for me.
Like I can't even speak to it enough.
- That's awesome to hear.
We just actually got a
question here on Facebook Live
from a Software Engineer and he's asking,
this Operator Engineer
is looking to break into
the data science world
and what do you suggest
for professionals like us?
- Yeah, so I get that question a lot,
especially from people who
have just finished a CS degree
and they want to get a
job in data science.
And my thought is always to take a CS job
because they're no joke, they pay well.
You can find fantastic jobs
in Computer Engineering.
It's not talked about as much as like,
they talk about data
science is a hot topic
but you know people are
clawing to get at developers.
And as you're working as a developer
you can take some MOOCs at night
to get that machine learning piece
and then when you go to
position things on your resume,
there's certain things
that you're already doing
that are of huge value to a business
if they are looking for a data scientist.
Things like automating processes.
So you're able to take MOOCs at night
and learn a couple skills
and put them on your resume
and then just start applying
and working on the way that
you're marketing yourself
because the marketing yourself
for a data science position
is like a huge piece of it.
But yeah, I think with a CS background
you're in a fantastic spot
to hop into the field.
My boss was from a CS background.
I know a number of people
from CS backgrounds
who make the move and having the CS first
kind of gives you a leg up.
- You just mentioned about
just the importance of
continuing to learn and
I was kind of curious.
I do see lots of posts on LinkedIn
about different data science
boot camps that are out there,
courses on Udemy, Coursera.
I'm kind of curious about your view of
those different types of courses
and how you viewed the
certifications of those courses.
Are those valuable?
- So I don't necessarily
see the certification,
like the paper itself as valuable.
Certainly add it to your
LinkedIn because why not?
Maybe don't add it to your resume
because that is prime real estate
that you need to really
think strategically.
You only have one page.
I don't like it when
I see two page resumes
but in terms of the courses themselves,
yeah, highly valuable.
I actually had asked
on LinkedIn a while ago
for people to suggest their favorite ones
and there's a Python A to Z
course that everyone recommends.
So you can see the courses
that other people recommend.
Right, social proof.
And try and take ones that are good
because some of them, not
all courses are created equal
but I've personally taken a MOOC on Git.
I've taken a MOOC on Python.
And when I decided to make
the hop from R to Python
I started with a MOOC.
I also used Codewars which is free
but I was able to learn just a little bit
about web scraping, about
writing to a database.
Now I have been using databases for years
and I had helped move data
that was not in a great schema
or there was just like a
bunch of snowflake tables.
I had helped in the transition of how to
structure that data as we
moved over to a star schema
but I had never actually
written to a database.
So I feel like I pick up
little nuggets of awesomeness
sort of everywhere.
And even this far in my career
I'm still always every month or two
taking a new MOOC to just I don't know.
I just find them fascinating.
I love that after my kids go to sleep
I can watch some video
and learn something new
but that's just part of my personality.
- Is there certain online classrooms
you recommend over others
because like I said
there are so many boot
camps that are out there
and I was kind of curious when you did
your question on LinkedIn
to your community
and you just mentioned
one of those courses,
was that on Coursera?
Was it on Udemy?
Do you happen to remember?
- So the course that I took,
Python for Everybody, is on Coursera.
Most of the courses that I
have taken are on Coursera
'cause to me, I always liked the fact
that it was coming from
an accredited university
but at the same time I've
heard other people's opinions
on other courses as well
and I can't say that I know all there is
about sort of the landscape
and who is out there.
I just know that they are valuable.
I actually thought about taking a bootcamp
the last time I was switching jobs.
I was thinking about becoming
a full-stack developer
but of course, I stayed in data science
because that's sort of where my heart is
but just the idea of
taking a couple weeks off
the job search and learning something new
just sounded like so much fun.
But yeah definitely
sorry, can't help anymore
with helping to narrow down
the most useful platforms
but I have always had luck with Coursera
and it's worked for me.
- Nice.
I've got another question
here on Facebook Live.
Hey Kristen, I want to know
what are some of the tools
that I can learn and
practice SQL and T-SQL from?
- Yeah, so I think...
So the first option is
always to take a MOOC.
Secondly, I think a lot of
people don't realize that SQLite
is open source and so
you can download that.
You can find different data sources
whether you go into Kaggle.
Some of those are a lot
of times really large.
I'm trying to think of I'd seen
a really great data website
before that it aggregated a ton of...
Kyle McHugh on LinkedIn.
He has an article that he lists like
20 free online courses or something
or 20 data science resources and number 20
was a website that had a ton of free data.
That it was just there and
it was like this huge site
this person had like aggregated
a bunch of data resources.
But yeah, I mean you can set up SQLite,
you can download data,
and now you are playing in there for free.
And of course there's YouTube tutorials
if you didn't want to actually take a MOOC
but there are absolutely MOOCs out there
for specifically learning SQL and T-SQL.
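A rough sketch of that practice setup, assuming a small made-up CSV stands in for a downloaded dataset (the table name, column names, and values below are hypothetical), using only Python's standard library:

```python
import csv
import io
import sqlite3

# Stand-in for a downloaded CSV file; columns and values are invented.
SAMPLE_CSV = """city,month,load_mwh
Boston,Jan,120.5
Boston,Feb,110.0
Worcester,Jan,60.25
Worcester,Feb,58.75
"""


def load_csv(conn, csv_text):
    """Create a practice table and load the CSV rows into it."""
    reader = csv.DictReader(io.StringIO(csv_text))
    conn.execute(
        "CREATE TABLE practice_data (city TEXT, month TEXT, load_mwh REAL)"
    )
    rows = [(r["city"], r["month"], float(r["load_mwh"])) for r in reader]
    # Parameterized inserts avoid building SQL strings by hand.
    conn.executemany("INSERT INTO practice_data VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def average_load_by_city(conn):
    """A first practice query: aggregate with GROUP BY."""
    return conn.execute(
        "SELECT city, AVG(load_mwh) FROM practice_data "
        "GROUP BY city ORDER BY city"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # pass a file path to keep the data
    print(load_csv(conn, SAMPLE_CSV), "rows loaded")
    print(average_load_by_city(conn))
```

SQLite speaks its own SQL dialect, so this covers general SQL practice; T-SQL-specific syntax would still need SQL Server.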
- Awesome, so what are some
common interview questions
that a new data scientist
should be prepared to answer
when going in for a job interview?
- Absolutely, I mean when
you get the phone screen
and you pick up the phone
they're gonna say hi Kristen,
is this still a good time to talk?
And you say yes and then
they're gonna say okay, great.
I have your resume in front of me,
can you please tell me a
little bit about yourself?
And so here you don't want
to give away the farm.
You're looking to show that you can
explain who you are in
a concise sort of way.
I typically go with
something along the lines of
I'm a Data Scientist with
eight years of experience
working across healthcare,
the utility industry,
and I'm currently in E-commerce
and I have a Master's Degree in statistics
and a ton of experience
building different models
that I'd like to tell you about.
It's just like three lines.
You know so that was sort of off the cuff.
I haven't interviewed
in like over six months
so not my best work but you get the idea.
- Yeah, yeah, yeah.
(laughs)
It's like your elevator
pitch, I like that.
- Yeah, yeah, you have to
have the elevator pitch ready.
And then the other questions,
I mean there's really four specific ones
and I'm sure I'll only
be able to remember three
but tell me about a time
with a difficult stakeholder
and how it was resolved.
Tell me about a time that you
explained technical material
to a non-technical audience.
Yeah, can't remember the other two
but they're in one of my blog posts
and I was asked these same
questions over and over again
and so those are the behavioral questions
that you're supposed to
answer in the STAR format.
So you clearly give sort of
some context and background,
talk about the problem,
talk about the solution,
and the results and then you
know of course end it with
and that was a time that I worked
with a difficult stakeholder.
- And I'm gonna make sure
after this episode is over
to link over to that blog
post that you just mentioned
with those four questions.
And for those listening to the podcast,
the URL is ex.pn/datatalk55
and that is a place where I'll go ahead
and put the links to
Kristen's article for those
that are interested in
those behavioral questions.
Those four key questions
get asked quite a bit.
I think those will be
really valuable to go over
and I think the one that you mentioned
that I thought is really
interesting is about
the communication aspect.
Communicating something
that's very technical
to somebody who's not
familiar with that jargon.
Can you talk about how
important that communication is
for the data scientist?
- Oh my god, it's so incredibly important.
Like I pretty much promise
you that if you have
four interviews with
four different companies
you will be asked this question
at least once if not more.
And I think that when we
talk about data science
a lot of times if somebody said to you
what's a data scientist?
A lot of people are gonna say well,
it's somebody who writes code.
Maybe production level code.
It's somebody who builds models.
It's somebody who does analysis
but like the big pillar of
that is also business acumen.
And I see data science as
a very cross-functional
interdisciplinary field
where I am routinely working
across all sorts of departments
to understand their needs
so that those can be inputs into my model
because if I build out this
beautiful cluster analysis
but I hadn't talked to
other areas of the business,
it may not be something
that they want or need.
So you're getting buy-in
first across the organization
and having that buy-in is
what allows you to have value.
And then at the end after I build a model
I'm always presenting
that model afterwards.
And I'm not talking about
the Fourier transforms
that were used to determine whether or not
a customer was seasonal.
I'm talking about what
percentage of our customer base
was identified to have seasonal patterns
and what do these seasonal
customers look like?
And so when I answer
the technical question
myself on the interview
I typically start with,
I give this example from when
I was working at Vistaprint.
So at Vistaprint I was asked to do
a behavioral cluster analysis
of our digital customer base
but when it came time to
talk to the stakeholders
I brought it up a level
and so I'm clearly telling
the person that I'm talking to
that I was bringing it up a level.
I was like I brought it up a level.
I wasn't talking about the methodology.
I was talking about the
size of the opportunity
and the behaviors that
each of those clusters had.
So I wasn't talking about
hierarchical clustering
or PAM like I was talking about
we have this group of high spenders
who are very highly engaged,
they're utilizing all of
the resources at their disposal
to get in contact with us
if they need help and other things.
And talking about you
know the other groups
and so that's you know
the picture that I paint
during the interview for this person
when I'm asked how did I explain
that technical concept
to a non-technical person
and the answer is I brought it up a level
and I talked about the
opportunity yada, yada, yada.
- I love that, I love that.
I mean that's a
huge skill to talk about
something very, very technical
and then bring it up to a level
where what you're saying is gonna be
vitally important to the business leaders.
Because they don't care
necessarily about the model
or algorithms you're working with.
They care about the insights
and how it's gonna help
grow their business, right?
And so to be able to talk to that level
is gonna be crucial for your success
as a data scientist.
Have you ever while presenting
had to deal with pushback
from leaders, like they didn't
agree with your results?
Did that ever happen?
- I'm trying to think.
I mean there's always
questions and feedback
especially when you are
presenting to senior leadership
and you're a data scientist
and you report up through maybe marketing,
maybe a different department, or whatever
but if you are presenting
to an area of the business
and senior leadership that's sort of
outside of your wheelhouse,
like they're absolutely
going to have questions
about things that you
maybe haven't thought of.
And that comes back to
getting that widespread buy-in
that'll help to mitigate
some of that proactively
but absolutely.
People are going to have questions
and they're going to challenge you
and that doesn't mean that
what you did was wrong.
It's an opportunity to look
at your analysis another way
and maybe improve it or maybe
add an extra dimension to it
to help fully explain the story.
- Well I know our time is up.
I have just one last question
and I actually found
this question on Quora
that a bunch of people were wondering about.
And the question was
what do experienced
data scientists know
that beginner data scientists don't know?
- So much.
(laughs)
Well because I speak to a lot of people
and when I've hired in the past,
yeah you just don't
realize how much you know
until you have to break it down.
Especially since we were
already talking about presentations,
I think that is one of the big pain points
is that most people out of school
who start their job in data science,
they don't know how to
write a deck effectively.
It's not one of those things
that was covered in school.
Maybe if their background is in business,
maybe you got it but if you
came out of a statistics
or CS background I'd bet the farm
that your first deck
isn't gonna be pretty.
And it's that same idea.
It's that we're not...
Like in grad school we
spend a lot of slides,
talking about the methodology that we use
because that was correct for that audience
and now in business the
audience has changed
and our presentations need to
be structured to reflect that.
So starting with a high level overview.
Talking about only the
insights that are important
and not necessarily the
details that went into it
and finishing with a summary
and talking about your next steps
and including nice visuals
and maybe using a branded
template and the verbiage.
But yeah I've seen some
really bad presentations
and if I was going to pick one thing
because people will pick
up the technical things.
Maybe you haven't had the opportunity to
build the type of model
that is the best solution
for the project at hand.
But there's also opportunities where
I'm gonna be leveraging new methodology
and I'm gonna research that
even though I'm nine years in.
So that's something that's
constantly growing and evolving.
But yeah, presentation skills.
- So this has been a
wonderful discussion Kristen.
Super valuable to me, to our community
because there are so many people who are
trying to get their foot in the door
to become a data scientist
and all the things you've
shared in today's episode
is just super valuable.
I want to let everybody know that
Kristen is available on LinkedIn.
Kristen where can people
follow you, reach out to you?
Where is the best spot that
you'd like for them to go?
- Yeah, so I have a blog.
It is Data Moves Me
and so I'm pretty active on my blog
and also you can find me on LinkedIn
and I'd say that's probably
where I'm the most active.
- Okay, is it datamovesme.com?
- Yes.
- Okay, perfect.
I just put that URL on the screen.
Check Kristen out, engage
with her posts there,
and like I said I'm gonna be
putting a link to her LinkedIn
profile on the Experian blog
and again the URL is just ex.pn/datatalk55
and we'll have a link
to her LinkedIn profile,
to her website, as well as her article
that she wrote about the common questions
that you'll get when you're interviewing
to become a data scientist.
Kristen, thank you so
much for your time today.
We actually got tons of
questions in the queue.
Unfortunately we couldn't get to them all.
So if you do have questions for Kristen
please go to her blog, post
them there, reach out to her,
and network with her, and
follow her definitely.
Kristen thank you so
much for your time today.
- All righty, thank you
so much for having me.
I really appreciate it.
- Okay, take care.
- Okay, you too.
- Take care.
