Santona Tuli:
You get to shape your path.
You get to do what you want to do.
If you're curious, if you are excited and want
to try things and want to learn things,
things will fall into place and
you will figure it out.
And that is not to say that
I haven't had setbacks or failures.
It's just about how you look at it and
it's about getting back up and keeping going.
Harpreet Sahota:
What's up, everyone? Welcome to another episode
of The Artists of Data Science.
Be sure to follow the show on
Instagram @theartistofDatascience and on Twitter at
@ArtistsOfData. I'll be sharing awesome tips and wisdom
on data science as well as clips
from the show. Join the free
open Mastermind Slack community by going to
bitly.com/artistsofdatascience where I'll keep you
updated on biweekly open office hours
that I'll be hosting for the community.
I'm your host Harpreet Sahota.
Let's ride this beat out
into another awesome episode.
And don't forget to subscribe,
rate, and review the show.
Our guest today is a physicist and data
scientist who loves delving deep into data to
learn insights that may be hidden by noise.
She's earned a bachelor's in physics and
mathematics from Trinity University and has gone
on to a PhD in physics specializing
in nuclear science and quantum chromodynamics from
the University of California, Davis.
She currently leads a team of five
doctoral and post-doctoral physicists studying a new
plasma phase of matter and the elusive nuclear
effects in high-energy proton and nucleus
collisions at the Large Hadron Collider
at CERN in Geneva, Switzerland.
She's got a knack for thoughtful feature
engineering to extract maximum value from
data while simultaneously reducing
the data significantly.
She also emphasizes avoiding overfitting, identifying
systematic bias, and validating all
results. Her favorite part of data science?
It's all of it, and she enjoys end-to-end
project oversight and everything from designing
and developing to testing and productionizing
statistical data analysis pipelines.
So please help me in
welcoming our guest today.
A woman who is excited by
decision-intelligent data science, Dr.
Santona Tuli.
Dr. Tuli, thank you so much for taking time
out of your schedule to be here today.
I really appreciate you coming
on to the show.
Santona Tuli:
It's an absolute pleasure to be here.
Thank you for having me.
Harpreet Sahota:
Talk to me a bit about your path
into data science or what sparked your interest.
Where did you start?
How did you get to where you are today?
Santona Tuli:
Yeah, absolutely. So I was on a path to
studying the fundamentals of our universe, but it
turned out that I was also
on a path to Data science.
So since a lot of physics is trying
to explain phenomena that we observe, it involves
making a lot of observations.
Right. So that's collecting data and
then identifying patterns in the data.
And that's the analysis aspect.
So by doing physics with massive datasets
from particle collisions, which we will go
into, I have been doing data science for the
last several years and that's been a lot of
fun.
Harpreet Sahota:
Ok, so I gotta ask, what
the heck is quantum chromodynamics?
Santona Tuli:
So it's a theory of the strong nuclear force.
So there are four
fundamental forces in nature.
And one of them is nuclear physics.
Just like gravity and electromagnetism, the strong nuclear
force is another force, and it acts
at very, very short scales,
like inside a proton.
So quantum chromodynamics is the model, if
you will, the theory that describes the
behavior of this force.
Harpreet Sahota:
That is very, very
interesting, very fascinating stuff.
How does all this tie into data science?
How do you see data science affecting the study
of nuclear forces in, you know, the next
two to five years?
Santona Tuli:
So data science in our field has been around
since before data science was kind of hot,
right? So we've had these mysteries in
particle and nuclear physics, trying to understand
why the universe is as it is.
And we built these large particle accelerators
and detectors and produced massive amounts
of data. So without very clever data science,
there's no way we could be looking into
that data and coming up with
the answers to these questions.
It plays in as we go, and as we
evolve our techniques and borrow from what other
practitioners are practicing
around the world,
we can only get better at
going through that data and answering those
fundamental physics questions.
Harpreet Sahota:
So what do you think is going to be the
next big thing in data science in the next two
to five years?
Santona Tuli:
That's a really tough one.
So I can think of three ways of
responding to that. In terms of technology,
I think NLP, natural language processing, and
like semantic understanding, graphs, and using
machine learning, is booming.
And so we're sort of in a similar place
to where computer vision was booming maybe five years
ago. So in terms of the field of data
science, that's where my thought goes. In
terms of the big picture and what's
going to be the next big thing,
I think it's going to be empathy, particularly
for data science: ethics, making sure we
mitigate our biases.
That's gonna be a very,
very important thing going forward.
So there have been more and more
discussions recently about thinking through data science
projects holistically and not just as
a chain of discrete engineering tasks.
And so when you think about things holistically,
you're better able to think about their
impact, the impact that they're going to
have on real people on this planet.
And then third, and finally, in terms of
applications, I am really excited about data
science in AgTech and sustainability.
Of course, there have been lots of advancements
in healthcare, which has been
wonderful to follow, and in EdTech, another,
you know, field where there's
a lot of ongoing work, especially with NLP, like
with a lot of text documents that can be
used to make education
better and more accessible.
So those are very, very interesting as well.
Data science applications are often about increasing
efficiency by teasing out more value
from current systems, like systems
that are already in place.
We apply data science, supplement these systems
with insights from data, and make them
more efficient. And there is huge potential to
do this in the space of agriculture
globally, I think. So that's the thing
that I'm most looking forward to.
Harpreet Sahota:
So we've talked about some of the positive applications
of data science in the next two to
five years. You mentioned a few right there.
But what do you think would be the
scariest applications of data science and machine
learning in the next two to five years?
Santona Tuli:
I'm going to answer that by sort
of weaving together my previous two answers.
So data science will have a positive impact
in the immediate future by not having
negative impacts. So by becoming more introspective,
data science applications can have a
ton of positive impact everywhere,
as I said, in all of these different fields.
So we've seen examples of machine learning and
AI that try to mimic sort of human
behavior by learning from data,
and in that they have amplified some of
our isms, right?
We've seen those things happen, and
that's sort of one of the worst things
that can happen: the AI or the
machine learning algorithm sort of learns the worst
aspects of us as a society
and then somehow magnifies that.
So in my mind, that's the scariest timeline,
right, where we collectively fail to be
responsible and conscientious human beings.
And so by being cognizant of that, by
being empathetic and ethical, I think we can
definitely avoid that.
And you'll notice that I'm
talking about us humans as data scientists,
not about, you know, AI. I'm
not scared of the artificial AI overlords.
I'm scared about how we approach this field.
Harpreet Sahota:
What can we start doing today to
become more empathetic, more conscientious data
scientists?
Santona Tuli:
Introspection, just in very general terms.
Every project that we work on, just thinking
through where the data's coming from, why the
data looks as it does. So, identifying our
biases starting from the data collection level.
Like, am I... This problem that I'm trying to solve,
I have to be able to solve it for
everyone. Let's say, you know,
for the product market fit there is a
certain subset that I'm targeting.
But at the same time, within that population,
I shouldn't discriminate in
any way. So thinking about where my training data
is coming from and why it looks the way
it does. Balancing classes if that's
something that needs to be done.
Going out of our way to
get more balanced and less biased data.
Starting from there and then just throughout
the entire process, really thinking through,
asking ourselves the question, why
are we making this choice?
Does it make sense? Is it going to
have a positive impact? If not, rethink.
Harpreet Sahota:
So you may have covered this in your response
right now, but I'm curious, you know, what
you think will separate the great Data scientists
from the good ones in this vision of
the future that you have?
Santona Tuli:
We often think of greatness in terms of
success metrics, like, you know, if
you've gotten five promotions in the last five
years then you're a great data scientist.
But in order to separate truly great data
scientists, we have to think about what they're
doing, their actions and their willingness to say
no when asked to work on something, you
know, that doesn't jibe with
their values and views.
So, yes, I think this is
the way I'm framing things.
As a whole, we have to think about it
holistically and we have to sort of be
introspective and look forward.
And by doing those things, not only can we
benefit society, but we can also set ourselves
apart as great scientists
or great data scientists.
Harpreet Sahota:
Absolutely love that response. I 100
percent agree with that as well.
We need to be more thoughtful about the way
we're doing our work and the implications that
it will have downstream. Even if you don't think
that your end result, whatever product it
is that you're building, is
actually gonna have an effect,
you should still consider the
effects that it has.
I'm really interested to get into the
work that you're doing at CERN.
First, tell us, what is CERN?
Santona Tuli:
So CERN stands for the
European Organization for Nuclear Research.
It's a weird acronym, I know, because
it's an acronym from the French term.
And this is a huge research complex.
You can sort of think of it as a parallel to
NASA, only to the extent that it is a huge
research lab facility that is
focused on one sector of physics.
So like NASA is about
space and sending astronauts out,
we at CERN are interested in going sort of
inward, going to the smallest scales and
figuring out what, fundamentally,
things are made of.
We have a large hadron collider.
It's called the Large Hadron Collider.
It is a circular particle accelerator.
It's 27 kilometers in circumference.
So it's very large, and there are
towns that live on top of this.
So the accelerator ring is underground,
roughly 100 meters underground on average.
So, you know,
it straddles France and Switzerland.
So there are, you know, Swiss towns and
French towns just like above ground.
And people have no idea that
this thing is underneath them,
in some cases. But so what we do is
we accelerate particles. We use this massive ring in
order to keep adding
speed to little particles
until they get to very high
energies, and that's when we collide them.
Hence collider. And then we study
the product of those collisions.
So we try to take a snapshot of
the collision by building particle detectors around the
collision point. So those are constituted of
hundreds of millions of sensors.
Very simple sensors in the sense that, you
know, it's like a bit reading out whether
there was something that the
sensor picked up or not.
And, you know, when you have hundreds of
millions of those, you can put different signals
and different sensor readouts together in order to
get deeper knowledge about what was actually
happening in that collision.
So that's what we try to do.
We try to capture what happened and then
analyze that data downstream and really figure
out why particles interact the way they
do, why forces interact with each other,
or why forces are applied in
the way that they are.
Harpreet Sahota:
So speaking of particles, while I was doing
research on you, I came across this concept
of this Y particle.
What is this Y particle?
Santona Tuli:
The Upsilon particle.
It's called Upsilon. And so the Greek
symbol for that looks like a Y.
So we often just write Y for this.
So this Upsilon particle is a
meson, which means something to physicists.
But basically it's a particle that's made
up of two quarks that are oppositely
charged. And more specifically, they're
a quark and an antiquark.
So quarks are fundamental particles.
If you go inside a proton, which we don't
really think of, going inside a proton, because,
you know, it's like an elementary particle and
it makes up the nucleus along with
neutrons, which makes up an atom,
and atoms are building blocks.
So that's sort of how we learn
about how things are made up.
But if you actually look inside a proton,
you'll find that that also has building blocks.
And those are quarks.
So there are a few different types of quarks,
and a lot of them have interesting names.
There's top and bottom and charm and strange.
So the Upsilon particle is made up of
a bottom quark and an anti-bottom quark.
And it's a very heavy particle.
So it doesn't exist
normally. It's only created in
these very high energy collisions.
And we try to really leverage that in order
to see how this particle behaves and what it
can tell us about the early universe.
Harpreet Sahota:
Are you an aspiring data scientist struggling to
break into the field? Well, then check out
DSDJ.Co/artist to reserve your spot for a free
informational webinar on how you can break
into the field. This will be filled with
amazing tips that are specifically designed to
help you land your first job.
Check it out, DSDJ.CO/artist
That's super fascinating.
What do all these things have
to do with data science and machine learning?
Like, you know, talk to us a bit about
how you are applying it to solve these
problems. Like, what type of problems are you
solving, how are you using data science to
make progress against them?
What's your workflow like in this space?
How do you go from data to decisions?
Santona Tuli:
Yeah, absolutely.
So we collect these particles
and we have all this data.
Right. So we sort of covered that.
The data is readouts from the sensors.
And in real time, when the particle
collisions are happening, we're actually generating
petabytes of data per second.
So that is a lot.
We're, of course, not able to store all
of that or even process it in real time.
We have, you know, bandwidth limitations on
how much data we can siphon off.
So a lot of data science, more on the
data engineering side, sort of goes into building
out those filtering algorithms that are going to
decide what to siphon off and filter on
in real time from these collisions.
As a simple example, in one particular
collision event, we might get thousands of
ordinary particles, let's say
electrons, being created.
Now, that's not a super interesting
event because we see electrons every day.
That's not what we're after. Right.
Remember, the Upsilon particle, in my case,
for my team, that's what we're after.
So what we try to do is figure out
what sort of signature an Upsilon particle will leave
behind and use that to filter on this massive
data so that we're only keeping the events
where the Upsilon particle
may have been created.
Now, one of the cool things about this
field, the physics applications of data science,
is that we never have labeled data.
So we have huge
data, but none of it is ever labeled because the
whole idea is we're trying to figure out
what was created and, you know, what's
going on. We have to have a very thorough
understanding of both the physics of what we're
looking at and also the data.
So in terms of what is my capability with
my sensors, with my particle detector, what am
I going to pick up? What
am I going to leave out?
How do I compensate for what
I'm not able to collect?
So really thinking through
how the data is biased in the
first place and working to compensate
for it and correct for those effects.
So to me, that's
one of the interesting bits.
And then, so starting with building these algorithms
to collect the data, all the way
through, we reduce the data.
Still, even after we've filtered on the
right data, as a data scientist
you'll know that the signal and noise
still have to be sort of separated.
And there are going to be
regions where
the signal-to-noise ratio is greater,
and there can be selection cuts that we can
make to strongly leverage that.
So all of that is one aspect. To me,
I really enjoy the
feature engineering aspect,
actually. Personally, like, ML modelling is fun, and sort
of the end product is fun as
well. But what you feed into the
machine learning model, of course, is extremely
important. And so spending that time feature
engineering, really looking at my data,
thinking about the physics behind it,
thinking about my constraints and finding the best
features that are going to be useful
to me, and like putting different features together
to make better features and so on and
so forth. So that's actually, I would say, probably 60
to 70 percent of what I do on one of
these projects. And then ML models.
So for machine learning at CERN, we typically tend
to use simple models, more like what is
called traditional machine learning.
So a lot of
regression, classification, clustering.
But more recently, we have had efforts to
look into like deep learning techniques
for the same sorts of analysis.
But again, the fact that our data
isn't labeled throws in a wrench.
And then the other thing is that we tend
to only say we have seen something when we're
absolutely sure we've seen it.
So you may have heard of this,
the five sigma rule in particle physics.
So it's only when we have very high
statistical confidence that we're going to go ahead
and actually say that, OK, we've seen the
Higgs boson or this particle or that particle.
So that's another reason why thinking through the
machine learning algorithm and having
it be very explainable
is important to us.
So a lot of thought goes into
the machine learning models as well.
And we try to err on the side of
simple rather than complicated. And then processing
whatever is put out: again, just like
you can't feed just any data into a machine
learning model, you have to put forethought into
how you're going to shape the data
on the other end as well. When something is
spit out by that model, that still doesn't
mean anything. Right. You have to transform that
into an actual indicator that means
something that other stakeholders are
going to care about.
So that's the other end of the process.
And as you said in my intro, I really enjoy,
you know, the whole end-to-end process and
every part of it.
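To put a number on the five sigma rule mentioned above, here is a minimal sketch that is not from the episode. It assumes the usual one-sided Gaussian convention for quoting significance, and the function names are made up for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def one_sided_p_value(n_sigma: float) -> float:
    """Chance that pure noise fluctuates upward by at least n_sigma."""
    # By symmetry of the normal distribution, P(Z > n) equals P(Z < -n).
    return STANDARD_NORMAL.cdf(-n_sigma)


def n_sigma_from_p_value(p_value: float) -> float:
    """Inverse mapping: the number of sigma a one-sided p-value corresponds to."""
    return -STANDARD_NORMAL.inv_cdf(p_value)


# Compare a common "evidence" threshold (3 sigma) with the five sigma rule.
for n in (3.0, 5.0):
    print(f"{n:.0f} sigma -> one-sided p-value {one_sided_p_value(n):.3e}")
```

Under that convention, five sigma means a background-only fluctuation this large would be expected roughly once in 3.5 million tries, which is why a claim at that level is treated as a discovery.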
Harpreet Sahota:
So fascinating, and so awesome.
You can really hear your passion for
the subject as you're describing everything.
I'd really love to delve a
little bit deeper into this.
It's pretty interesting, right?
Because typically when we're working on, let's say,
a fraud detection type of
problem, we have a situation where we're trying
to upsample the class that's not as
frequent, but you're in a situation where you have
to kind of downsample the noise in a
sense. So it's kind of
like the reverse problem.
Is that what you mean by data reduction?
Like, can you talk to us
about what data reduction is?
How important is it in the work you're doing?
Can you touch on some
bottlenecks that you're facing?
Santona Tuli:
Yeah, you're absolutely right.
It is sort of the
reverse, in some sense, of that problem.
We are not really able to amplify the signal,
just because a lot
of the time the item that we want to
report is how much of a particle was created.
And if we upsample, you know, the production
of the particle, that sort of brings
it into question. Like, you know, did we actually
see this, or why are we sort of basing
our ML models on this upsampling?
So, yes, you're absolutely right.
What we instead have to do
is prune the noise away.
A lot of the time, some of the techniques
that have worked well, at least with my
analysis, are looking at the
feature space in its entirety and
placing various selection cuts and
really seeing what effects they have.
Moving around the selection cuts and seeing what it
does to the signal class and
the noise class, and sort
of optimizing those selection cuts.
And, you know, the other aspect
of that is rules-based clustering, which is also
something that I use a lot.
Just because I know something about
how this particle should behave
and, again, what my detector is like, I'm able
to set certain rules about, OK, if you see
this, this, this, this, these criteria being checked
off, then, you know, you can probably
classify it as this Upsilon
particle or this other particle.
So those are some of the techniques.
And yes, the data is big.
So in this last analysis
that I've been leading, we're just about to
put the results out.
In the end, I was able
to extract about 5000 Upsilon particles.
But the data that I started
with was tens of terabytes.
Right. So really, we've identified the
events where this particle is produced.
And then from the thousands of particles created
in that event, I have to identify that
the Upsilon was created.
And so it's
really, really interesting work.
And again, we try to use
really simple techniques whenever we can.
And really what I mean
when I say simple is explainable.
Does this make sense?
Is this something I can defend
and explain to someone else?
And will they be convinced that,
yes, this choice makes sense?
So that's been the secret to dealing
with various data bottlenecks and data reduction
problems.
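The "if these criteria are checked off, classify it as an Upsilon" idea described above can be sketched as a small checklist. This is an illustration only, not the analysis code discussed in the episode: it assumes the candidate is built from a pair of reconstructed muons, it neglects the muon mass, and the thresholds MIN_PT and MASS_WINDOW are hypothetical values picked for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Muon:
    pt: float     # transverse momentum in GeV
    eta: float    # pseudorapidity
    phi: float    # azimuthal angle in radians
    charge: int   # +1 or -1


def dimuon_mass(a: Muon, b: Muon) -> float:
    """Invariant mass of a muon pair in GeV, neglecting the muon mass."""
    m_squared = 2.0 * a.pt * b.pt * (
        math.cosh(a.eta - b.eta) - math.cos(a.phi - b.phi)
    )
    return math.sqrt(max(m_squared, 0.0))


# Hypothetical thresholds, chosen only for illustration.
MIN_PT = 3.5                # GeV
MASS_WINDOW = (9.0, 10.0)   # GeV, brackets the Upsilon(1S) mass of about 9.46 GeV


def rule_checks(a: Muon, b: Muon) -> dict:
    """Evaluate each rule separately so every decision stays explainable."""
    mass = dimuon_mass(a, b)
    return {
        "opposite_charge": a.charge * b.charge < 0,
        "both_above_min_pt": min(a.pt, b.pt) >= MIN_PT,
        "mass_in_window": MASS_WINDOW[0] <= mass <= MASS_WINDOW[1],
    }


def is_upsilon_candidate(a: Muon, b: Muon) -> bool:
    """A pair is kept only if every checkbox is ticked."""
    return all(rule_checks(a, b).values())
```

Returning the individual checks rather than a single yes or no keeps the "can I defend and explain this choice" property: for any rejected pair you can see which rule failed.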
Harpreet Sahota:
Isn't that beautiful? Even when trying to
explain what makes up the universe, parsimonious
models are the best.
Questions based on what you're saying
here, just kind of for definitions.
What do you mean by selection cut?
And can you also kind of maybe, you
know, namedrop a couple of these rule-based clustering
algorithms that you're using so that you know, our
audience can go look this stuff up on
their own time.
Santona Tuli:
Sure. So our data is event data because
we're able to take a picture
of a single collision event, which is really
cool. And the other descriptor that
we use is sensor data because, yes,
we are getting these readings
out from the sensors.
So using what my sensors are reading
out per event, I can reconstruct.
So it's sort of like back propagating.
I can reconstruct what was happening
in the collisions to some extent and with
some degree of confidence.
Right. So let's say that I saw
an electron in these few sensors.
When I combine that information
together, it tells me that this electron
was traveling in such and such a direction,
let's say, you know, upwards, north and what
not. So that's one point of information.
And then from a different set of sensors, I'll
be able to say, yes, that was an electron
because it was a negatively charged particle.
And I think it had
this momentum or this mass.
So all of these things. So a lot of
cool stuff goes on really at that hardware-software
interaction level, really reconstructing back
to what happened in these collisions,
which to me is always really interesting
because we're trying to look back and answer
questions. Right. A lot of physics, like
cosmology or early universe physics, is, OK,
what happened?
You know, how did the Big Bang happen?
What happened after that?
And so on and so forth.
So this looking back is sort
of very, very ingrained into
the way physicists think.
So I really think it's super cool that we're able
to have this data and be able to really
answer the questions of what was
going on that created this data.
And a lot of data
science problems are the same.
Right. I mean, we try to
build predictive models in certain
spaces. And that's really useful as well.
But a lot of the time, the way we achieve
that is by looking at the data we have today
and thinking back or looking back and discovering
what caused that data to exist in that
way. And based on that, we do the
predictive, you know, the algorithms to figure out
what's going to happen next based
on the data we're getting today.
So this is
something I wanted to mention.
I think it's really cool. And then when
I look into this feature space,
right,
once I've processed the sensor data,
what I have is reconstructed particles and
various attributes that these particles had.
Again, the direction in which it
was going, the charge it carried, its mass,
its momentum. All of
these different attributes.
So that's my feature space, essentially.
I mean, there's a lot of other metadata as
well, like, you know, what was the energy in this
collision?
How head on was the collision?
So if you imagine
just throwing two tennis
balls at each other. Right.
They can either be completely head on and jump off
or they can scrape and so on and so
forth. So that becomes an interesting
aspect of this problem as well.
So I have all this data.
My goal is to sift through
these data and find my Upsilon particle.
So it's like finding a needle in a haystack
in terms of how much data I have
available and how rare the signal is. What I do a
lot of the time in terms of the selection
cuts is I will look at these various
attributes of different particles and I will see
where I can draw my analysis box.
I don't want to swim in
this massive amount of data and just, you know,
move around in a random walk
until I hit an Upsilon.
What I want to do is systematically,
strategically figure out what my most effective
analysis box is going to be, where my Upsilons
are hiding, in which area of the feature
space. So I'll put selection cuts on, let's say
I have a muon, and a muon is a heavier electron,
essentially, that I've reconstructed and I'm able
to pair it with another
muon and say that, OK, these two
muons probably came from an Upsilon.
So what I'll do is
I'll play around with the momentum range,
you know, the mass range or something like that.
So there are ways in which
I can classify my Upsilon,
right? The rules-based clustering that
I talked about a little bit.
So basically I'll say if these checkboxes
are checked, then it's an Upsilon.
So I will hold those attributes constant
and then pick one attribute that I'm going
to play around with, that I'm going
to sort of change the value of.
The cuts that I'm going to place in order
to define this box, I'm going to move.
And it's like a knob.
It's like turning a knob. I'm going
to move it around and see,
well, statistically speaking, how does turning
this knob affect whether I'm identifying
this particle or this event as having produced
an Upsilon, as opposed to just other things
that might look like an Upsilon in
certain ways or just be noise.
So that's what I really
mean when I say selection cuts.
It's fine-tuning the whole space of my
analysis so that I'm not just randomly looking.
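The knob-turning described above, holding the other rules fixed and scanning one cut, might look something like the toy below. Everything in it is an assumption made for illustration: the signal and background samples are randomly generated stand-ins (a real analysis has no labels and would lean on simulation or fits instead), the feature is a muon transverse momentum, and S / sqrt(S + B) is one commonly used figure of merit rather than necessarily the one used in the work discussed here.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the toy is reproducible

# Toy samples of a single feature (muon transverse momentum, in GeV).
# Signal is assumed to sit at higher pT than the steeply falling background.
signal_pt = [rng.gauss(5.0, 1.0) for _ in range(500)]
background_pt = [rng.expovariate(1.0 / 1.5) for _ in range(20000)]


def figure_of_merit(cut: float) -> float:
    """S / sqrt(S + B) for the events that survive `pt > cut`."""
    s = sum(pt > cut for pt in signal_pt)
    b = sum(pt > cut for pt in background_pt)
    return s / math.sqrt(s + b) if s + b > 0 else 0.0


# Turn the knob: scan the cut value while everything else stays fixed.
cuts = [0.5 * i for i in range(17)]  # 0.0, 0.5, ..., 8.0 GeV
scan = {cut: figure_of_merit(cut) for cut in cuts}
best_cut = max(scan, key=scan.get)

for cut, fom in scan.items():
    marker = "  <- best" if cut == best_cut else ""
    print(f"pt > {cut:3.1f} GeV: S/sqrt(S+B) = {fom:6.2f}{marker}")
```

Scanning one attribute at a time keeps each choice easy to explain: the table shows directly what tightening or loosening the cut does to signal relative to noise.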
Harpreet Sahota:
That's really, really interesting.
And just like you, I like feature engineering.
It's my favorite part of the process. I was
wondering if you can maybe share some tips with
our audience so that we can be
more thoughtful in our feature engineering.
And maybe if you're able to, provide us
with an example of how you're doing feature
engineering in your work.
Santona Tuli:
So the way I approach it is that I know
very little and I want to learn from the data.
Of course, there are assumptions that
I do go in with.
So if it's a particle physics problem,
I do bring in my domain knowledge.
If it's an NLP problem, then I
do know something about how languages interact.
But when I look at a particular dataset, I try
to learn as much as I can from the data,
first and foremost, even before I start thinking
about, you know, what my features are
going to be and what my
ML algorithm is going to be.
So for me, this includes slicing and
dicing the data, looking at lots of plots
of different variables against each other, looking at
their correlations and so on and so
forth, even studying why certain features might
have null values and why others don't,
and how to sort of interact with that.
So that's sort of where I begin.
And then as I learn more from the data,
I'm able to say, OK, this is a feature that
makes sense for my ultimate goal.
Right. So the other aspect of it is
I should always have in mind from the
beginning what I want at the end.
So that's also part of the holistic
sort of decision process: not
trying to manipulate the data in order to get
a result that I want, but knowing what I
want to test, knowing what a positive result
would be and what a negative result would
be and what a null result would be.
And that's when I start thinking about
the features in the context of that.
Like, really, which of these features or
what combination of these features is really
going to add value to that fundamental question that
I'm going to answer at the end?
And just at a high level, these are the
ways in which I approach feature engineering and
why I enjoy it so much, because I
feel like it's where you get to be the most
creative. You get to exercise those muscles
the most. Because with ML
algorithms, yes, there are pros
and cons to different ones.
And, you know, that's also fun,
sort of figuring out which is the best
fit and so on and so forth.
But the math is sort of fixed, right?
You get to, again, play with
knobs, but you don't really rewrite any fundamental
ML algorithm when you apply it.
You're just, you know, fit, predict.
But so the really creative part of the process
to me is figuring out what features are
going to help me. So that's how
I tend to think of that.
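As a rough illustration of the first-look steps described above (checking which features have null values and how variables correlate before choosing any features), here is a small sketch in plain Python. The records, the feature names pt, energy and n_hits, and the helper functions are all hypothetical; a real workflow would more likely use a dataframe library and plots.

```python
import math

# Hypothetical reconstructed-particle records; None marks a missing value.
rows = [
    {"pt": 1.0, "energy": 2.0, "n_hits": 10},
    {"pt": 2.0, "energy": 4.0, "n_hits": None},
    {"pt": 3.0, "energy": 6.0, "n_hits": 12},
    {"pt": 4.0, "energy": 8.0, "n_hits": 9},
    {"pt": None, "energy": 10.0, "n_hits": 11},
]


def null_counts(records):
    """Count missing values per feature, a prompt to ask why they are missing."""
    features = records[0].keys()
    return {f: sum(r[f] is None for r in records) for f in features}


def complete_pairs(records, a, b):
    """Keep only the records where both features were actually measured."""
    return [(r[a], r[b]) for r in records if r[a] is not None and r[b] is not None]


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom > 0 else float("nan")


print("nulls per feature:", null_counts(rows))
print("corr(pt, energy):", pearson(complete_pairs(rows, "pt", "energy")))
print("corr(pt, n_hits):", pearson(complete_pairs(rows, "pt", "n_hits")))
```

A pair of features that turns out to be almost perfectly correlated, as pt and energy are in this made-up data, is a hint that one of them adds little on its own and that a combination might be the more useful feature.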
Harpreet Sahota:
I absolutely agree with you. Like feature engineering,
I think is only really limited by
your own creativity.
How do you view data science?
Do you view it as an art or a science?
Santona Tuli:
Both. Definitely both.
So the science versus arts divide often
boils down to procedural versus creative
processes. So, like, science is sort of regarded
as, you know, you take step one, then you
do step two, and so on
and so forth. It's very procedural.
Whereas with art you can be, you know,
sort of more creative and randomly swing your
paintbrush. But of course, there is plenty
of creativity in science and plenty of
procedure in art as well.
So it's not really a fair divide.
Right, because you can be as creative as
you want with science, and even in
art, if you're trying to paint
a picture, let's say you're drawing a
portrait, there are rules about how
you go about doing that.
You can't just start, you know, arbitrarily.
So, yeah, for me, it's very, very
much a mix of both art and science.
I mean, it's called Data science because
there are certain scientific techniques that we
often use. It borrows heavily from decision
science, from statistics, which is math and
science. So there is definitely
that aspect of it.
But the thing that we just talked about,
right, with feature engineering and being able
to imagine how the Data is going to
tell a story, not strangling it, not, you know,
wrangling it into telling a story that
you have preplanned in your head, but really
showing the creativity in bringing different pieces
of the data together to tell its
story. I think that's
a very creative process.
Harpreet Sahota:
How does the creative process come
to life in Data science?
Santona Tuli:
So for me, the creative process
is thinking outside of predefined paths.
So being able to step back from approaches that
are known to work and come up with
approaches that haven't necessarily been tried
before, but that could work.
And then, you know, it would be
so cool if they did work, right?
This process that you just thought of,
this approach you just thought of.
So, you know, trying that like trying to
apply that to your specific Data science project
and just seeing it through.
Whether maybe it doesn't work out quite
the way you thought it might.
But just being able to step outside
and think of alternative approaches, stepping outside
the predefined paths. To me, that's how the
creative part of my brain is really engaged
when I'm doing Data science.
And the other aspect of that
is in science in general.
We do a lot of jumping around,
thinking about different approaches and
trying to pull them together.
Right. So this is a little bit redundant.
It's just emphasizing the
same point that I made earlier.
But we learn, we're trained since we're
little, on the history of how knowledge has
evolved and how people have thought about things
and why certain ideas were good and why
certain ideas failed.
Right. When you're problem
solving, it's about you.
You're the one who is
thinking through this problem.
You have these guidelines based on what you've
learned and what you've seen other people
do. But in the end, the canvas is yours.
So you get to have as much of
a creative impact on this particular project as you
want.
Harpreet Sahota:
What's up, artists?
Be sure to join the free, open
Mastermind Slack community by going to
bitly.com/artistsofdatascience. It's a great environment for
us to talk all things Data
science, to learn together, to grow together.
And I'll also keep you updated on the open
biweekly office hours I'll be hosting for our
community. Check out the show
on Instagram at @TheArtistsOfData Science.
Follow us on Twitter at @ArtistsOfData.
Look forward to seeing you all there.
You're featured in an IMAX movie, the first
movie star I've had on my show.
Tell us about this movie
that you're a part of.
Santona Tuli:
Such an amazing experience.
So the vision of the film was to
depict scientists doing science, real scientists doing
real science. So my PhD advisor was asked to
be on the advisory board of this film, funded,
among other sources, by
the National Science Foundation.
And at this point many, many others,
so CERN, the LIGO experiment, the Perimeter
Institute, UC Davis and various other
educational institutions, sort of rallied together
to fund it. One of the advisors was my P.I.
And as the team started to develop the
narrative, the creators of the film got
increasingly excited about the idea
of having the atom,
so the fundamental particle, the atom,
be the protagonist, which is a really
interesting approach. You don't have
a person as a protagonist.
You really want to study
the journey of this atom.
And so that's where it began.
So once they figured out exactly
what our research group works on, they were
like, yes, this is
what we want to follow.
So it's me and a couple of my team members
and then a couple of the other people from my
research group as well. We all spend time
at CERN while the data is being collected.
It's very much a team effort.
Very collaborative.
We're all there, you know,
staying up late together.
I mean, not year-round, but when
our collisions are happening, it's a
very hands-on and, you know, all
hands on deck kind of process.
So they were like, that's perfect.
We want to be there and just film you
as you're doing this work. When making a film,
you have to make some adjustments, like they put
these lights sort of on top of a
computer so that our faces would be
lit up and all of that.
But at the end of the day, they
were just filming us as we worked.
So all of the excitement that showed up on
our faces, you know, the curiosity, all of that
is authentic and real.
And that was extremely rewarding for
us to be a part of.
And then they interviewed us.
So we did the whole green screen kind of talking
about our journeys and how we made it to
CERN and, you know, what
we're really hoping to answer.
That was one aspect of it.
And then at the end of the Data taking
period when we were shutting down the experiment,
we always go to a bar and
just, you know, hang out and celebrate.
So they wanted to follow us there as well.
So there are some shots where we're
just like, yeah, this went really well.
And we're just drinking and saying that.
And so it was very cool.
And the whole objective, as I look
back on it now, picking the atom as the
protagonist, following us as we collected data at
CERN, all of that really feeds into the
eventual goal of this film and this project, which
is the audience should be able to look
up at this screen and see themselves reflected
in it, being able to understand the
physics that's going on. It's not about
going into the depths of the physics.
Of course, we don't do that
in this film.
But just to understand that physics
is very much within their reach.
Science is very much within their reach.
I hope that the audience will
see me and think, oh, yeah.
I mean, there's nothing
special about that girl.
She's just, you know, has this background
and is doing this really cool stuff,
so I can too.
Harpreet Sahota:
That IMAX movie is called
Secrets of the Universe.
And when does that get
released, or is it already released?
Santona Tuli:
It's called Secrets of the Universe.
So it has premiered at
the Smithsonian in D.C.
We all got to go
to that red carpet screening.
And that happened last summer.
And since then, we were actually supposed to
have another red carpet screening here in
California in April, but it
got canceled because of COVID.
But very soon, I mean, when things get
back to normal, we'll have that second screening.
Harpreet Sahota:
I'm looking forward to a chance
to watch it out here.
You spoke about this a bit earlier,
about the need for interpretable models.
I saw a really well-written post from
you on LinkedIn around interpretable and
explainable machine learning and how they're different,
which might speak to that point.
Santona Tuli:
Yes, this is something I've been
thinking about a lot recently.
So to me, the distinction is: an
explainable machine learning model can be explained
before the fact, and an interpretable machine learning
model can be interpreted after the
fact. So let me dive a
little bit deeper into that.
When I am building a model to extract
some particle physics phenomenon or some nuclear
physics phenomenon, because of things we've already
talked about, it's very important that
I can explain my choice.
We have so many levels of reviews within
our collaboration because we really don't want
to make any claims that we can't take back.
So, you know, by contrast, in industry,
of course, it's more sort of
driven a lot of the time by the
bottom line, like the actual revenue.
Right. So if you
do something wrong, it's terrible.
You might have lost, you know, millions or
billions of dollars or whatever, but you can
sort of learn from that and really
change your approach and hopefully make up
for that. In our field, I
mean, it's, of course, still true.
We're learning about something.
It's not like
we can't ever say,
oh, yeah, we were wrong.
We scientists go back all the time.
But even with that being true, we really
have a lot of imperatives in place to make
sure that we really understand
the things that we're claiming.
So that's why the explainable part
is really important to us.
So even before I fit my model to Data, I
have a good sense of what the different
parameters are going to represent, why I'm
setting certain hyperparameters to certain
values and so on and so forth.
The other end of it is I apply my machine
learning model to Data and I get some results.
Maybe it's a classification.
I have my records classified and then I can
also look at the importance of the various
features. Right. So which features played a
part in this decision, which were most
important and which were least important.
And being able to look at that
and understand why certain features ended up being
important in this decision, even if when we
started we didn't really know the why,
being able later to interpret
that and understand that, this to me is
interpretable machine learning.
That's the way that I see it now.
Of course, they're not exclusive.
These two things are not exclusive.
And also,
one doesn't imply the other.
So just because starting out I have an explainable
model doesn't mean that at the end I'll
be able to interpret my results necessarily.
Similarly the other way. Oftentimes when
I talk about explainable machine learning,
people will sort of think that I'm saying
interpretable and think that I'm making a
statement about why traditional machine learning
is better than deep learning.
But that's not the point at all.
Deep learning can be highly interpretable.
You may not be able to explain exactly
why or exactly how
some deep learning algorithm is going to
work ahead of time based on your data.
But afterwards you may be able to, you
know, interpret the heck out of it.
So they're really not the same thing.
They don't imply each other and they
can simultaneously be true or untrue.
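
The "interpret after the fact" idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the physics analysis: the synthetic data, the nearest-centroid classifier, and the use of permutation importance are all choices made for this sketch, and every name in it (make_data, fit_centroids, permutation_importance, "signal", "noise") is hypothetical.

```python
import random


def make_data(n=400, seed=0):
    """Synthetic records: 'signal' depends on the label, 'noise' does not."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        signal = rng.gauss(2.0 * label, 1.0)  # informative feature
        noise = rng.gauss(0.0, 1.0)           # uninformative feature
        rows.append([signal, noise])
        labels.append(label)
    return rows, labels


def fit_centroids(rows, labels):
    """'Fit': store the per-class mean of every feature."""
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """'Predict': pick the class whose centroid is closest to the row."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda cls: squared_distance(centroids[cls]))


def accuracy(centroids, rows, labels):
    hits = sum(predict(centroids, r) == y for r, y in zip(rows, labels))
    return hits / len(labels)


def permutation_importance(centroids, rows, labels, feature, seed=1):
    """Drop in accuracy when one feature column is shuffled across records."""
    rng = random.Random(seed)
    shuffled = [r[feature] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:feature] + [v] + r[feature + 1:]
                for r, v in zip(rows, shuffled)]
    return accuracy(centroids, rows, labels) - accuracy(centroids, permuted, labels)


rows, labels = make_data()
model = fit_centroids(rows, labels)
importances = {
    name: permutation_importance(model, rows, labels, index)
    for index, name in enumerate(["signal", "noise"])
}
print(importances)
```

Shuffling a feature the classifier relies on should hurt accuracy far more than shuffling one it ignores, so the "signal" entry should come out well above the "noise" entry. That ranking is the after-the-fact reading; being able to say in advance why the features and settings were chosen is the explainable side.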
Harpreet Sahota:
And another thing you've been
thinking about is decision science.
Can you share your thoughts around that? First,
can you maybe help us understand the
distinction between decision science and data science,
and what you have been thinking about?
Santona Tuli:
To me, data science is a part of decision
science, but decision science is in some
sense bigger; it's more of
an umbrella, and Data science fits underneath that
umbrella. One of the ways we can
distinguish those two is quantitative and qualitative
aspects. So actually it's not a distinction.
It's more that data science tends to be
more quantitative by definition, by nature, but
decision science comprises both that quantitative
part and a qualitative approach where
you're making value judgments.
As a decision maker, you are
thinking through things and not just relying on
data. You have to be able
to interpret what the data is telling you.
And based on that, you make decisions. In
some sense, calling it qualitative isn't even
that fair, because there's so much science
that goes into making good decisions.
Right. So, again, these are all very fluid.
Science and art are so fluid.
So as a decision scientist, you have to
check so many different sources of bias.
And that's why I've been reading up and
learning a lot about decision science
recently, because, again, feeding into what
we started this conversation with, mitigating
our biases, being able to make sure that
our decisions and our processes have a positive
impact going forward.
It's a picture that you can only paint once
you have all of these pieces in place.
And when you're making a decision, the part from Data
science that you tend to rely on the most
is sort of the statistical significance.
So you have done some analysis on Data
and you're claiming to have seen some phenomenon
with a certain confidence level.
And that's very important to
the decision scientist, right?
Not only what you're claiming to have seen,
but also the confidence you have in this
claim. And that's going to factor
into the decision scientist's outlook on this
particular decision.
And only when they incorporate that in with
their sort of decision framework, having
thought through all the different biases, like,
you know, confirmation bias, et cetera,
that they could have, only once they
have sort of put everything together, are they
able to make good decisions.
So what we can do as Data scientists, even
if we're not making, you know, big decisions,
even if we're not the ultimate decision maker on
a certain project, what we can be doing,
how we can be getting better at Data
science, is incorporating some decision science
into our work. The way we can do that
is by thinking about the whole Data science project
end to end. When we start,
we have to think about the end goal and we
have to put ourselves
in the decision maker's shoes and
test all of our assumptions.
Does it make sense that I'm assuming this about the
Data when this is the thing I want to
decide using the Data? Like, am I
committing a fallacy in making this assumption?
Am I twisting my results
at the very beginning?
Because it
can't just be about numbers.
I can't just, you know, give someone,
OK, this is what I'm
seeing, X percent, Y percent,
and they go and make the decision. I have
to be a conscientious person as well
and think about what that number means.
So thinking through that process is, to
me, the best way
the marriage between Data and
decision science can happen.
And the other aspect of this is that if
these two roles are
completely separate, Data science and decision
science, there is no back and
forth and there is no collaboration.
There's no iteration of the process.
So you might be working with a
decision maker consistently year after year.
But if you don't have a back and forth,
you'll be making the same mistakes you've been
making. Right.
So the data scientist has to have a way
to speak to the decision, and the decision
scientist has to be able to sort of look at the
data to inform their decision, not just the final
result of the data, but the
data in its entirety.
So that marriage, I
think, is extremely important.
Both sides should be able to talk to
each other, collaborate with each other and iterate
on the process to get to
better processes and better understanding from Data.
Harpreet Sahota:
Switch gears here now and try to pick your
brain on another couple of things for the
people out there who are trying to break into
Data science who maybe feel like they
don't belong, or they don't know enough, or
they don't think they're smart enough or
they're just intimidated by everything
you have to learn.
Do you have any words
of encouragement for those people?
Santona Tuli:
Yeah, the same thing I say to anyone who
is trying to break into anything new and lacks
confidence, which is that I genuinely believe that
it's not a capability thing, that it's
never a capability thing.
If you wanted to do Data science or science
or tech or coding or, you know, language,
learn a new language, you absolutely can.
There are some skills that you'll need to
pick up, but we're used to that.
We have been learning new
skills every day since birth.
And beyond that, what else would you
need, especially to break into Data science?
You need to meet other Data scientists, to observe
how they think, what they do, how they
interact with each other.
And to me, that's the real skill set.
Earlier this year, I started thinking about
product management in addition to Data
science, because I like sort of taking holistic
approaches to things and really sort of
strategizing from the beginning. You know,
I interacted with some product managers.
I tried to pick their brains, get
coffee with them and really understand what
this field was.
And one thing I kept hearing over and
over is think like a product manager.
But what does that mean?
Now, having spent more time sort of thinking
about it and learning from product managers,
I understand what that means.
And I have the same advice for Data
scientists: think like a data scientist.
And that's not just a wise-man saying
that doesn't really have meaning.
It absolutely has meaning.
The way you can achieve this is by
meeting data scientists, observing them, learning from
them, thinking how they think,
watching how they speak.
The reason I say that is, the speaking
may not be an important aspect of being a
data scientist, but it definitely gives you insight
into what's going on in their head,
like the content they put out there on
LinkedIn, for example, or various other sources.
I think that's one of the best things you
can do to pick up the more
nebulous skills around Data science, beyond the
harder skills, so to speak.
Harpreet Sahota:
What does it mean to
think like a product manager?
I think a lot of our audience
would love to hear your take on that.
Santona Tuli:
So I took this sort of short program.
It's called She Aspired.
So I guess I'm giving them a shout
out now because I have the platform.
But so I was
part of the first cohort.
And it was a number of people who wanted
to either break into product management or just
learn about product management and what we did
in this program is we approached sort of
the transition to product management
as a product itself.
How this helped is,
it's hard when you talk about things in
a nebulous way. Like, someone might advise
you, OK, you want to be a
product manager at X, Y, Z company?
Go pick a product that they actually have out
there and then see how you can improve it
and maybe write up something on that and then,
you know, try to sell yourself in that
way, which makes a lot of sense.
But it's still difficult if you
don't know what product management is.
Right. You can't just go and pick up
a product and, you know, tear it apart.
But by having a more tangible product that
you're working on, AKA your career or your
transition, or you can think of your breaking
into Data science as your product, then
you dismantle that.
Right. You can sort of really take it apart
and think how different parts of it play with
each other. Like, we were just talking about
the skills that you need to break into
Data science, right, to get
that data scientist job.
So being able to separate or distinguish what the
end goal is and the steps that you need
to take in order to get there.
So making the roadmap, which is a
common term used in product management, setting
the key objectives and then sort of having a
timeline for those, or having these
checkmarks about, like, when do I say that, OK,
this aspect of this project or
product is done, and then I can
move on to the next part.
So to me, product management is
very dependent on how you can
break apart a bigger project, a bigger
idea, into smaller bite-sized chunks.
And then you can strategize whether you can
complete them in series or in parallel.
And, you know, you have
to pool your resources cleverly.
Let's say I'm working on something now
and I have this resource that helps me do
it. And then tomorrow I'll work on something
else and it might require the same resource.
If I had noticed this before, I might be able
to pull that resource once and work on
these two tasks, one and
two, at the same time.
And that would save me a lot
of time and make it more efficient.
So thinking like a product manager or
being a product manager is really about
strategizing and figuring out the most efficient
ways of approaching a problem or a
product and being able to also have the
flexibility to sort of iterate, learn from what
you're doing and keep applying it.
And I truly believe that that applies
to so much in life as well.
Harpreet Sahota:
Very well put. And I think we can all
end up being product managers after that really
great explanation. I like what you said:
observing data scientists, not necessarily downloading
their brain, but absorbing the way
they think about things, how they
tackle particular projects, particular problems,
the vocabulary they use.
So a lot of it is just developing the
right mental models for yourself so that you can
apply those mental models
in the right scenarios.
I think, you know, a lot of scientists are
out there working on projects and they might
feel a bit of fear or
hesitation trying to make the project perfect.
Right. They don't release it until it's
perfect, whether that's a professional or personal
project that they're working on.
Do you have any tips for anyone who
is in that type of mindset?
Santona Tuli:
Try to get over it.
I used to be a perfectionist
when I was in college.
There are still some aspects
of that that I carry.
I can really get hooked on one thing
and sort of spend time on it sometimes.
But I've also been actively
trying to not do that.
So get over this idea that it has to be
perfect before, you know, I push it out or send
it or whatnot.
So I would just try to give the same advice
that I give myself to anyone else out there
who is struggling with this.
Just put it out.
What's the worst that can happen?
Maybe someone criticizes it in some way.
You know, I'd never be heartbroken about
purely negative criticism, you know,
like some criticism that
isn't meant to help you.
But it might turn out that this criticism
that you're receiving on it is actually going
to help you iterate on that
project and make it better.
And you would have never gotten that feedback
if you didn't actually put it out there.
So, yeah, just put it out.
Ask the people that you'd feel
most embarrassed about if they
thought that it wasn't good enough.
Like, you have it in your head, right?
Like, oh, if this person sees it, then
they're going to think that it's not good
enough, and I'm embarrassed. So just go and ask
them, you know, build a rapport, talk to
them. People love being asked to do things
in the sense of helping you out, to
like, look over your projects,
look over your resumes.
And there's so many kind people out there
that are genuinely interested in helping in
that way. You know, build that relationship and,
you know what, they'll take a look
at it. Absolutely.
They're not going to be mean about it.
They're going to give you that feedback.
And all of a sudden, you know exactly what you
need to do to make exactly that person
or that kind of person think
that the product,
or project, is good.
So it's very much, the more you step
out of your comfort zone, the bigger your comfort
zone will get. And the more feedback you'll get
on your work and you can keep iterating
on it.
Harpreet Sahota:
So as someone who's a recovering perfectionist,
sometimes when we're working on a project
and we're getting some feedback, we're getting some
criticism, we might feel down or feel
like a failure. We might want to give up.
So what can we do to kind of
get over that, get through that feeling?
Santona Tuli:
For me, it's a couple of different things.
One is just being able to step away, take a
break, take a breather, and that can be long
or short. But that really helps me refocus
and it puts things into
perspective. You know, the thing that you're sort
of sweating over, you know, staying up
nights and working on, may
actually not be that important.
So it sort of helps put things in focus
and into perspective.
And then the other aspect is just keeping at
it, which might seem like it's in tension with the
first thing. When you're working on something
and you're not fully confident about it,
rather than just getting disheartened,
give yourself some slack, figure out what
it is about this that is giving you that
anxiety or that stress.
What's causing it? And, you know, face it head on
and really dig at it until you are able
to beat it, so to speak.
You can always get over any hurdle.
You know, it has to be a
worthwhile hurdle,
but once it is, you can definitely do it.
Having that confidence in yourself and
in your process is very important.
Harpreet Sahota:
Because the most uncomfortable part about that would
be just sitting there with the fear,
digging through trying to understand.
What is causing it?
On the other side of that, that's where
the most growth happens, right where you're
sitting in that uncomfortable phase.
You talked a lot about, you know, the technical
skills that are needed to be a data
scientist. What are some soft skills that
you think Data scientists are missing?
Santona Tuli:
I'll use the word soft skills
as well, because that's often used.
But I do want to just quickly say that
I think that these are all skills, as
important as any of
the quote unquote hard skills.
And if it were up to me,
I would just rephrase it all as skills.
But I know what you mean.
So the top ones that
come to mind are communication.
So I think being able to communicate clearly. It
doesn't mean it has to be in perfect
English or in perfect sentences or that you have
to express all of what's in your head in
one go. But it does mean that you
are comfortable talking about what you're doing with
others. Communication is a two-way street.
It's not just about talking at someone.
It's about giving space,
listening and taking feedback.
I think that really, really helps.
It helps with a lot of
processes, but especially with Data science.
I think when you think through things, like, you
know, we have the rubber duck idea of,
you know, talking at the mirror
and so on and so forth.
It is when you talk out loud, speak
with someone about something that you're working on,
that you get the smartest ideas, brightest
ideas, or you realize why some approach
you're trying isn't great. So that's
a huge one for me.
That relates to the second one directly, which
is presentation skills. I think they're very
important. So, yes, gradually honing that skill
of expressing yourself and what you're
working on, getting some buy-in.
One way of putting it is,
when you're sold on your idea, learning the skill
set to sell others on it through presenting.
And that has hard and soft
aspects to it as well.
Your actual data visualization, your actual slide deck,
et cetera, is a part of it.
And also how you present yourself and how
you speak about what you've been working on.
Those two things are extremely important.
Let's see, soft.
Yeah, I mean, I feel a little bit
like it's emphasizing the same point again, but
triaging is another really important
one, rallying people,
getting people to buy into your ideas so that
you can maybe push it to someone, you know,
at the skip level or the
C suite level or something.
So like being able to have that engagement
and being able to triage people into basically
sharing your view with others such that they can
actually see it from your point of view
is an extremely important skill.
Harpreet Sahota:
Yeah, definitely influencing type of skills.
I guess another way to put that.
I totally agree with you. Soft skills,
I don't really like that name either.
I think they're really the hardest skills because
these are skills that nobody
can teach you.
You have to learn them yourself,
through experience, through
experimentation and, you know, through self-reflection
and trying things, seeing what works and
what doesn't work and,
again, being uncomfortable.
I was wondering if you could speak to your experience
being a woman in STEM and if you have
any advice or words of encouragement for the
women in our audience who are breaking into
tech or currently in tech.
Santona Tuli:
The first thing I would say
is that I feel you.
I know that it's hard.
It's always hard to be the only person in
a room that is different in some way.
But if you don't push through that, then you
will always be the one person in the room.
And it's only when you're able to
overcome those insecurities and make an environment
that's more welcoming that you're gonna get
that second person in the room and then
the third person and so on and so forth.
So, yeah, I mean, I think there is a
general recognition that diversity of all sorts is
important and good and ultimately
helps everyone's bottom line.
There's still some.
Of course there is.
There's a lot of pushback as well.
And, you know, we have systems that work in
certain ways, but find yourself a network of
people who believe in diversity and in
encouraging and supporting people of various kinds
of minorities, be it gender
or race or whatnot.
And that network, that support system, if
you're able to build that up, really
helps. Even if in that room you don't have
that other woman in tech, if you're connected
to women in tech through other organizations
or elsewhere, then you can reinforce your
beliefs and they will
support you through that.
So I'm part of Women
in Machine Learning and Data Science.
It's a nonprofit organization.
We have a strong network of women.
Same thing with this product
management program that we did.
We now have that cohort, that connection, a
community of women who are there to support each
other and to help overcome these barriers
that we see across the board.
I think that that's a big one.
And I've done that for myself.
Finding that community, those communities,
it's not just about one.
It's sad that we have to resort to this.
But that's that's definitely an aspect of it.
Just a different way of looking at it.
I grew up in a family with lots
of encouragement and no distinction ever
based on gender. So I never walked into.
So I went on to major in physics, right? I
mean, first of all, it was a small
physics cohort. Not too many
people go into physics.
But there are even fewer women, of
course. In grad school, in my graduating
class, I think there were really seven women out
of thirty-five or something.
So those numbers are staggering and they
will exist, you know, further into the
future. They're not vanishing tomorrow.
But going in, knowing that you're equal to
everyone else, you're just as good, if not
better. I mean, there are definitely things that
can set you apart from
other people. Maybe, you
know, you're new, too.
Of course, you can be a woman, let's say,
and the smartest one in a physics graduate
program. Like, that is a very true reality.
Sometimes, oftentimes.
And so just being able to walk into
those spaces, having that confidence in yourself, it
doesn't matter what you look like, you
know, what the population is
made of. You have to have confidence and faith
in your ability in the thing that you're
trying to do, be that Data
science, tech, physics, whatnot.
That's the only thing that matters.
The system that's in place does not matter.
And we're working every day to
try to overcome those barriers anyway.
But in the meantime, you have to
be confident about yourself and your abilities.
Harpreet Sahota:
Thank you so much. This
is a very empowering message.
I'm sure the audience really took
away a lot from that.
Thank you so much. So talk to us about
the My Hero Award that you were recently awarded.
How do you hope to be
a hero for women in STEM?
Santona Tuli:
That is a dream. This award was
for a short video that we did.
Partly the goal was to have a promotional video
for Secrets of the Universe, the IMAX
movie. We decided to do it because they wanted
the audience to get more than just me in
a few shots during the movie, but also,
like, understand my story and my journey.
So this video is played before screenings
of the movie and will continue to be
played, once the movie is out more officially,
before the screening of the movie to set
context around who I am in this movie.
And so it's really the
thing that's meant to frame it.
So in this video and you can look at it,
you can find it on the Secrets of the Universe
website. There's a character profile on me
and the video's included there.
So I just talk about, yeah, I talk about where
I come from, my journey into physics, how I
went to CERN, what it means,
what we're doing there. It's
just a frank conversation about me.
And I think because I am a woman and because
I'm a person of color, it potentially has
even broader reach.
So the My Hero Film Festival is a
yearly festival, and they try to recognize individuals
and groups that are setting an example
in some way or another, something that other people
can sort of aspire to or be inspired by
and, you know, sort of, again, see themselves
reflected in. So they picked it in the
sciences category as one of the top
submissions, or however you want to put it.
So that was really cool.
I never expected, you know, as a physicist.
Yeah. I mean, getting to go
to CERN was super cool.
Getting to be in this IMAX movie
was as well.
And then on top of that, you
know, this video gets this award.
So yeah, it's definitely very cool.
I'm very, very appreciative of all of
this and very happy about it.
But at the same time, the goal throughout all
of this has been about reaching out to
people and showing them that
it is very doable.
You can do it. If I can do
it, you can do it, too.
And you should definitely be thinking that.
I'll share one quick thing as well.
I just remembered about this movie.
So at the second screening that I was at,
high school and middle school kids were included
in the select audience group.
And I happened to be sitting
next to this 11 year old or so.
I mean, I didn't ask him how old
he was, but, you know, looked about eleven.
And so at the end of the movie, he just looked
up at his, or whoever he was with, and he
was like, mind blown.
And to me, that was
such an empowering moment.
Like this kid is getting inspired by this.
Getting inspired by the
physics, by the people.
And so that is extremely rewarding.
And I just hope that more and more people and
more women are going to watch this and be
like, yep, I'm gonna go do that next.
Harpreet Sahota:
What can we do in the Data community?
What can men do, in particular in the
Data community, to help foster the inclusion of
women in STEM, in tech and Data?
Santona Tuli:
People are learning to be better allies.
There's, of course, a long way to go.
It's difficult to have conversations about supporting
minorities in tech because it often
gets intertwined with ideas of productivity and
intelligence and things like that, which
is of course, extremely sad and it's part
of the systemic problem that we have.
But, you know, like because of the way
that these fields have been dominated by a
homogenous set of people, there is this idea
that anyone else is somehow inferior and
doesn't belong in this field.
So these conversations are tricky.
But the only way that we can
move forward is by having them.
I mean, a lot of people don't know about
the biases that exist around them and how
they're propagating them.
They don't understand that, you know, standardized
tests like, you know, just as an
example, the SAT or GRE, are systematically discriminating
against, you know, women or people
of color. These are facts that I think the
more we write about these, the more we discuss
them, the more awareness we
can spread about what exists.
And, you know, it's,
again, a privilege thing.
Like, you don't realize what your
privilege is, but your life has been set
apart from others from the day you were born
because of certain factors.
Related to that, just, you know, being able to
spread awareness on that, I think, can go a
long way. To my male colleagues,
I think the one thing I would say is
never, ever go in with presumptions about your
female colleagues. I think most people will be
like, oh, yeah, I'd never do that and
stuff. But I think a lot
of these can exist subconsciously.
So just think to yourself, take a moment to
think, like, okay, I'm about to review
this person's, this female colleague's code. Am
I subconsciously, or am I, thinking something
already about what I'm going to see?
And if you are, then just stop and, you
know, maybe revisit or maybe excuse yourself
from doing it if you think that you
can have that harmful effect. Reflection is
always a very positive thing.
Harpreet Sahota:
Thank you for that. So last question
before we briefly jump into a lightning round.
What's the one thing you want people
to learn from your story?
Santona Tuli:
You get to shape your path.
You get to do what you want to do.
If you're curious, if you are excited and want
to try things and want to learn things,
things will fall into place and
you will figure it out.
And that is not to say that
I haven't had setbacks or failures.
It's just about how you look at it and
it's about getting back up and keeping going.
Harpreet Sahota:
I love it. Let's jump into
a quick lightning round here.
What's your data science superpower?
Santona Tuli:
Curiosity.
Harpreet Sahota:
What would you say is the most fundamental
truth of physics that all human beings should
understand?
Santona Tuli:
That physics is a model, that it's
approximations, that the whole idea is we
try to explain why things are
happening, but we don't know anything.
Harpreet Sahota:
So what do you think is the
most mysterious aspect of our universe?
Santona Tuli:
I think, as a physicist, I have to say
dark energy, but personally, I think that it's
really cool that we exist, that, you know,
this small planet has the right conditions for
human life to exist and we're able
to ponder on our own existence.
I think that's mind-boggling.
Harpreet Sahota:
So what's an academic topic outside of Data
science that you think every data scientist
should spend some time
researching or studying up on?
Santona Tuli:
Social behavior, maybe human interactions.
Harpreet Sahota:
So what's the number one book?
Fiction, nonfiction?
Or if you want to pick one of
each that you would recommend our audience read.
And what was your most
impactful takeaway from it?
Santona Tuli:
I think, even today, my favorite
book is 1984 by George Orwell.
I read it very early and I
was very sort of enamoured with it.
What I do like about Orwell is he
has some essays on communicating and writing, and
those can also be very, very helpful in
figuring out best ways to communicate things.
So that's more of an endorsement of
an author than a particular book.
Most recently, I read a book about, or set in,
World War II Germany, which to me
was a really cool way to observe something
that we learn about in history books.
But this was, like,
a narrative, right?
So what I'm trying to
say is historical fiction, based in historical
context, I really enjoy.
And maybe others can benefit
from that as well.
It gives a human touch to something that
actually happened, and people suffered, and
so on and so forth.
Harpreet Sahota:
So if we could somehow get a magical telephone
that allowed you to contact 18 year old
Santona, what would you tell her?
Santona Tuli:
Keep doing what you're doing.
You can do it.
So, you know, the thing is, when you look back
on what you did, like, six months ago, you're
like, oh, my gosh, I was so stupid.
So that happens and that is real.
So there are, of course, things that 18 year
old Santona did that I would very much be
like, oh, my gosh, what?
But no, the real thing is
that things turned out fine.
Right. Things turned out well.
So just keep doing what you're doing.
This is the same advice I would have for any
18 year old: sort of, like, have faith in
yourself. Be curious and learn
and do what you do.
Harpreet Sahota:
What motivates you?
Santona Tuli:
Learning.
It's a bit cliche, but I am
excited to learn more in different fields.
So it's something that, yeah,
ties into curiosity and learning.
It's not just about, OK, honing my Data
science skills or my physics skills or something.
It's about learning what other
people are thinking about.
And it's really exciting.
Harpreet Sahota:
So what song do you have on repeat?
Santona Tuli:
I really like Resilient
by Rising Appalachia.
It has a very strong message.
I listen to it almost every
day, especially in times like these.
A couple of the lines are: we are resilient,
we trust a movement,
we negate the chaos, uplift the negatives.
Harpreet Sahota:
I definitely have to check that out.
So how do people connect with you?
Where can they find you?
Santona Tuli:
I am available on LinkedIn.
That's probably the best way.
Just my name, Santona Tuli.
That's without an H.
Although you pronounce it Shantona,
it's spelt Santona.
Yeah. So LinkedIn is the
best way to reach me.
Please connect with me. I
love meeting new people.
Harpreet Sahota:
Dr. Tuli, thank you so much for taking time out
of your schedule to be on the show today.
Really, really appreciate having you here.
And I know there's so much here
that our audience will learn from.
Thank you.
Santona Tuli:
Truly a pleasure. It was so much fun.
Thank you.
