Hello everyone and welcome
to the NAGT webinar series.
Today's webinar is titled, Teaching Landslide Analysis
to Undergraduates: Planning for Failure and a Safer Society,
and is sponsored by GETSI.
Please take a minute to review the zoom controls
on the screen.
We ask that you leave your microphones muted
and video cameras off.
If you have questions and comments along the way,
we encourage you to enter those into the chat box.
To access that chat box, find the control bar
and click on the chat button.
Webinar presenters and staff will be monitoring the chat
for your questions and comments.
The NAGT webinar series is your one-stop shop
for strengthening your work in Earth education.
Webinars in the series feature novel and innovative work
in Earth education research and pedagogy,
new teaching materials, and the classroom
and professional experiences of people like you.
The NAGT webinar series is free and open to the public,
and we encourage you to invite your colleagues to attend
and join the discussion.
On screen is a link to the webinar series
where you can find the webinar schedule,
an archive of past events and information
on our sponsoring projects and programs.
You can also find slides, resources and recordings
of each webinar, including today's
through the webinar archives.
Now I'll turn it over to Beth who will introduce
today's topic and a little bit about GETSI.
Okay, thanks Mitchell.
So my name is Beth Pratt-Sitaula,
and I'm the project manager for the GETSI project
which is Geodesy Tools for Societal Issues.
If you want to follow along
on the module itself you can go to the
serc.carleton.edu/getsi or /234626.
And if you go to the main page there
it's in the majors level section which is slightly down.
I'll be giving a brief introduction to GETSI and geodesy
and then I'll be handing it over to Bobby Karimi,
who was one of the authors of the Planning
for Failure module.
So just to give you a little bit of an idea behind
the GETSI project and this larger sister project InTeGrate
and how we situate it and why we situate it
within societal issues.
If we think about challenges facing us now,
there are an awful lot of societal challenges
that involve STEM, and specifically geoscience within that,
be it climate change,
natural hazards, or water resources,
and at the same time we're also facing
the need to engage students better in STEM
for literacy and for future workforce.
And it actually turns out that there's a complementary path
to improvement on this.
In that if you situate the STEM learning
within a societal context,
not only do students learn better,
but they're actually more likely
to consider STEM as a field.
The GETSI project specifically has the mission
to develop and disseminate teaching and learning materials
that feature geodesy data in particular,
and quantitative skills applied to the critical societal
issues that geodesy can address: climate change,
water resources, and natural hazards.
It is a sister project, a little sister project,
you could say to InTeGrate.
It's been funded over four grants now,
developing 13 modules of something like two weeks each,
at introductory and majors level, for classroom and field,
and this particular module is a classroom,
majors-level module.
To make sure we're all on the same page
about what I mean by geodesy.
It is the science of accurately measuring the Earth's size,
shape, orientation, and mass distribution,
and especially how these vary with time.
Traditionally, you might have thought
of geodesy as just surveying,
precise positioning of points on the Earth,
whereas now, in the last few decades,
we really have this toolbox of techniques to better measure the Earth.
And if you decide to unpack this toolbox,
it includes things like high precision
global positioning, interferometric synthetic
aperture radar, InSAR
which is good for regional deformation.
High resolution topography is also considered
part of geodesy and that's really
what the data source is for this module
that you'll be hearing about today
which is mostly airborne LiDAR,
light detection and ranging,
and also Structure from Motion
can provide these high resolution models.
Strain meters, tiltmeters, creep meters,
gravity measurements and sea level and ice altimetry
are also all part of this toolbox.
Behind each of these modules are five guiding principles.
So they need to address one or more
geodesy-related grand challenge,
make use of authentic and credible scientific geodetic data,
improve students' understanding of the nature
and methods of science,
that is, how science is conducted and communicated,
and there needs to be a deeper level of involvement
within your disciplinary problem.
I think of this in terms of applying geoscience learning
to societal issues, be it policy or economics,
but something deeper than just "landslides
can knock down houses" before moving on to the data;
it needs to be more authentic than that.
And because of the geophysics emphasis in GETSI,
we decided to emphasize quantitative skills
as our fifth guiding principle.
All of the materials were developed
following a backwards design in that
we started with the learning goals
and moved on to more granular learning outcomes.
Then you need to think about,
well, how would I know if these goals and outcomes
are accomplished? So you determine your assessment strategy,
and then design the teaching materials
and the instructional plan to match.
All the materials were piloted by both authors
and a non-author, revised,
and then published.
So this is the InTeGrate model
of development and assessment, which GETSI has followed.
Some suggestions we've found to be helpful:
maybe keep some notes,
digital or on paper, about the aspects of the module
and the webinar that interest you most,
and the steps you would need
to integrate it into your own teaching.
So at this point I'm going to hand it over to Bobby Karimi,
one of the two authors of the Planning for Failure:
Landslide Analysis for Safer Society module.
All right, good evening everybody.
Beth, do I want to share my screen or--
Yes, let me stop sharing now, thank you for that.
And I'll just, while it's thinking about stopping...
I think you should be able to take it now.
Yes.
Then do feel free to put things
in the chat box everybody, as we go along.
And Mitchell and I will moderate that
and we'll jump in with a question
that seems really relevant,
otherwise we'll keep them to the end
where we'll have a section for question and answer.
Go for it, there, Bobby.
All right.
Thank you.
So, our module that Steven Hughes,
at the University of Puerto Rico-Mayaguez,
and I developed is looking at landslide analysis
for a safer society.
And our overall goals for this module
were to have students process,
analyze and interpret geodetic data
to identify and classify mass wasting sites
and connect their development
to environmental factors, as one major goal.
A second goal that we have is to bring this aspect
of quantitative analysis and modeling
to landslide susceptibility and evaluate the relationship
between mass wasting event sites,
and local geospatial factors.
And our last goal is to synthesize susceptibility models,
looking at environmental, social
and political considerations as a guide
to develop a comprehensive landslide risk assessment.
And part of what I find particularly interesting
about this module having delivered it now twice,
is that it identifies and works on skills
that industry, and students and professors
have identified that the overall graduating population
of geology and geography students
need a little bit more development.
And I've highlighted the ones that I think
we really target here.
So preparation of geological investigations,
the last of the units in this module
culminates in a risk assessment and hazard mitigation plan,
which I see as like an analog
for a geological investigation and the preparation of it.
There is a lot of adaptability,
especially when we're dealing with technical aspects
and computers, so students need to be able
to troubleshoot and adapt given different conditions,
not every student has the same operating system.
Not every student has the same level of skills
when it comes to computers
and combined with time management,
adaptability and time management
really help that student overcome a lot of challenges.
And this module really questions a lot of ethical practices,
particularly in the last unit regarding
how students approach hazard mitigation and risk assessment,
what is an acceptable risk what is not acceptable.
And then GIS and remote sensing skills
are obviously being developed in this module
as well as examining geologic processes
in a lot more detail than they may have
in a standard classroom setting.
So, in our module, we have four distinct units.
Unit one has students looking at mass wasting events
and identifying and quantifying aspects of them.
So they're tasked to use CloudCompare, which is freeware,
and they can use it to compare LiDAR data
from the USGS with LiDAR data
that was collected in Puerto Rico following Hurricane Maria,
looking, in a very specific region,
at the total volumetric loss and volumetric gain
due to mass wasting events.
And so it requires them to step through
and align those data sets using common points
within the LiDAR data sets.
And then it has them converting data,
migrating over to ArcMap or QGIS,
and then quantifying those volumetric aspects.
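The core of that volumetric comparison can be sketched outside of CloudCompare as well. Here is a minimal NumPy illustration, not the module's actual tool: `volume_change` is a hypothetical helper, the tiny grids are made up, and the two DEMs are assumed to already be co-registered on the same grid.

```python
import numpy as np

def volume_change(dem_before, dem_after, cell_area):
    """Volumetric loss and gain between two co-registered DEMs.

    dem_before, dem_after: 2-D elevation arrays on the same grid (meters)
    cell_area: ground area of one pixel (square meters)
    """
    diff = np.asarray(dem_after, float) - np.asarray(dem_before, float)
    loss = -diff[diff < 0].sum() * cell_area   # material removed (e.g., scarps)
    gain = diff[diff > 0].sum() * cell_area    # material deposited (e.g., toes)
    return loss, gain

# Tiny illustrative grids (not real LiDAR): one cell drops 2 m, one rises 1 m
before = np.array([[10.0, 10.0], [10.0, 10.0]])
after = np.array([[8.0, 10.0], [11.0, 10.0]])
loss, gain = volume_change(before, after, cell_area=1.0)
print(loss, gain)  # 2.0 1.0
```

With real data, the hard part the students face is exactly the alignment step that precedes this arithmetic.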
Units two and three combined
are really the landslide susceptibility modeling phase.
And unit two more explicitly is the first phase
where we examine the distribution of mass wasting events.
And unit three is the development and testing
of models of landslide susceptibility,
both quantitative and qualitative testing.
And then unit four is the analysis and prescription
of a risk assessment and hazard mitigation plan
following FEMA guidelines and USGS data sets
as well as datasets sourced from local state governments.
But for this presentation today,
I'm really only going to focus on units two and three.
That's not to say units one and four are not interesting
or important; it's just that units two and three,
I think, give the full combination
of the different social aspects that we want to address.
And in approaching unit two,
when we talk about distribution of events,
I thought it would be wise, and it has been fairly successful,
to start really further back,
having students just analyze patterns
and think about what a pattern is
and how we identify these patterns
as representations of mass wasting events
in elevation data or other satellite
or aerial remote sensing source datasets.
So, for my first question, if you don't mind
in the chat box below or wherever it may be on your screen.
Just looking at the pattern that's given here
of these boxes.
see if you can guess what the color
would be for the sixth box.
I for some reason cannot see the chat box.
It may be floating around as a pop up window,
since you're currently screen sharing.
Oh, I think, because my PowerPoint is expanded
to the full screen I'm not able to see what's behind it.
If you go up to the more on your controls
you should be able to hit the chat box,
It'll pop up as a separate window.
But I can tell you, everybody said, red.
Okay, good.
I think we'll have to go with that.
I'm still unable to see it on my end.
Okay.
So yes, the sixth box would be red.
So now let's complicate this a little bit further.
What would be the color
or what is the expected pattern here?
So, just take a look at the pattern
and rather than kind of answering what the next shape
or the next color would be,
just take a moment to think about what difficulties
you're having in identifying the pattern
and if you don't mind putting that in the chat box.
And then after a few seconds I'll let you know
what the sixth shape slash color would be.
So no color repetition.
They're saying maybe a dark red square.
Somebody, square that's light red.
Non-repeating pattern.
Okay, this is good.
This is usually more than what students are able to guess.
So, the pattern here is that it's alternating
from cool colors to warm colors
with disregard to the exact shape.
And so, what makes this a little bit more difficult
to identify is because we've now included an extra variable,
and we've given a range of variables that are acceptable
as the answer.
Whereas before it was just red or blue.
Now all ranges of warm colors are acceptable
and all ranges of cool colors are acceptable.
And then so in this pattern we've got the alphabet
but we've got letters missing.
So what letters are missing and why?
And if you'd like to type that into the chat box.
So, letters missing C, O, P, S,
so people are getting the missing letters.
Or the P is there, I guess.
But nobody has forwarded a why.
Well, this one's a little bit trickier
and it's just that in this font,
those are the only three letters
that are constructed entirely of curved components,
and there is no straight component in those letters.
And so this becomes a little bit harder for us to understand
as a pattern and perhaps for students
to pick up as a pattern, because it really depends
on the perspective you look at it.
When you look at it, just from the perspective of letters,
you get one set of results, you understand what's missing.
But you have to look at it from the perspective
of the font and how letters are constructed by shapes,
and that increases the difficulty of detecting said pattern.
So, for what patterns are, I have students meet in small groups
and come up with a definition, and usually
after a couple of minutes they're able to arrive
at something like "some sort of discernible regularity."
And the pattern in a natural system can be highly variable.
And at the very least, we hope that students understand
that there are at least three dimensions required
to describe natural patterns.
And in a geospatial system those three
are your latitude your longitude and your elevation,
or your spatial coordinates.
And patterns can be scalable.
The patterns that we look for on a large tectonic scale
are very different in size compared to the patterns
that we look for perhaps in nutrient cycling
within a small region.
So we have to think about these patterns
within multiple scales and scopes.
And so, why do we care about patterns?
It's helpful for us to predict,
and that's really what I want the students to understand.
And I even argue with them that the crux of all science
is pattern recognition, detection, and prediction.
By being able to predict, we can save lives.
We can prevent damages and we can overall improve
the quality of lives.
So, who sees patterns?
And this is a tricky one.
Pattern recognition is not something we cognitively
fully understand yet.
So who sees patterns?
It's any animal or human being
that can assign a survival value to that pattern recognition,
detection, and prediction task.
So, at this point it's fun to bring up this image.
These are topographic lineaments in excess of one kilometer
in length that my students identified
for the state of Pennsylvania.
And there are over 120,000 of them, I believe, on this image.
And for the first poll that we have
that will come up on your screen in a second here.
My question is, is there a discernible pattern
in this image?
Okay, I think we'll end the poll there.
So, it looks overwhelming.
Yeah, overwhelmingly, everybody said yes,
there is a discernible pattern, and I believe at this point
most people are likely noting the Valley and Ridge province
that is within the southeastern quadrant of the state.
Now, if we go to look at just a specific region,
the highlighted region, can you discern a pattern here?
And we'll put up another poll for this one.
Okay, I think we'll cap it there.
So a little bit more variability here
but overall it's a lot harder to see
any discernible patterns and for those of you
who did see a pattern I would love to pick your brains.
I myself have difficulty seeing a pattern in that.
And the point of this for the students
is to realize that there is a limit
to the pattern recognition that humans can do,
that we are overburdened by the variability of data,
and by the overall quantity of data as well.
And so it's a benefit for students to recognize this
because we can kind of lead them to the next logical step.
Well if we can't do it as humans, what can or who can?
And inevitably the answer comes down to computers.
So we have to think about computational modeling
as just an extension of what humans are capable of doing.
So how we see patterns, is a very interesting discussion
to have with students.
The mental or cognitive process is incredibly complex
and very poorly understood to this date.
One of the broad definitions of pattern recognition,
cognitively, is that external signals arriving
at the sense organs are converted
into meaningful, perceptual experiences.
My students and I spend a lot of time talking about
what a meaningful perceptual experience is.
But it's important for students to understand
that pattern recognition as a cognitive process
is not fully well understood,
meaning that the extension of pattern recognition
as done by humans into the computational realm
therefore still has a lot of limitations.
So for the concept of an ideal, or a pattern,
we discuss whether it's deductive,
innate to the observer,
or inductive, one that we learn through observation
of imperfect examples, typically in a learning system
with a teacher.
And inevitably we come to the conclusion
that it's most likely inductive
though we don't disregard the fact
that there could be biological processes
that force things to be deductive in certain instances,
and those can be argued back and forth for a while.
We stop being able to see patterns,
as we've discovered through the prior exercises,
when we are overburdened with variables,
or when the accuracy becomes questionable, or the scale,
or the efficiency, or if we introduce bias.
And computers can overcome a lot of these limitations,
but the human bias is still an important player here
because we can only mimic what we are aware of.
So we're really building this connection
between cognitive processes and computational models
and algorithm development, and hopefully this gets students
to reinforce higher order, computer literacy skills
by thinking and making these types of connections.
So, how does a computer see patterns?
And this is a very difficult question to answer,
but for the most part it needs to learn
much the same way we do if it's going to mimic us, ideally,
and that's by looking at artificial intelligence
and machine learning.
But the pattern analysis of natural phenomena requires
appropriate mathematical recognition and detection methods.
We have to have the language that computers speak,
and we have to have the words in that language
or the equations, to be able to perform the task.
The proper implementation of these methods in hardware
and software platforms is integral to doing pattern analysis
for natural phenomena.
And so that means we need to have workflows
to facilitate pattern analysis.
And we as researchers have to be well-versed
with the data types and characteristics
and have the required knowledge
on the general workings of these required methods.
And this is a time when my students and I
typically find ourselves recognizing, hopefully,
that a lot of students really think point number three
is the purpose of the classroom experience,
and they sometimes overlook that point number two
and point number one are equally important in understanding
any learned material, and any sort of pattern recognition,
analysis, and prediction.
So can we predict an exact location
in which something may occur?
And this brings up a really great conversation with students
about why it's possible, why it's not possible,
and really getting them to think about susceptibility.
So when we talk about landslide
or mass movement susceptibility or earthquake
or flooding susceptibility what are we talking about?
Is it the likelihood of occurrence,
or is it an exact time at which something will occur?
And we understand susceptibility to be a likelihood
of occurrence relative to other,
similar regions.
And geoscientists often use susceptibility
as a form of prediction.
And so this is a great lead-in to talking about
what mathematical tools we have
to do landslide susceptibility, or pattern prediction.
And what we use in unit two and three
is the frequency ratio method, which is a bivariate method.
It's very, very popular because it's very friendly
to end users and very simple to run,
and the contributions of individual factors
to slope failure can be investigated.
So we can compare the presence of landslides,
to the spatial distribution of slope angles,
or to the spatial distribution of precipitation.
So here we talk about frequency ratio values
between landslides and another factor
as the landslide susceptibility index.
And oftentimes researchers use the whole area
of the landslide but sometimes this isn't suitable.
And this is a great moment to talk to students
about scale, right?
If your pixel size for some of your data sets
is, let's say, one kilometer,
does it matter if you've got a small 30-square-meter
landslide in that pixel, or should you just use a data point?
And so, we talk about scalability
and what is helpful to use.
Now in this unit two we use data points representing
the topmost elevation in the headscarp region
of each landslide polygon
rather than the full landslide polygon.
Only because it's a little easier for students
to wrap their heads around,
and the overall scales we're looking at,
the size of Puerto Rico and about a quarter
of the state of Arizona, are very large,
meaning that a lot of the areas become drowned out
relative to the pixel size of our data.
So, when I discuss the math required to do this,
to do the frequency ratio method, with students,
I typically step back and start from
a more qualitative approach, leading into the equation
rather than giving the equation first.
I find that students when given the equation out the gate
tend to get a little overwhelmed and panic,
and that limits their ability to understand the math.
So I start by telling them it's just the natural log
of the frequency of landslides in a factor area
or a subclass within a factor,
divided by the frequency of landslides in the total area.
And then we take that forward, and just break that down.
What is the frequency of landslides in an area?
Well, it's the number of landslides in that factor class,
F-sub-i, divided by the area of that class,
over the total number of landslides in the region
divided by the total area of that region,
and we mathematically denote this using these symbols.
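Spelled out in code, that verbal breakdown becomes a one-line calculation. A minimal plain-Python sketch; the function name and the example numbers are my own, not the module's.

```python
import math

def lsi(n_class, a_class, n_total, a_total):
    """Landslide susceptibility index for one factor class.

    n_class: landslides falling in the class, a_class: area of the class,
    n_total: landslides in the whole study area, a_total: total area.
    """
    freq_class = n_class / a_class   # landslide frequency within the class
    freq_total = n_total / a_total   # landslide frequency over the whole area
    return math.log(freq_class / freq_total)

# A class holding 40 of 100 landslides on only 10% of the area
# strongly favors landslides, so the LSI comes out positive:
print(round(lsi(40, 10.0, 100, 100.0), 3))  # 1.386
```

Note that a class with exactly the area-wide landslide density gives ln(1) = 0, which is why positive values favor occurrence and negative values disfavor it.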
Students have mentioned to me
that this is a little bit easier for them to digest
than going in reverse.
And I'm only going off of my own personal anecdotes
on how successful this is.
But once these calculations are done for a factor class
and for the overall area,
we want to interpret what those LSI values that result mean.
Positive LSI values indicate that there is...
in fact, I've been hearing a lot of beeping.
I still can't bring up the chat window.
Was there a question?
I think that was my doorbell.
Sorry about that, I'll remute.
Okay.
So, a positive LSI value indicates
that the factor favors the occurrence of landslides
so that the more positive the value,
the greater the correlation
between the spatial distribution of landslides
and the factor itself, or the factor class.
Negative LSI values indicate the opposite,
that the factor does not favor the occurrence of landslides,
and the correlation is weaker.
So whenever we have a factor,
we have to think about how do we divide
that factor into classes.
So let's take an example of elevation.
Elevation is a quantitative data set,
meaning we can use ArcMap's built-in classification tools
to do this mathematically,
by looking at the histogram of the data,
the categories versus the frequencies.
And so, for the elevation data set,
we can break it up into three, four, five, eight, nine
different classes, it's up to the user.
And it can be classified using natural breaks,
manual entries, equal intervals,
or any of the other options that are available
in ArcMap or QGIS.
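As a rough stand-in for those built-in tools, an equal-interval classifier can be sketched in a few lines of NumPy. This is a simplified illustration, not ArcMap's or QGIS's actual implementation, and natural breaks would instead require an optimization such as Jenks.

```python
import numpy as np

def equal_interval_classes(values, n_classes):
    """Assign each value to one of n_classes equal-width bins, numbered 1..n.

    Bins are left-closed: a value equal to an interior edge falls into the
    higher class, and the maximum value falls into the last class.
    """
    values = np.asarray(values, float)
    edges = np.linspace(values.min(), values.max(), n_classes + 1)
    # Compare against interior edges only; +1 converts to 1-based classes
    return np.digitize(values, edges[1:-1]) + 1

# Elevations 0-500 m split into five 100 m classes
elev = np.array([0, 100, 250, 400, 499, 500])
print(equal_interval_classes(elev, 5))  # [1 2 3 5 5 5]
```

The choice of scheme and of the number of classes is exactly the judgment call the students defend in their presentations.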
But there are data sets
that are a little bit more qualitative,
and an example of this could be something like land cover
or aspect, the orientation of a hillslope, right?
And so, defining north is a little bit harder
to do mathematically because North could be
everything from 315 degrees to 360 degrees,
and then again from zero to 45 degrees.
So, I've had students use four categories,
North, East, South, West.
Some of them have done Northeast, Southeast, Southwest
and Northwest and some have gone and done eight categories
or six and broken it up a little bit differently.
And so here they have to go in and manually control
how these classifications are created.
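The wraparound at north is the only genuinely tricky part to handle by hand. One common trick, sketched here as a hypothetical helper rather than anything from the module, is to rotate all aspects by half a sector so north becomes one contiguous bin.

```python
def aspect_class(aspect_deg):
    """Bin a compass aspect (0-360 degrees) into N/E/S/W.

    North spans 315-360 and 0-45, so rotating every aspect by 45 degrees
    first turns it into a single contiguous 90-degree bin starting at 0.
    """
    labels = ["N", "E", "S", "W"]
    shifted = (aspect_deg + 45) % 360   # N -> [0,90), E -> [90,180), ...
    return labels[int(shifted // 90)]

print([aspect_class(a) for a in (350, 10, 90, 200, 270)])  # ['N', 'N', 'E', 'S', 'W']
```

An eight-category version works the same way with a 22.5-degree rotation and 45-degree sectors.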
And once we have these classifications done
we can start to process and calculate the LSI.
Now, this is where I stop, because I know
a lot of times in my classes students don't fully understand,
or don't have the strong computer literacy skills
to understand, that what they're trying to do
with the computer, which sounds so simple perhaps
to them at this point,
is a lot more difficult for the computer to handle.
So I show them this, I showed them that equation.
And we talked about the means that we have
to be able to input values into this equation
to get an LSI value.
And then for the computer code that was created
in ArcPy for ArcMap to develop a tool
to do this automatically for students,
I usually bring up sections of it like this
and tell them that only the red portions
are the actual parts of the code
that address the calculation of LSI.
All of this other stuff is happening
prior or in the background.
It's just added workload,
and we have to think about these steps
and compare them to cognitive processes.
Import system modules and ArcPy:
think about the problem.
Input parameters:
what parameters do you need? Bring them in.
Input a workspace:
where are you physically going to be
saving your data and working?
What data sets do you need? Bring them in.
Right, and these are things they don't have to explicitly do
because the code does it for them.
But a really good exercise in getting students to understand
how difficult workflows and coding can be
is to have them walk you through making a peanut butter
and jelly sandwich, giving you a workflow for how to do it.
And it's a lot of fun because it can get a little bit messy,
'cause somebody will say,
"Take the peanut butter and put it on the bread."
And what I'll do is I'll take the peanut butter jar
and I'll put it on a slice of bread.
And then they'll be like,
"Well no, you have to open it first," right?
And then by going through an exercise like that
it reinforces this idea that the computer is doing
far more than what it may be visibly showing you.
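The same point about hidden workload can be made with a skeleton script. This is a hypothetical stand-in for the module's ArcPy tool, written in generic Python rather than the actual code: everything here is setup and bookkeeping except the single line that computes the LSI.

```python
import math
import os
import tempfile

def run_lsi_tool(workspace, class_counts, class_areas, n_total, a_total):
    """Hypothetical skeleton mimicking the shape of the module's LSI tool."""
    os.makedirs(workspace, exist_ok=True)       # "input a workspace"
    results = {}
    for name, n_i in class_counts.items():      # "bring in the data sets"
        a_i = class_areas[name]                 # "input parameters"
        # ...the real tool would do projection checks, table joins,
        # and raster bookkeeping here, all invisible to the student...
        results[name] = math.log((n_i / a_i) / (n_total / a_total))  # the LSI line
    return results

ws = tempfile.mkdtemp()  # throwaway workspace for the example
out = run_lsi_tool(ws, {"steep": 40}, {"steep": 10.0}, 100, 100.0)
print(round(out["steep"], 3))  # 1.386
```

Even in this toy version, the one "red" line is outnumbered by the scaffolding around it, which is the peanut-butter-sandwich lesson in code form.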
All right, so at this point,
it's a good place for students to start
better understanding the datasets we're working with.
We're looking at Puerto Rico and Arizona.
And we're using the six factors that are listed
on the right hand side of this screen.
So just as a poll, a prediction,
which of these factors do you think
will have the greatest LSI values for Puerto Rico?
All righty.
So it seems like we've got almost a tie
between slope and mean annual precipitation.
Ironically, as it turns out,
lithology is very meaningful for Puerto Rico as well,
but mean annual precip and slope
are equally important.
Which LSI value will be the most positive
will depend on how students treat the data,
whether they split it up into five, six,
or ten classifications, but I always have students
write down their prediction at the top of a page,
and they come back to this at the very end,
when we go over discussions at the end of the lab,
to see how well their prediction panned out.
And I also have students do this for Arizona,
because that's the second data set that we have.
And we also talk about why they think
a given factor is going to be favored.
Now the two data sets are for Arizona and Puerto Rico.
And we have all of the prepared factors,
but they have not been classified.
And then we have the landslides, for each region
broken up into a 75 and a 25 category.
That's 75% of the total landslides randomly selected
and then the remaining 25%.
The 75% is the training data set to create
and generate the models.
And then the 25% is the test data set
they'll use in unit three to assess quantitatively,
the efficacy of each model.
Now in Arizona, the study area only has 620 landslides.
For Puerto Rico,
we're only using 2,053 of them; in actuality,
there are 41,053 landslides, but that would overburden
a lot of people's computational resources,
so we reduced the data set for the purposes of this module.
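A 75/25 random split like the one behind these landslide inventories can be reproduced in a few lines. This is a generic sketch, not the module's actual procedure; the fixed seed is my own addition so the split is repeatable.

```python
import random

def train_test_split(landslide_ids, train_frac=0.75, seed=42):
    """Randomly split landslide point IDs into training and test sets."""
    ids = list(landslide_ids)
    rng = random.Random(seed)   # fixed seed -> the same split every run
    rng.shuffle(ids)
    cut = int(len(ids) * train_frac)
    return ids[:cut], ids[cut:]

# Splitting the 2,053 Puerto Rico landslides used in the module
train, test = train_test_split(range(2053))
print(len(train), len(test))  # 1539 514
```

The training set builds the susceptibility models in units two and three, and the held-out test set is what makes the efficacy check in unit three quantitative.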
Right.
So, in unit two students will ultimately
give a presentation and discuss the different factor classes
that they, as individuals or in small groups,
were responsible for.
So students do an oral presentation that outlines
their logical thought process to evaluate
and select factors and their classes.
And so I'm going to give you an example
of my students' work on just slope and mean annual
precipitation, from the first time I ran the GETSI module.
So they noted that the critical angle and gravity
are the main mechanisms of slope failure.
A lot of this was learned in unit one, regarding
what a landslide is, what the physical behavior
of landslides is,
and the physical laws that govern why they move.
And they said that the steeper the slope,
the more susceptible an area is likely to be to landslides.
And so it will therefore result in a much higher LSI value.
So they ran two separate trials,
using two different classification schemes.
The first one they ran was with natural breaks,
and they had equal amounts of data per class
with three classifications,
which they picked for simplicity's sake.
And the max LSI that they got was 0.777, or about 0.78.
And overall their hypothesis held true here,
that as we increase slope,
we increase susceptibility.
For trial two they did the same thing
but with seven classifications rather than three,
and then they started noticing a weird pattern:
the higher susceptibility ranges
were in moderate slope categories,
from 14 to 27 degrees, rather than in the highest categories.
So, as a poll, between slope trial one and slope trial two,
which one would you more likely lean toward
using as part of a model?
Okay, I think we'll end it there.
All right, so most people said trial two,
and my students agreed.
The reason they picked trial two was that
it had both the biggest range of LSI values
and the greatest maximum,
and they felt it was a better representation
of what was naturally occurring.
Whereas with fewer classifications,
they felt it was lumping together too much data,
where there may have been overlap
between low and high LSI values.
For precip, they predicted positive LSI values
for areas with an increased amount of mean annual precip
and negative values for areas with less.
As it turns out, it was a little bit different:
at the highest precip we got some pretty low values of LSI,
and moderate ranges of precip
were a little bit more effective.
This was given a natural breaks classification
of nine classes, and then they did the same thing
using a manual method with four classes
and they saw the same sort of trend
that the moderate ranges of rainfall
were associated more with the presence of landslides.
Sorry, and I see here somebody asking,
"Isn't rate of precip almost as important
"as mean annual precipitation?"
Yes, absolutely.
And we do discuss that at the end of the unit discussion
with students, when we go over
each different factor, why it was used,
and why it may not fully represent what they're looking for.
So between these two trials,
let's see what you all think is the best one
to move forward with.
All righty.
So most people went with trial one
and my students also agreed with that
for much of the same reasons as before,
largest range, greatest maximum, smallest minimum,
showed the most variability that could be applicable.
Now, at the end of the unit discussion,
I talk to students, and these are just some sample prompts
for this unit: to think about their home area,
where they're from, and what factors they think
will have the highest LSI values.
I have them think about additional factors
beyond the six we tested, and a lot of students
throw out soils, or they look at vegetation types,
or they want to look at peak ground acceleration
or shakeability.
And so this unit introduces the initial part
of the predictive modeling that we continue in unit three.
And I asked them, how simple do you think
predictive modeling is now,
and these are related to some pre-questions
that I asked them leading up to unit two.
So, what other things could we apply
the frequency ratio method to, other than landslides?
Let's see what people suggest.
I'm seeing some people type in floods, earthquakes,
wildfires, tornadoes.
Yeah, and it's applicable to a lot of different
data sets.
As a term project, one of my students looked at,
for instance, was it, not hotspots but hot springs,
the location of hot springs, and tried to see
if he could predict if there were additional locations
worth establishing a hot spring at.
A hot spring spa.
All right, so in unit three we take it a bit further
'cause now we're starting to think about modeling
more explicitly rather than just comparing factors
to the presence of landslides.
So I have students go through and better understand
what I mean when I say the word socio-political landscape
and we break it down into the three categories.
And it's important here because not a lot of them
fully understand what a socio-political
landscape is referring to.
And then we have them think about
the socio-political landscape of their own country,
their region, and why the socio-political landscape
is the way it is.
For my region coming from northeastern Pennsylvania
we've got a lot of anti-science going on
in the background here,
we've got a lot of questioning of government.
And this isn't to say it's right or wrong,
but it kind of helps define how policy can be developed
and how society can benefit from these policies
in the given context of a socio-political landscape.
We also have them describe
the national socio-political landscape
as it pertains to science, kind of getting at some of these
little, not little but very big, topics
in mainstream media especially that are very anti-science.
And we ask them, what is the intended use of a model?
Is it to predict? Sure, but who does it go to,
does it go to politicians,
or is the intended use for the everyday person?
And we talk about model limitations
and why those limitations might exist.
So I further ask them
what role models play for them personally,
and weather is the big one that comes up all the time
and students say, "Well I don't trust so and so,"
naming some sort of weather person on,
I guess a news source.
And then some say, "Well I do because of this,"
and then so we get to have some
really great conversations here.
We talk about what roles models play in society
and why people tend to trust models or not,
and why they might disregard them,
and what role models play in overall politics.
And we think about what aspects of the modeling process
make it vulnerable,
when the results challenge society,
policy or economic status quo.
And we question, if there's anything
we as scientists could do to help people better understand
the value of models.
And I find this a really helpful discussion
to have with students, because it really establishes
for them the value of generating a much better model
in unit three.
And we use a receiver operating characteristic curve
to evaluate their models,
which are just combinations of the different LSI maps
that they generated in unit two,
and they decide what factors go into a model
and what factors don't.
I have them generate at least
two different models to compare.
In some cases, if the groups are large
I had them do five different models.
And when they plot the true positive rates
versus false positive rates and look at the area
under the curve,
the area under the curve is the value
that's really important here.
And values less than point five would mean
the model does worse than random,
while values very close to point five
mean the model is not a good predictor.
But those closer to one means the model
is an excellent predictor.
So as a poll here looking at the image on the right,
which is the final susceptibility model,
which we'll talk about qualitatively in a bit here
but quantitatively does anyone have a good prediction
of what they think its overall accuracy will result in
or its area under the curve?
Can we get this poll up, or is it...
oh, there we go.
Yeah, so I'm just asking just for the susceptibility model
that was generated on the right
with red regions being very high susceptibility
and dark green being low susceptibility,
yellow in the middle.
What the predicted area under the curve would be
for us looking at it.
All right.
So it looks like most people landed somewhere
in the 0.6 to 0.7 range,
and that's actually really, really close.
The area under the curve was 0.706.
So, quantitatively just looking at this,
with it kind of being in the mid-range.
Let's take a poll here to see if you would or would not
use this model as a predictor.
All right.
This is coming out a lot like election results,
really, really close, but yes,
this is almost a 50-50 here.
And the students really have an interesting time
answering this as well thinking about,
is it a good predictor, or is it not
and this lays the foundation for saying,
"Well, let's assess it another way, beyond just the quantitative."
And one way we could do that
is through a qualitative analysis.
Hey, Bobby, there's,
I think there's a couple questions there sort of indicating
that people might not be 100% sure
what this area under the curve is.
Do you think you might be able to back up
and just help them with that.
Yeah, so the area under the curve here.
For each cutoff in this final image
that defines each of these colors,
between very high and high susceptibility,
or moderate and low,
we look at how many landslides fall above the cutoff
versus how many total landslides there are,
and that's the true positive rate,
using the 25% of the landslide inventory
that we had held back from the data set.
And then we look at the number of pixels
that are identified as a false positive,
compared to the true positives, as the false positive rate.
And when we plot those we get a nice little curve
like this one, hopefully.
And if we fit a polynomial equation to it,
and integrate from zero to one, to calculate its area
under the curve, we get a prediction
of how accurate our model is.
The closer our value is to one,
the more accurate it is,
whereas the closer it is to 0.5, the less accurate it is.
Hopefully that helped.
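That procedure, one (true positive rate, false positive rate) point per susceptibility cutoff, then the area under the resulting curve, can be sketched roughly like this. The function name and inputs are hypothetical, and it integrates with the trapezoid rule over the ROC points rather than fitting a polynomial as described above.

```python
import numpy as np

def roc_auc(susceptibility, slides, thresholds):
    """ROC curve and area under it for a susceptibility raster,
    validated against a held-back landslide inventory.

    susceptibility : 2-D array of model susceptibility values
    slides         : 2-D boolean array of held-back landslides
    thresholds     : susceptibility cutoffs between classes
    """
    tpr, fpr = [0.0], [0.0]
    for t in sorted(thresholds, reverse=True):
        predicted = susceptibility >= t                # pixels flagged at this cutoff
        tp = np.logical_and(predicted, slides).sum()   # landslides correctly flagged
        fp = np.logical_and(predicted, ~slides).sum()  # non-landslide pixels flagged
        tpr.append(tp / slides.sum())
        fpr.append(fp / (~slides).sum())
    tpr.append(1.0)
    fpr.append(1.0)
    # trapezoid rule over the ROC points stands in for the
    # polynomial fit-and-integrate described in the webinar
    return sum((f2 - f1) * (t1 + t2) / 2
               for f1, f2, t1, t2 in zip(fpr, fpr[1:], tpr, tpr[1:]))
```

A perfect model traces the curve through (0, 1) and gives an area of 1.0, while a model no better than chance hugs the diagonal and gives about 0.5.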
And then we get to a qualitative analysis here.
And I'm going to skip the poll on this one,
but qualitatively, the students do tend to pick up
on regions where, let's say right here,
I'm not sure if everyone can see my mouse,
but there's red regions juxtaposed with moderate regions,
and there is no gradient going from very high susceptibility
to high susceptibility and then to moderate.
And then you have a section
down here that has very high susceptibility,
with very little gradient between,
right next to very low susceptibility,
and the patterning doesn't necessarily look
particularly natural compared to what we understand
about the earth's surface.
There are a lot of gridded, boxy cutoffs,
and those don't seem right.
So, these are all part of this qualitative analysis
to say that while the area under the curve may be sort of
in that moderate range,
the qualitative analysis of this is telling us
it's not necessarily an accurate model.
So, these are two models trial one and trial two
that were created by my students for Puerto Rico.
And these were the areas under the curve
they calculated for each model,
and these are the individual factors
that they considered for that model.
And it's just an average of each factor's LSIs.
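Averaging the chosen factors' LSI rasters into a final susceptibility model might look something like this. A minimal sketch, assuming the per-factor LSI maps are co-registered arrays on the same grid; `combine_factors` is a hypothetical name, and the module's workflow does this in ArcMap rather than in code.

```python
import numpy as np

def combine_factors(lsi_maps):
    """Final susceptibility model as the mean of the selected
    per-factor LSI rasters (all on the same grid)."""
    return np.mean(np.stack(lsi_maps), axis=0)
```

Swapping factors in and out of `lsi_maps` is what produces the different trials the students compare with the ROC curve.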
So in the chat box would you pick trial one or trial two
as the better model?
So we're getting a decent mix of trial two and one.
They're both very similar.
But if we go on and look at the difference
between trial one and trial five,
right, they look a lot more different,
and their areas under the curve
are still kind of within that same range
of moderately okay.
But we get an opportunity to talk about scale,
and the purpose that somebody needs this model for.
If the purpose is to decide where to live
and where to construct a home,
the high resolution model, which is trial one here,
would probably be more ideal.
But if we're really interested
in just kind of generally understanding where landslides
are more susceptible, trial number five
may be adequate for our purposes.
And the students in unit three are tasked
with doing a poster presentation
of their overall methods and discussion
and results regarding the final susceptibility modeling.
So at the end of this unit,
we discuss what roles their models could play
for individuals in both society and politics.
Why someone could disregard your group's
final selected model thinking about how to communicate
a proper defense.
And asking students how their opinion
of modeling, its results and its uses and guiding policy
has changed and if they find themselves
more or less critical of modeling
than they did perhaps before.
And we end by asking students
how simple they think predictive modeling is now
compared to the response from unit two.
So other challenges I've faced are that students
have trouble with computers;
not all computers are the same,
not all networks are the same.
So depending on where they are, what lab they're in,
and whether it's a home computer or not,
there are different challenges.
And I've always given students this rule of,
you must try three different troubleshooting techniques
before coming to me, or the TA.
And the purpose of this is not to make them struggle
but it's to help them develop troubleshooting skills
in that process.
And so when a student comes to me and says,
"Well I don't know what's going on here."
That's different than when a student
having tried three things can come to me
and say, "I don't know what's going on
"and I've tried this this and this."
That helps narrow down what could be the problem on my end
so I'm not spending as much time with individual students,
and instead can mitigate each instance
of a problem in a shorter amount of time.
A lot of students, to this day,
and I still don't understand this one,
think that saving in ArcMap is not necessary,
that it just does it for you.
Ctrl+S, I see somebody put that in there;
it's just second nature to my hand,
which is, if not on Ctrl+C or V, on Ctrl+S all the time.
And every so often, without even thinking about it,
I'm hitting Ctrl+S and saving.
And if you have one faculty member and many, many students
troubleshooting becomes a nightmare.
And this is why that rule of trying three things
before you talk to the faculty member
really saves a lot of time and takes the burden
of troubleshooting off of the instructor
and puts it more so on the individuals.
So if you're going to offer this type of module online,
you can use Zoom or similar software
to allow screen sharing, and this makes troubleshooting
very simple; I've been doing it with a lot of my students
who are going through GETSI units right now.
I know asynchronous lectures are what we prefer
but synchronous lectures are helpful.
And I have synchronous lectures that I'd say at least
70% of my students attend.
And then I upload the recording of that lecture
at the end for those who couldn't make it.
But in that synchronous lecture you can troubleshoot
a lot of problems with understanding,
when students get the opportunity to ask questions.
I would move a lot of the discussions
to an online environment prior to a synchronous lab overview
for the hands on components,
and this will help kind of facilitate those connections
between society, politics, cognitive science, math, etc.
I encourage students to help one another.
So I team them up in groups of four
and those are their troubleshooting buddies
and if they can't figure it out the four of them
bring it to me.
And if all else fails,
you can use the pre-made data for each unit
and focus more on analysis and interpretation
rather than data processing.
I guess, Beth I will stop sharing and let you take over.
Beth you're muted.
If you meant to be.
Okay.
Sorry, my apologies.
I know we're nearly out of time
but I just wanted to give people
a visual on the GETSI webpage and mention
that in addition to the majors level module
which you've heard so much about
there is an intro level surface process hazards module;
the majors-level one has this type of a look to it.
And for instance if you went into unit three,
you would see that there's detailed descriptions
with a lot of the different assignments
and also the data sets and teaching tips
and things like that.
I know we're pretty much out of time for questions,
and people should feel free to leave if they would like.
But if anybody wants to ask some questions
about teaching the module, or has some suggestions
and comments, we can hang in here
with the leaders for a few more minutes.
Feel free to pose those questions
in the chat box, and I do encourage you
to take your own notes about your impressions
and things you might use.
And maybe while people are thinking
about whether they have any questions, I'll just go on to
the final page here and say we would really
super appreciate it if you filled out the webinar evaluation,
and to mention that next week on Wednesday,
the next NAGT webinar will be
Suddenly Teaching Geoscience Online
with a panel of people who have experience with it,
and a month from now we'll be having another GETSI module
on flood hazard.
So thank you so much.
And if people do have lingering questions
well looks like there's a couple.
We'll stay on and try to answer them.
So, Bobby, while people are having to run,
let's see: "What type of independent research projects
can students do after learning these modules?"
So it's actually really interesting.
I've had students, like I said,
look at hot spring distribution.
I've had students look at AMD runoff.
They've tried to use the bivariate method,
or the frequency ratio method,
to look at the spatial distribution
of a lot of different types of data.
So I think that really helps for them
putting it together and it helps them in kind of
thinking about prediction.
We've had a couple biology students
use this to look at the density of tree species in a region
versus different environmental factors,
that were classified and processed
using the frequency ratio method as well.
So there's a lot of different ways
that they can approach this
but I think more importantly beyond the math,
it's the computational skills
of having more familiarity,
having worked more in depth with ArcMap and CloudCompare
and seeing multiple different software,
and they use Excel in this as well.
So it just helps them familiarize and use those as tools
to solving scientific and engineering related problems.
A question.
What's your opinion about weight of evidence?
Could I get a clarification on that one?
I'm not sure what you mean by the weight of evidence.
If you want to unmute and ask verbally that's fine.
We're not hearing anything.
The bivariate method to predict landslides.
I'm neither here nor there on it.
I specifically don't do weighting
for this particular approach, but you could.
We've had students in the past attempt
to weight their data so that the maximum
absolute LSI value for any factor
was used to normalize everything
from negative one to one.
And they got very different results
that were perhaps not as accurate.
Though I'm sure that in certain cases
it might be more accurate.
But for the purposes of this module
we use just the singular method
and push forward from there
but students are free to adapt it,
the methodology if they can make
some sort of sense of it on their end.
Okay, that looks like it.
Thank you so much for everybody for your attendance.
And thank you so much Bobby and Mitchell
for helping to run the webinar.
