>> Eric Winsberg from the University of South Florida.
I met him at a conference on validation, which was actually fascinating, a few years back.
There's a handbook coming out with contributions from the people who attended, and in the review process I ended up talking to him about his book, "Science in the Age of Computer Simulation", which is absolutely fascinating.
I can't resist plugging it.
And I learned that-- what is it, the new one--
>> It's a new book too by the way.
>> Oh, it's a new book too, OK.
>> Which just came out a couple of months ago.
>> "Philosophy and Climate Science".
>> [Inaudible].
>> And so, Eric did his undergrad at the University of Chicago, a PhD out of Indiana, and then a postdoc at Northwestern before South Florida, and today he's going to tell us about "Tuning Models for Skill: When Is Prediction Better than Accommodation?"
So thank you.
>> All right.
Well, thank you very much for inviting me out, and thanks to everybody for taking time out of your day to come listen to a philosopher.
I'm sure that's not something you spend that much of your time doing.
OK. So I want to talk about sort of two things here: simulation models that need to be-- I'll just stay close to this, I guess, stay tuned-- that need to be tuned.
And then I want to talk about this question, which seemed to resonate with people I've talked to this morning in various ways.
When is it better to be able to predict novel data than just to be able to accommodate data that you already have?
I think we all have the intuition that that's a better thing, but why, and when, and exactly how, and how does it relate to tuning and model skill, OK.
So what do I mean-- what happened here, I thought I moved here, OK.
So what do I mean by computer simulation skill, first of all?
So, I take it-- this is a term that climate scientists use a lot when they talk about their simulations.
They're very interested in the question of when their models have skill.
And I think what they mean by skill is something a little bit different than just fit between the model and the world, or accuracy or correctness or truth or anything like that.
So it's a kind of purpose-dependent notion, skill, right.
So it's the adequacy of a model for a particular purpose, but one where prediction plays a significant role in achieving that purpose, right?
So I think skill is not the same, as I said, as fit or veracity or truthlikeness or any notion like that.
And in particular, right, sometimes I think skill can be improved by reducing some aspects of fit, right?
Sometimes you can make a model more skillful in some way or other even by reducing its truthlikeness in some other respects.
So that's at least one important way in which skill is different than just simple fit, OK.
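To make that concrete: one standard way forecasters quantify skill-- a textbook convention, not something from the slides-- is a skill score that compares the model's error to that of a cheap reference forecast, such as climatology:

$$ \mathrm{SS} = 1 - \frac{\mathrm{MSE}_{\mathrm{model}}}{\mathrm{MSE}_{\mathrm{reference}}} $$

A positive score means the model beats the reference at that particular task, which is exactly the purpose-relative flavor of skill at issue here.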
I want to talk about two different kinds of examples today just to kind of drive my discussion.
I'm going to be talking a little about climate modeling.
And that's what I've been thinking about most, as you saw from my most recent book.
And I also have been talking with and collaborating a little bit with people who do cosmological simulations, particularly ones interested in testing various dark matter models.
So trying to figure out what the various models predict about galactic behavior and larger-scale phenomena beyond galactic behavior, OK.
So-- and then I think today's main question is going to be: when is novel prediction more important than accommodation in establishing model skill, and what are some of the sort of nuances of that, OK.
So I said I'd talk about two examples-- so this is a-- just to show a couple of quick videos here, because it's a computational center, so we should look at some videos.
So this is just an output of
the GFDL climate model, right.
Where did the mouse go?
There we go.
So they're just evolving forward the atmosphere,
you know, using some combination of physics
and other modeling techniques
to kind of drive that behavior.
They're doing experiments to try to
figure out what various forcings will do
to the climate system after some period of time.
So you want to know things like: if we add heat-trapping gases to the atmosphere, what's that going to do to the long-term behavior of the climate, both on global scales and regional scales, in the oceans, in the atmosphere, et cetera, right?
So there the notion of skill is: you want to know, right, is your model skillful at making those various kinds of forecasts for you?
Is it skillful at telling you whether global mean surface temperature is going to go up by some certain amount if we add a certain amount of carbon to the atmosphere?
Is drought going to be more prevalent in some places, or precipitation more prevalent?
So skill, right, is a kind of very pragmatic feature that we attribute to the model.
It's not exactly the same as the model being correct.
And then there's-- the other kind of example was-- here we go.
The other kind of example is quite, sort of, different in a certain way, right?
This is people simulating the clumping of dark matter from the initial conditions that you kind of get from the cosmic microwave background, right?
And they want to know, for different models of dark matter, let's say, what do they predict about what the observable structure of the universe will be.
Do they predict that we'll see arrangements of clusters and galaxies in the way that we do?
Do they predict rotation curves for galaxies of the kind that we see, et cetera, et cetera?
So in both cases, right, the goal I think is rather pragmatic.
It's different in each case, right?
In the climate case, we kind of know the physics, right.
Of course, we can't calculate perfectly with the physics, but we know the physics, and we want to predict, or as the IPCC calls it, project, what that system is going to do under a forcing that we've never experienced before, right, unfortunately.
And in the cosmological case, right, it's kind of the opposite.
We're uncertain about what the correct model is, right?
The popular view is something like Lambda cold dark matter, but there are rivals.
And we want to see whether those various models reproduce known observables.
But both of those depend on being able to ascertain that our simulation models have skill, namely that they're good at the pragmatic tasks to which we want to put them: in the one case the pragmatic task of making a forecast about how the climate is going to behave, in the other case the pragmatic goal of figuring out what observable characteristics of the cosmos we would get under various different possible dark matter models, OK.
So these, I take it, are features that both of these models have, and many of you, I think, run computational models.
I hope most of you recognize these as features that your own models have.
They make substantial approximations, right?
We would love to have good mathematical arguments that the idealizations and approximations, and indeed the discretizations, that we're making are innocent.
But usually, I take it, in most of the disciplines of the people I've talked to working in these domains, we don't have rigorous mathematical arguments, right, that our approximations are innocent.
And we often know that important dynamics-- certainly in these cases, but I think probably in many of the cases of the people I've talked to this morning, for example-- we know that important dynamics are obscured by the relatively large grid sizes that we're using in our discretizations.
And so often it's also a feature of many of these models that we use what climate scientists call parameterizations, what other people sometimes call subgrid physics, right.
So in a climate model-- I can't really walk away from the mic here, sorry.
In the climate model, right, we have pretty large grid cells, somewhere between 10 kilometers and a hundred kilometers horizontally, and then maybe 20 or 30 layers deep vertically, right.
And we know that lots of interesting things are going on inside that grid cell.
Some of these are being depicted here, but in particular cloud formation is obviously something the physics of which happens at scales much smaller than a grid cell; it's probably in fact millimeter- or even micrometer-scale physics there.
So what's going on inside those grid cells is being parameterized, right; it's not real physics that's driving that aspect of the simulation.
It's a kind of replacement physics that's happening at a different scale than the one that we're capable of resolving.
And then the idea of parameterization is closely related to this other thing that I want to talk about, which is tuning.
So tuning is definitely an important and sort of unfortunate feature, right, of climate modeling.
So tuning, I take it, is this process of adjusting some of those parameter values, kind of late in the game, in order to get the model as a whole to behave in an acceptable way.
So if you're at all familiar with climate models, you might know that they all have this feature that they kind of lose energy at the top of the atmosphere.
It's a kind of hard-to-fix feature of the basic mathematical approximations.
Again, they lose energy at the top of the atmosphere.
And kind of one of the last steps of making a climate model is, once you've put everything together the way you want it to be, you then compare what the model is telling you about the level of radiation at the top of the atmosphere.
You compare it to the real world, and then you adjust certain key parameter values in the model in order to get that to balance more accurately than it was, OK.
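Here is a minimal sketch of what that last step looks like in spirit-- a toy illustration only; the model function, the parameter name, and all the numbers are hypothetical, not anything from a real climate code:

```python
# Toy sketch of end-stage tuning: nudge one "cloud" parameter until the
# model's top-of-atmosphere (TOA) radiation budget balances observations.
# Everything here -- the model, the parameter, the numbers -- is made up.

def toa_imbalance(cloud_param: float) -> float:
    """Stand-in for a full climate model run: returns the simulated TOA
    radiative imbalance (W/m^2) for a given cloud-parameterization value.
    A hard-to-fix "energy leak" is baked in as a constant offset."""
    energy_leak = -1.5                         # W/m^2 lost at top of atmosphere
    cloud_effect = 2.0 * (cloud_param - 0.3)   # response to the tuned knob;
    return energy_leak + cloud_effect          # 0.3 is the "physically ideal" value

# Bisection: find the parameter value that zeroes the imbalance, even though
# that value sits well away from the physically ideal one (compensation).
lo, hi = 0.0, 2.0
for _ in range(50):
    mid = 0.5 * (lo + hi)
    if toa_imbalance(mid) < 0.0:
        lo = mid
    else:
        hi = mid

print(f"tuned cloud_param = {0.5 * (lo + hi):.3f}")  # ~1.05, not the 'ideal' 0.3
```

The tuned value closes the energy budget precisely by moving the parameter away from its notionally correct value, which is the balance-of-approximations point being made here.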
So you are adjusting parameter values which are, in a way, kind of fake features of the model-- parameters, after all, as we were talking about this morning; these parameters are not real physical values, right?
These are kind of artifacts of the model.
But even with respect to that, in a climate model you're often adjusting them away from what would otherwise have been their ideal values in order to compensate for hard-to-fix errors that are elsewhere in the code.
OK. So climate scientists often talk about this as a balance of approximations, right.
You know that the basic computational scheme you're using has this unfortunate side effect.
It causes this radiation leak-- great figure of speech, right-- causes this radiation leak at the top of the atmosphere; you compensate for that by adjusting those parameter values, where the parameters were in the first place kind of made-up values that were there to compensate for other errors, OK.
So there's a lot of balancing of approximations there, and skill then, right, skill in a situation like that is clearly quite disconnected from verisimilitude or truthlikeness or fit or any notion of that kind involving a literal comparison of the model and the world, because here is a nice, clear case where you are adjusting those parameter values away from what you might think, in some sense, their ideal values are, right?
So if I asked somebody who was an expert on clouds, you know, in this parameterization of cloud formation in that grid box, is this value of the parameter about right?
They'd say no; it's probably much lower than that.
But you've raised it higher than that in order to balance out some other features.
And by balance out, what we mean is, right: get output that meets my pragmatic goals, where my pragmatic goals might be something like making these forecasts about what global mean surface temperature is going to be like at the end of the century if we don't get our act together.
OK. So it's a last step, and it's kind of done to increase the model's skill in some sense.
I don't like you [inaudible]
that doesn't do a thing.
OK. So here are some-- I'm just going to read you a couple of quotations about model tuning, and these are all going to be from climate scientists.
So this is right out of the IPCC report.
This is-- I don't know how many of you know all this IPCC jargon, right.
This is the fifth assessment report.
This one came out a few years ago; working group one is the basic science group, and this is box 9.1, I think, from the summary for policymakers.
So they say: model tuning directly influences the evaluation of climate models, as the quantities that are tuned cannot be used in model evaluation.
Quantities closely related to those tuned will provide only weak tests of model performance.
Nevertheless, by focusing on those quantities not generally involved in model tuning, while discounting metrics clearly related to it, it is possible to gain insight into model performance.
OK. Short version, right?
If you've used a data set to tune your model, you can't use that data set to test your model.
Pretty obvious, basic claim you might think, right, good.
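In statistical terms this is just the familiar train/test discipline. A minimal sketch, with synthetic data standing in for anything climate-related:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 40)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=x.size)  # synthetic "observations"

# Disjoint data sets: one for tuning (fitting free parameters), one for testing.
tune_idx, test_idx = np.arange(0, 30), np.arange(30, 40)

# "Tune" the model: fit its free parameters to the tuning set only.
slope, intercept = np.polyfit(x[tune_idx], y[tune_idx], deg=1)

def rmse(idx: np.ndarray) -> float:
    pred = slope * x[idx] + intercept
    return float(np.sqrt(np.mean((y[idx] - pred) ** 2)))

print(f"error on tuning data:   {rmse(tune_idx):.2f}")  # weak test: we fit to this
print(f"error on held-out data: {rmse(test_idx):.2f}")  # the real test of skill
```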
Here is-- this is one of the most revealing-- you don't find many good papers about model tuning, because it's of course not a feature of climate models that climate scientists are especially proud of, but there's a nice paper from 2012 by Mauritsen and others.
And they talk about this.
So they say: including quantities in model evaluation that were targeted by tuning is of little value; evaluating models based on their ability to represent the top-of-the-atmosphere radiation balance usually reflects how closely the models were tuned to that particular target, rather than any of their intrinsic qualities, right.
So if you've tuned the model-- if you've tuned the model to get the top-of-the-atmosphere balance right-- don't herald the accuracy of the top-of-the-atmosphere radiation balance as, you know, a sign that you've got a great model. Pretty obvious.
I think I'm going to skip to this.
So this was a couple of philosophers-- in some sense colleagues of mine-- and they said something that a lot of people found counterintuitive.
They use the word calibration instead of tuning.
It's the same thing they're talking about, right.
Calibration, they say, is simply an instance of the common practice of testing hypotheses against evidence; double counting, right, that kind of using the same data to test as you used to calibrate, is perfectly proper.
And in particular, they say, right, by Bayesian standards-- and I'll say a little about that in a bit-- double counting simply amounts to using evidence in the regular and proper way.
So they were taking this, what you might think is a rather counterintuitive, point of view.
Gavin Schmidt, whom those of you who follow climate science stuff at all might have heard of-- he's kind of, I think, the lead climate guy at NASA now-- wrote a response, actually in a philosophy journal, so that was kind of fun.
And he said: no, you know, Steele and Werndl, that's nuts.
It is incumbent on those who develop models to know what they have and have not been tuned to, in order to avoid inappropriate conclusions from successful tests, right?
The literature has been murky on this historically, but we need to be more honest and clear about this.
We need to be perfectly honest that when we've used a data set to tune our model, we cannot herald the agreement between our model and that data set as evidence of anything. Good, OK.
So why did Werndl and Steele-- what was their argument?
I'm going to sort of look at their argument and see if it tells us anything interesting about what's going on in the difference between prediction and accommodation.
So they're employing a kind of-- most of you, to the extent that you know about Bayesianism, might think of it as a statistical method; we think of it in philosophy as a kind of model of scientific inference, because it's a kind of clear account of what goes on when you evaluate hypotheses in the light of evidence, right, and it tells you this.
It tells you that the final probability of some hypothesis S is just equal to the probability of the evidence given that hypothesis, multiplied by your prior probability of the hypothesis, divided by the probability of the evidence, right.
So the P(S) on the right is the prior probability of the hypothesis; the P(S) on the left is the posterior probability of the hypothesis, or the final probability of it.
And this just follows from the axioms of probability, right?
Most people are familiar with Bayes' rule, OK.
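In symbols, with S the hypothesis and E the evidence, the rule being described is:

$$ P(S \mid E) = \frac{P(E \mid S)\, P(S)}{P(E)} $$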
So they say: OK, but look, Bayes' rule doesn't seem to mark any difference between evidence that's collected before model evaluation and after model evaluation; evidence is evidence.
In fact, they run through the calculation.
They use-- you know, here you have S1 as some model that's being evaluated with prior evidence, S2 being evaluated with posterior evidence, and the probabilities come out the same.
And they say: uh-huh, look, Bayes' rule shows, right, that double counting is perfectly proper.
OK. What did they do wrong?
What did they do wrong?
What they did wrong is they forgot an important element of Bayes' rule, which is your background knowledge, right?
You should always include your background knowledge in the conditionals of all these probabilities.
So you're interested in the final probability of the hypothesis given all the other things that you knew-- given that you believe in laws, and that you believe that there's no free lunch, and that you believe all the other things that you happen to believe in your life, right.
So it's important-- it's often suppressed because it's kind of taken to be obvious, but it's important to remember that background knowledge is part of the conditional of all of these conditional probabilities.
And once you remember that, right-- so that letter B there is our background knowledge-- once you remember that, in fact Bayesianism kind of presents the opposite conclusion to the one that Werndl and Steele were suggesting, and it's kind of a puzzle, right, in a way.
Why is that?
So this was when philosophers of science were first kind of thinking about Bayesianism as a model of scientific inference; in the 1980s Clark Glymour wrote this paper, "Why I am not a Bayesian".
And his argument that you shouldn't be a Bayesian is that Bayesianism has what he called the problem of old evidence, OK.
So those of you who are astrophysicists, I'm sure you know about this, right.
Newtonian gravitation got the precession of the perihelion of Mercury wrong; that was a well-known problem in the late 19th and early 20th century.
When Einstein developed the general theory of relativity, it did a much better job of predicting the precession of the perihelion of Mercury.
That was considered to be a very strong consideration in favor of accepting the general theory of relativity, second only to the Eddington starlight-bending observations.
But that's despite the fact-- despite the fact-- that Einstein was well aware of the inaccuracies in the precession of the perihelion of Mercury according to Newtonian gravitation, right.
And so if we go back and we look at Bayes' rule, right, anything that's old evidence-- so I'm going to walk away from the mic here.
Anything that's old evidence is then part of your background knowledge.
If it's old evidence, it's part of your background knowledge.
And if it's part of your background knowledge, then the probability of E given some set of knowledge that includes E is going to be one.
So that's going to be one, and that's going to be one, and then your final probability of the hypothesis given your background knowledge is going to be equal to your initial probability, right?
You can't get any boost, according to Bayes' rule, in your degree of belief in the hypothesis from old evidence, because it's part of your background knowledge.
And this, Glymour said, is why I'm not a Bayesian, right; Bayesians have this problem of old evidence.
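In symbols: if the evidence E is already included in the background knowledge B, conditioning on B makes E certain, and the update does nothing:

$$ P(E \mid B) = P(E \mid S \wedge B) = 1 \quad\Longrightarrow\quad P(S \mid E \wedge B) = \frac{P(E \mid S \wedge B)\, P(S \mid B)}{P(E \mid B)} = P(S \mid B) $$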
Most people thought: oh, we can give a response to that.
And you can give a response to that.
And there are various responses; one popular one is to distinguish between knowledge that's genuinely novel, right-- literally new things that you find out after you make your model-- and things that you may have known before but didn't use in producing your model.
So some Bayesians came up with this idea: we need to distinguish between use-novel information and genuinely novel information.
And then if something is use-novel, we don't include it in the background knowledge.
So, I think this is my slide here, right, my revised Bayes' rule.
It says: I don't actually calculate these probabilities with the real B, which is the full set of background knowledge; I use B prime instead, where B prime is everything that I knew except for the things that were use-novel.
So, for example, in Einstein's case we leave the precession of Mercury out of B, because we happen to know, or at least think, that Einstein did not use that information in constructing his theory, OK?
So use-novel information is allowed to be left out of the background knowledge, and now Bayes' rule works again, and we have a response to Glymour.
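So the revised rule conditions on the trimmed background B', with the use-novel evidence removed; now P(E | B') can be less than one, so E can raise the probability of the hypothesis again:

$$ P(S \mid E \wedge B') = \frac{P(E \mid S \wedge B')\, P(S \mid B')}{P(E \mid B')}, \qquad B' = B \text{ minus the use-novel evidence} $$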
OK. There are other responses; it doesn't really matter whether you accept this one.
Some other people think it should be part of what you've learned-- so logical facts that you come to know might be new information; maybe Einstein didn't know that the perihelion precession followed from his theory, so that's some new thing that you learned.
There are various proposals for how to respond to Glymour's worry about old evidence.
But the main point is: to the extent that Bayesianism has a problem, it's a problem with explaining why old evidence counts at all, not with explaining why double counting is bad.
So I think Werndl and Steele got that exactly backwards.
OK. So the B prime here is your background knowledge, not including the use-novel evidence.
OK. So this is my main point here.
My main point is, contrary to what Werndl and Steele said, Bayesianism does not have a direct lesson for us about how good old evidence is versus new evidence; your attitude to that has to be put in by hand into your philosophical framework, right?
Bayesianism is mute about whether old evidence is better than new evidence.
In fact, if it says anything, it says the relatively implausible thing that old evidence is always completely worthless, and that seems wrong, right, that seems wrong.
So it seems like something has to be put in by hand into Bayesianism to accommodate old evidence.
And it's not clear what we put in.
OK. So we're left with lots of options, right, and Bayesianism does not tell us which of these options is correct.
We're left with the possibility that old evidence is never allowed, right?
That seems kind of implausible, right?
In other words, that everything should be in my background knowledge.
You might think that old evidence is only allowed when it's use-novel-- so no old knowledge that is not use-novel ever counts as evidence of anything.
You might think old evidence is always allowed and it's just as good, right; that seems to be what Werndl and Steele are arguing.
That seems also implausible.
You might think old evidence is allowed but it's inferior to new evidence, right-- in other words, accommodation gives you some boost in your belief in your model, but it's inferior to prediction.
OK. So there are lots of options here, and the Bayesian framework itself is kind of mute about that.
That's the main point I want to make so far.
So there are different attitudes you can have to prediction and accommodation, and just saying that you're a Bayesian doesn't settle the question of--
I've got to put this down because I think it's going to move the slides, really, OK.
So let's call predictivism the philosophical view-- or the family of views, any view, right-- according to which new evidence is at least sometimes somewhat better than old evidence, right?
So if you don't have the view that old evidence and new evidence are always equally good, then you're a predictivist.
So everyone, I take it, other than Werndl and Steele is a predictivist; I assume most of you are predictivists of some variety.
Indeed, I think when I tell most people that there are these philosophers who have this non-predictivist view, they think that's nuts, right; surely sometimes using the data that you tuned your model with to evaluate the model is just bad methodology.
So we're all predictivists of one stripe or another, I think, right.
But there are sort of three questions that you can ask, three different things you can ask about predictivism.
A couple of these I've already mentioned, right.
Is old evidence worthless, or is it just weaker?
Is old evidence always weaker or always worthless, or just in some circumstances but not others?
And then the third question, which I haven't mentioned but want to talk about, is this: is new evidence intrinsically better than old evidence?
And that's a very, sort of, I think, philosopher's kind of carving-up of things.
So it might take me a minute to explain
what I mean by being intrinsically better.
So here's one reason to think that
it might not be intrinsically better.
And this is what some philosophers would call the paradox of predictivism.
And the paradox goes like this, right.
Intuitively, we tend to think that confirmation-- that's the word we use to mean, like, how you evaluate your hypothesis in light of evidence-- should be just a relation between theory and model on the one hand and evidence on the other.
The psychological or biographical properties of the person who made the model should be irrelevant to how well the evidence supports it, right.
And here's a little thought experiment that maybe is meant to prime your intuition for that, right.
Imagine that I came to you and I said I found some secret diaries of Einstein's.
I was just telling people on the walk over, actually, that we did just find some real diaries of Einstein's, and they were not very nice apparently; he had some disappointing things to say about his travels.
But anyway, suppose we found this imaginary diary, and in it we find out that Einstein, when he was constructing the general theory of relativity, did very closely appeal to the perihelion of Mercury data, right.
So now we find out that the perihelion of Mercury data is not use-novel in the way that we thought it was a minute ago.
Are we really going to change our minds about how much we believe in the theory of relativity as a result of discovering this peculiar psychological, biographical fact about Einstein?
That seems implausible.
So the idea that this might suggest, then, is that while we're all probably predictivists-- we all think that prediction is better than accommodation at least in some cases, at least somewhat better-- maybe it's not intrinsically better.
Now, what do I mean by that?
So-- OK. So it's the idea that it's not-- I'm going to skip this slide and go to this one to kind of explain the idea first.
And then I'm really going to-- well, I'll try not to walk away from the mic.
OK. So here's what you're meant to imagine.
You're meant to imagine, first of all, that you encounter a scientist, and she has collected these four bits of data, OK.
And on the basis of collecting those four bits of data, she says: here's my hypothesis.
And she just connects up those four pieces of data with a straight line.
And now she says: oh, I just found a fifth piece of data.
Do you want to know whether my model successfully predicts that fifth piece of data?
What do people think, right?
How much do you care now whether that fifth piece of data is properly predicted by the model?
Show of hands: a lot, a little, OK.
How about in the second case?
Now in the second case, the scientist has-- well, how many are there-- six bits of data here, right, six bits of data here, and now a seventh bit of data is going to come in.
Are you as eager to know whether this model fits the seventh piece of data-- more eager to know, or less eager to know, than in the first case?
How many people are more eager to know in the second case?
Is that all the hands I can get?
I was hoping to get a few more hands than that.
Why is-- why might somebody be more-- tell me why you might be more eager to see the seventh data point in that case.
How about in the back, yeah?
[Inaudible].
Can you say a little more?
I think we're on the same page but can you--
[ Inaudible ]
There you go, right?
There's-- I think maybe somebody else wants to say why they-- somebody else who put their hand up, why do you care more in the second case?
[ Inaudible ]
Yeah. So here's how I would put it.
Here's how I would put it.
The model on the left kind of wears all its virtues on its sleeve.
We can see exactly whether that's the right model.
It's perfectly obvious why it might be the right model: it's a linear model, it's y equals three x or something, right?
So it just looks, prima facie, pretty strong that this scientist has gotten the right model.
We're going to be already pretty convinced that this is a good fit, right?
And I take it you're not going to be-- you'd be pretty surprised if the fifth data point came along and didn't fit.
Over here, you're going to be really impressed if that seventh data point comes in right, because it looks like this person just played connect-the-dots, right?
It looks like this person just played connect-the-dots.
Here's one way to think about what's going on, then, right?
We care a lot about prediction.
We care a lot about prediction when there are features of the model that we can't really assess very well for ourselves.
There may be hidden features of the model, right?
This model here has no hidden features.
It's a linear model.
You know exactly how good that is.
You would know as much about how good that model is just by looking at it.
But over here, right-- this is, I think, what people in the audience were kind of getting at, right?
This model has so much structure but, of course, it's completely hidden structure.
You don't know anything about why this model, like, wiggles like that or whatever, right?
So now you're really eager to know whether it can predict.
And this is the idea I want to push.
It does predict that correctly.
You're going to take that as evidence that
whatever that complex structure that was
in there, it had features that were
good that you are unaware of, right?
There were good features in that complex
structure that you are unaware of and the fact
that it was able to predict is
a sign that had you been aware
of those features you would have endorse them.
So that's the idea-- that's the idea of this solution to the paradox of predictivism, which is due to [inaudible], right?
It's this idea that the ability to make novel predictions is not intrinsically epistemically good; it's a sign of some hidden positive qualities of a theory or a hypothesis or a model, OK?
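A toy version of the two cases, with synthetic data; the numbers are arbitrary, and only the contrast between a transparent model and an opaque one matters:

```python
import numpy as np

rng = np.random.default_rng(1)

# Case 1: the transparent model. Four points that really are linear, plus noise.
x1 = np.array([1.0, 2.0, 3.0, 4.0])
y1 = 3.0 * x1 + rng.normal(scale=0.1, size=4)
linear = np.polyfit(x1, y1, deg=1)       # two parameters; nothing hidden

# Case 2: the opaque model. Six points fit exactly by a degree-5 polynomial --
# "connect the dots", with structure we can't independently assess.
x2 = np.arange(6, dtype=float)
y2 = rng.normal(size=6)
wiggly = np.polyfit(x2, y2, deg=5)       # six parameters for six points

# A new, held-out point tells us little in case 1 (the model's virtues are
# already visible) but a lot in case 2 (success would signal hidden virtues).
print("linear model at x=5:", np.polyval(linear, 5.0))
print("wiggly model at x=6:", np.polyval(wiggly, 6.0))
```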
So the idea is: if you kind of have a great sort of God's-eye view of what's going on in the model and in the science, and you could kind of see every aspect of it, in that situation you kind of don't really care what's a prediction and what's an accommodation.
But it's when there are kind of hidden things, where you can't really get under the hood of the model and see what all of its good features might be, right?
So-- and that kind of-- I think that's supposed to support the intuition we have in the Einstein case: look, we know that general relativity is, like, the only way we can come up with of giving a generally covariant rewriting of Newton's laws of gravitation.
So it's like, it's that or nothing.
And so when it predicts or accommodates or whatever, we think: OK, whatever, right?
Whatever virtues this theory has, we can see them.
And so whether it's predicting or accommodating is not really that interesting to us.
When we're evaluating models for
skill, these were all-- right?
These were the slides that we're going through.
OK. So this is the view that you might call tempered predictivism, right?
It's the idea that when prediction is better than accommodation, it's not so intrinsically.
It's not part of the real logic of the relation between evidence and theory.
It's because, when somebody can predict, we take that as a symptom that his or her model has some nice features that we may or may not be able to see for ourselves, OK?
In the case of the straight line there, it was like: well, that model is so simple, its virtues are obvious.
So the idea, then, is that prediction is only better than accommodation when, unlike in the example of the straight line, we're unable to assess every possible good-making feature of the theory or model, OK?
So, in a sense, my answer to the question, when is prediction better than accommodation, is this: prediction is better than accommodation when, and in proportion to how much, we are in this sort of epistemically opaque situation of not really being able to get under the hood of the model or theory and see what all its virtues are.
We're using prediction as a kind of proxy for what we think might really be going on, namely the good virtues that the theory or model might have.
OK. So, right, the tempered predictivist says: in the straight-line case, it was perfectly clear what you were up to.
You were proposing a linear model.
I presumably have whatever priors I have on the linear model being correct.
And so I just don't really care that much whether the fifth point that's going to fit that line gets predicted or accommodated.
But in the other case, I had no idea what you were up to, right?
And somebody this morning gave me a nice example of what this might be. If I then told you: oh, what's going on here, this is housing prices, right?
If I said, oh, this is housing prices, and what's going on is they're kind of gradually going up over time but they're seasonally varying, many of you might go: oh yeah, OK, I see now, yeah, that's a pretty plausible model of housing prices.
Now you're not going to be that surprised, and not necessarily that much more impressed, right, if I can predict that the next point is going to be there or there or there or whatever, right?
So the more you know about what's going on-- the more you can see inside the model-- the less impressed you are by the advantage of prediction over accommodation.
That's the claim anyway.
Think about whether you think
it's plausible or not.
OK. If it is, here's the moral of that story, if it's right, for complex computer simulations.
I think in complex computer simulations, particularly ones that are full of parameterizations and approximations, and ones that we're evaluating for skill, and ones that involve a number of balances of approximations, it's particularly difficult, I think, to evaluate whether those kinds of models have the good-making features that prediction is meant to be a sign of, OK.
Simulations have a number of features that are opaque in that regard.
And I've got some great examples of that, right?
I mean, I was talking to an astrophysicist who works with the FIRE code, if any of you know what that is.
And he said, OK, you know-- gosh, I don't remember the example better-- if I turn up the rate at which, you know, neutron stars explode or something like that, I expect this to happen.
And then the opposite happens [inaudible], right.
So in that context, you know, sort of detailed, complex models like that, where you don't really have a grip on what's going on, I think novel prediction is particularly important, right?
And that's, I think, particularly important, right, when you know that your model is succeeding as a result of a balance of approximations, because it's very hard to know, right, how robust your balance of approximations is.
Does it only work on the data set that you tuned it to?
Does it work on other data sets, right?
That's kind of epistemically [inaudible].
So prediction is especially important to you.
OK. Two lessons about this.
One for climate science, one for
cosmology and then I'll wrap up.
How am I doing Anton [assumed spelling]?
>> I think you're OK.
>> OK, good, perfect, OK.
So two lessons: one for climate science, one for cosmology.
Climate science-- because climate models, right, those models and their skill are a subject of, I think, significant public importance.
I think we should recognize the following thing.
The epistemic opacity of a climate model is not the same to a climate scientist as it is to us-- assuming there are no climate scientists in the room, right.
So if you look at these quotations, right, about how bad double counting is, I think the answer is: it depends who you are.
If you are the climate scientist who built the model, and who knows all of its inner workings, and is only uncertain about whether the balance of approximations that she's chosen is brittle or not, then prediction is everything and accommodation is worthless, right?
But if you are a lay observer, and this situation is quite opaque to you, right, the ability to accommodate might provide-- obviously not as much epistemic value to you as prediction, but it might have some, because you don't know, right, the extent to which, for example, the model has accurately captured the physics versus relied on parameterization, et cetera.
So the answer to the question, how much better is prediction than accommodation, not only depends on the situation but depends on who you are, I think.
And I think in the climate case, because it's such a politically important science, right, lay observers have more of a stake in whether the models have skill or not than in many other sciences, obviously.
And so that's worth remembering, I think.
Cosmology-- I just want to give one little example of a debate about prediction and accommodation.
So most of you probably know, right, the behavior of galaxies: if we look at galaxies that are observable to us, they spin too fast compared to the amount of mass that they seem to have, right?
They look like they should be flying apart, given how fast they're spinning and given the observable amount of mass.
And the standard view of what's
going on there, of course,
is that they're mostly made
up of dark matter, right?
So there's a lot of matter.
That matter is actually holding things together
more tightly than the visible matter is.
And that's what's balancing out
the centripetal acceleration.
There's a puzzling fact, right-- there's a puzzling fact if you think that's the right account of what's going on.
And it goes under the name of-- it's sometimes called the Baryonic Tully-Fisher relation and sometimes called the mass discrepancy acceleration relation.
And what is it?
Stacy McGaugh, who is a proponent of abandoning dark matter and instead modifying the laws of physics to accommodate this, is very fond of pointing out that there's a very, very nice simple curve that you can use to predict the rotation of a galaxy just from its observable mass.
Now, that's puzzling, right?
That's puzzling if you think that four fifths of
what's determining the gravitational attraction
in there is the unseen matter, right?
It's puzzling.
And this was a tweet that he-- I think he tweeted this, yeah, he tweeted this at me a couple of weeks ago.
I think one prime philosophical issue is that of prediction versus accommodation.
I can accommodate pretty much anything with invisible mass, right.
Give me any rotation curve you like; just put enough mass in there to make it balance out and you're good.
What I can't do is use it to predict a priori the things that his theory, modified Newtonian dynamics, MOND, gets right a priori.
But I can always come up with some story after the fact.
So he's very keen, as a defender of modifying the physics as opposed to allowing the existence of dark matter-- he's very keen on pointing out that he can predict these things.
And a couple of other astrophysicists asked me once; one said: you know, Eric, Stacy keeps saying that, like, every time he finds a new galaxy, that's a new prediction and that's like a new piece of data against us.
Is that fair?
Is it fair to think that every new prediction for a new galaxy is new evidence for MOND?
OK, so I think my story here actually has a nice answer to that, right?
And the answer is: it's certainly true that it's a virtue of MOND over the dark matter theory-- Lambda cold dark matter is the full name of the theory, right-- that it can predict the rotation curves with much greater ease.
But how should we understand this, right?
I think we should understand it in the following way, right: this curve is a nice-- there's a nice simple power law here.
And the people who propose MOND have a nice simple explanation of that power law, right.
Their explanation is: that's built into the force law.
That's exactly what it is-- instead of, you know, the force being proportional to one over r squared, it's some other formula, and the curve just reads right off of that.
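For reference-- these are the standard textbook formulations, not anything from the slides-- in the deep-MOND regime, where accelerations fall far below a characteristic scale $a_0$, the effective acceleration is $a = \sqrt{g_N a_0}$ with $g_N = G M_b / r^2$; setting that equal to the centripetal acceleration gives a flat rotation curve and the power law directly:

$$ \frac{v^2}{r} = \frac{\sqrt{G M_b a_0}}{r} \quad\Longrightarrow\quad v_{\mathrm{flat}}^4 = G M_b a_0 $$

So the observable baryonic mass alone fixes the asymptotic rotation velocity, which is the simple curve being pointed to.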
And I think, then, prediction, accommodation, whatever-- it's sort of irrelevant at this point.
This model wears all of its properties on its sleeve.
Whatever your prior in this being right is what it is.
And you should no longer be particularly excited about new predictions, right.
It is what it is.
That's a feature of the universe, that it has this curve.
Now, there are, of course, two possibilities.
This curve exists because that's the law of physics, or because it's some really complicated emergent process, right, of how dark matter emerges out of the structure that we learned about from the cosmic microwave background.
That's all about how the dark matter clumps.
Once it clumps in the way it does, the visible matter, you know, interacts with that, and somehow, out of all of this, emerges this law, OK.
And the-- right, so in Lambda cold dark matter,
that Baryonic Tully-Fisher relation is going
to be a very highly emergent phenomenon.
And it's going to be hard to
understand why we get that.
Now, there have been a few papers-- Stacy and the MONDers are critical of these-- and one paper has a nice title; it says the "Mass Discrepancy Acceleration Relation is a Natural Outcome of Galaxy Formation".
All right, so what have they done?
They do these big simulations of the kind I showed you at the beginning of the talk, where all the dark matter is distributed according to the cosmic microwave background.
Then it clumps together, and then once it gets down to about, I think, 10 megaparsecs, that's when the baryonic physics kicks in and all the feedbacks happen, and boom, out pops the Baryonic Tully-Fisher relation.
And now you want to know: OK, have you actually explained why we observe the Baryonic Tully-Fisher relation [inaudible]?
Now I really want-- in this case, I really want to know, right: what parameters did you tune, what data was accommodated here, what data was predicted?
And now I need to answer a really hard question, and I don't know how you answer this question.
And it's the question that I think was really nicely captured in Gavin Schmidt's quotation, namely: if it is some irreducible fact about these kinds of simulations that they're going to be tuned-- that they're going to be tuned to data sets-- how do you decide whether the features of the data set used to tune are sufficiently different from the ones that you think you're skillfully predicting, to be able to convince your audience, or whoever, that you have not double counted, right?
How much covariance is allowed between the data set that you tuned with and the one that you're using as your evidence?
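There's no settled recipe for this, but a crude first pass-- a toy sketch, with made-up summary statistics-- might be simply to measure how correlated the tuning targets are with the evaluation metrics:

```python
import numpy as np

# Hypothetical summary statistics measured across many simulated galaxies:
# one column is a quantity used as a tuning target, the other a quantity
# later offered as predictive evidence. All values here are synthetic.
rng = np.random.default_rng(2)
tuning_target = rng.normal(size=200)            # e.g. a stellar-mass statistic
shared = 0.8                                    # how entangled the two are
evidence_metric = (shared * tuning_target
                   + np.sqrt(1 - shared**2) * rng.normal(size=200))

r = np.corrcoef(tuning_target, evidence_metric)[0, 1]
print(f"correlation between tuning target and evidence metric: {r:.2f}")
# High correlation suggests the 'prediction' may be partly baked in by tuning;
# a low value is necessary (though hardly sufficient) for a genuine test.
```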
And I think, you know, if you were involved in this debate, and you wanted to know how many points this counts in favor of MOND and how many points it counts against Lambda cold dark matter-- presumably when this paper comes out, that's going to change things a little bit, right, in your epistemic assessment of the situation.
But in order to decide how much it changes, I think it's a really, really hard question to know, you know, how much tuning did they do, what data sets did they tune to, how different are those data sets-- in other words, how much of that Baryonic Tully-Fisher relation might have been in the data set that they were tuning to?
If it was already in the data set that they were tuning to, then thumbs down.
If it wasn't, maybe thumbs up.
But I think that's, in a way, one of the most difficult questions of computer model assessment: how different is my test data set from my tuning data set?
Thanks. I think that's, yeah.
Thanks.
[ Applause ]
>> I got absorbed and lost all track of time
but we do have a little time for some questions.
[ Inaudible ]
>> Yeah.
[ Inaudible ]
Yeah.
[ Inaudible ]
So here-- so I think I want to press my point a little bit.
So I think my point cuts a little bit against what you said.
My view is that insofar as you think that this just is the law, I kind of don't care.
I kind of don't care whether you predicted it or accommodated it, because I think this model, right, kind of wears all of its epistemic credentials on its sleeve.
It's a power law, right?
It's simple.
You either like that or you don't.
And I don't really care.
And I don't actually know which of these data points were collected before-- I think some of them were there first.
Nobody would have guessed MOND a priori; you had to have some of these data points.
So some of them came before and some of them came later.
And my view is: in a nice, simple model like this, I don't care that much about what was predicted and what was accommodated.
In the other case, right-- in the other case, where these guys put, you know, everything and the kitchen sink into the model, dark matter and seven kinds of feedback and 14 parameterizations, and they tuned it some, and then, poof, out came agreement with this data-- now I'm really much more interested in: did you accommodate, did you predict, and if you were predicting rather than accommodating, how different was your data set?
So I think-- in a way, my point was that there's a really big spectrum there.
And it's about the complexity of the model.
Yeah?
[ Inaudible ]
With-- if you allow yourself to sprinkle dark matter everywhere you want; otherwise it's terrible at getting the acceleration curves.
Yeah.
[ Inaudible ]
Yeah, right.
[ Inaudible ]
OK. Yeah. All right.
Sorry, we should go now.
[ Inaudible ]
Yeah.
[ Inaudible ]
So-- yeah.
[ Inaudible ]
So in a way, right, part of the role of the word skill in the talk is to signal, in particular, that I'm talking about models that aren't very curiosity-satisfying.
So if I show you-- it depends what you mean by curiosity, of course.
But if I show you a climate model-- [Inaudible].
Yeah. If I show you a climate model-- and many, many, many climate scientists lament this state of affairs; this is a very common thing you hear climate scientists say-- they'll say: you know, it's great that we've all been working on this IPCC business for 30 years.
This is an incredibly politically important project, but it's corrupted our science.
We've been building these incredibly large models with everything and the kitchen sink thrown into them.
We've been working exclusively on building models with the best possible degree of skill, right-- models that produce the best possible predictions-- but we're not really doing what we went into science to do.
We're not really learning what the real underlying processes in the climate are, and we need to get back to that.
We've kind of done everything we can do, you know, for the public.
People now either need to heed our warning or not.
But we really need to get back to trying to under--
But we really need to get back to trying under--
So in a way, part of what I'm talking about here are models where-- now, of course, it provides understanding to know, you know, is there cold dark matter or is there modified dynamics.
But what I think you don't get understanding from, for example, right-- if you look at this paper where they get the Baryonic Tully-Fisher relation to come out, I don't think anybody would claim that anybody understands, right?
If you're a dark matter proponent, and even if you believe these simulations, I don't think anybody would say they provide any deep understanding of where the mass discrepancy acceleration curve comes from.
It just pops out like kind of magic, if that makes any sense.
[ Inaudible ]
Yeah.
[ Inaudible ]
No. So what it corresponds to, right-- if we go back to that slide, the idea was that once we recognize that there's this problem of old evidence, right, we have to modify the background knowledge, right?
Because if we include everything in the background knowledge, then Bayes is going to say accommodation is always completely worthless.
And that seems wrong.
So Bayes doesn't answer the question, right?
That was Glymour's point-- why I'm not a Bayesian, right?
Bayes does not answer the question: when does accommodation help?
In order to decide that, we need to decide what you're going to leave out of B and put into B prime.
So remember, the Einstein case was that we're going to leave out of B-- even though Einstein knew about [inaudible], we're going to leave that out because it's use-novel.
So now that epistemic opacity, right, is about what things you're going to count as use-novel and how much you're going to penalize them.
So, if you're Gavin Schmidt and I just walked into your lab, not knowing that much about climate models, and you say: oh look, look at this beautiful-- look at this, look, I got, you know, tropical cyclones to form in the Gulf of Mexico or, you know, off Cape Verde, whatever, every-- you know, between June and September.
And now it's: OK, well, did you know about that before you built the model?
And now I need to decide, if you did know about it, am I going to pull it out of B or not?
And the answer might be: well, Gavin has to pull it out, but I might leave it in there, because I'm more ignorant; something that was not use-novel I might still think counts as evidentiary.
[ Inaudible ]
Yeah.
[ Inaudible ]
Correct. Or whether-- or maybe just as bad, something that was so covariant with it.
[Inaudible].
Exactly. So in the opaque case, I really care whether things were use-novel, or-- even if they were sort of superficially use-novel-- whether they were really covariant with something that wasn't use-novel.
Yeah.
[ Inaudible ]
OK, yeah.
[ Inaudible ]
Yeah. Or we're at least trying to predict, like, patterns in the future of the stock market, if not the actual exact value or something, yeah.
Yeah.
Yeah.
[ Inaudible ]
>> OK. Well I think on the last schedule
I saw you have the next half hour free.
>> OK.
>> You got to decompress but--
>> I'm also happy to chat with people so--
>> [Inaudible] conversation.
And then I'm going to have a chance to talk to you about whether you're up for dinner or not, so I will ask that question: if you're interested in dinner, tell them to tell me or something like that [inaudible] and we'll see if we can twist your arm or--
>> OK. Is there already a dinner
associated with the other--
>> There's a group from Livermore and
there's a reception but if you want
to do even more eating, we can go to dinner.
>> OK. My uncle and cousin and aunt live here, and I think maybe they kind of want me for dinner too or something.
>> Yeah, Eric has a place for dinner.
That was what I was wishing for, but you will be at the reception and--
>> Yes. And that's in like half an hour?
>> Yeah. I think it's a half-hour slot and then--
>> Yeah, OK, perfect.
>> We have a little time to chat--
