>> Male Presenter: So we're very thrilled to welcome Michael Nielsen today. He's a legend in computer science and one of the pioneers in quantum computing. In fact, he co-wrote the standard text on it, and that book was actually listed by Google Scholar as one of the top-ten most cited physics books of all time, which is a pretty weighty distinction. Right now Michael lives in Toronto, where he focuses his craft on writing. Today he's going to speak to us on his latest book, which is entitled "Reinventing Discovery: The New Era of Networked Science." This book delves further into themes that he first explored in his book "The Future of Science." We'll have some time for questions and answers at the end, and so, without further ado, please join me in welcoming Michael to Google. Thanks.
[Applause]
>> Michael: Thank you so much. Thank you all for being here today. I was actually at an event at the Library of Congress a couple of years ago, where I was on a panel as 'Michael Nielsen', and there was also a 'Mike Nelson' on the panel, which caused quite some confusion when audience members were directing questions to us. So my talk today is about open science, which sits in roughly the same relationship to science -- basic scientific research, mostly academic research, is what I'll be talking about -- as open source software does to the commercial software world. What I want to explore is the extent to which open source principles, or open source-style principles, can be applied to the practice of basic scientific research.

So I'm going to start off with an example where this has been done successfully. The example starts with this man, Timothy Gowers. Gowers is a mathematician -- actually one of the world's leading mathematicians. He is, amongst other things, a recipient of the Fields Medal, which is often compared to the Nobel Prize in mathematics. Gowers, in addition to being a Fields Medal-winning mathematician, is also a blogger. That's actually not that uncommon amongst leading mathematicians: of the 42 living Fields Medalists, four of them have started blogs. So that's about one in ten. I don't know how that compares to the general population, but it's pretty good.

Anyway, in January of 2009, Gowers wrote this very interesting post entitled "Is massively collaborative mathematics possible?" What he was proposing to do in this post was to use his blog as a medium to attack a difficult unsolved mathematical problem -- a problem which he said he would like to solve completely in the open, using his blog as a way of posting his partial progress and his ideas. And what's more, he issued an open invitation, inviting anybody in the world who thought they had an idea they'd like to contribute to post that idea in the comment section of the blog. He called this experiment the Polymath Project.

Well, the Polymath Project actually got off to quite a slow start. In the first seven or eight hours after he opened his blog up for comments, not a single person wrote in with any suggestions. But then a mathematician at the University of British Columbia named Jozsef Solymosi posted a suggestion -- basically a simplified variation of the problem, which he suggested might be a bit easier to attack. Fifteen minutes after that, a high school teacher from Arizona named Jason Dyer wrote a short suggestion. And just three minutes after that, Terence Tao -- also a Fields Medalist, and a mathematician at UCLA -- posted a suggestion. And so things were really off and running at this point. Over the next 37 days, 27 different people would post 800 substantive mathematical comments containing 170,000 words. That's a lot of mathematics done very quickly. It was hard, actually -- I was following along. I didn't contribute substantively, but I was following along quite closely, and it was difficult simply to find the time just to read all the contributions. It was really going remarkably quickly. You'd see people propose an idea in a very half-baked form, and then very often it would be rapidly developed, sometimes by other people. Sometimes it would be discarded, but other times it would be incorporated into the canon of knowledge.
Gowers described the process as being to normal research as driving is to pushing a car. And at the end of the 37 days, he used his blog to announce that the problem had most probably been solved -- in fact, they'd solved a generalization of the original problem they were attacking. They still had to go back and check that they hadn't made any silly mistakes; everything did indeed check out, and ultimately they wrote two papers based on it. It took two more months to do the cleanup work, but the back of the problem had, in fact, been broken at this point.

Now, of course, the reason I'm talking about the Polymath Project is not particularly because of the particular mathematical problem. It's not important because it solved that particular problem. It's important because of what it suggests: that we can use these sorts of tools as cognitive tools to speed up the solution not of simple everyday problems, but of problems which challenge some of the smartest people in the world. That's really exciting. These are problems right at the limit of human intellectual ability.
And not just one particular problem -- perhaps broadly, across many different fields. Now, of course, there are a tremendous number of similarities and differences to the open source software world. I'm going to talk not so much about the similarities; I'm going to talk a little bit about some of the differences today. In particular, I'm going to talk about some of the challenges, and the reasons why this kind of approach is not more broadly adopted within science at present. And my focus today is certainly going to be very much on basic science. That's the tradition I came out of -- I was an academic for 13 years -- so that's where my focus will be. I realize it's a little bit different from Google's focus, but come on this journey with me.

So I want to talk about a second example now, partially to illustrate this idea that a really broad acceleration is to be expected. I'm going to talk about an example from a completely different part of science, and I'll start again with a story involving a single person: this woman, Nita Umashankar. In 2003, when she was 23, she had just finished her undergraduate education at the University of Arizona in Tucson. She then went to work for a year for a not-for-profit organization helping young Indian women escape from prostitution. Depending on whose estimates you believe, there are anywhere between several hundred thousand and several million women involved in prostitution in India. But what she found was tremendously frustrating: a lot of these young women don't have the skills to hold down a job outside prostitution. So she went back to the United States at the end of the year, and after thinking about the problem, she decided to start a new organization called the ASSET India Foundation. It would address what she believed was the core problem by opening technology training centers in India, providing these young women with training in technology, and then helping them find placement with some of India's big technology companies.
Well, it's seven years later now, and they've opened training centers in five large Indian cities, and they claim that hundreds of young women have completed their programs and found placement. Which is great. What they'd like to do, of course, is to expand -- in particular, into some of India's smaller cities. And this poses many challenges, one of which is that some of the smaller cities don't have the reliable electricity they need to run their technology training centers. Obviously this causes all sorts of potential problems for them, and they've considered all sorts of ways around it. One of the things they focused on was wireless routers to access the Internet. So they experimented with available, commercial, solar-powered wireless routers, and they found that what was available was not suitable for their local needs.

And the way they addressed this problem -- have been addressing this problem -- is quite interesting. They went to a company on the other side of the world, just outside of Boston in Waltham, named Innocentive. Innocentive is actually an online marketplace for scientific problems. It's a bit like eBay or Craigslist, but instead of posting a description of your furniture or car or whatever, you can post a description of a scientific problem that you would like solved, together with a prize for the solution. A typical kind of organization that posts on Innocentive would be somebody like Eli Lilly, for example -- Innocentive is actually a spin-off of Eli Lilly -- but many other similar organizations make use of it. Probably their best-known challenge, a few years old now, was to find a biomarker for ALS. That had a $1 million prize, which is actually a bit unusual -- that's a very large prize. A more typical prize is $10,000, $30,000, $50,000 -- that kind of scale.

Anyway. What ASSET did was, together with the Rockefeller Foundation, they put up a $20,000 prize for an Innocentive challenge to design a low-cost, reliable, solar-powered wireless router that could be made with parts easily accessible to people in India.
So Innocentive broadcast this out to their network of solvers around the world. They claim they have a couple hundred thousand people in their network; who knows how many of those are actually active. The number who actually downloaded the challenge was 400. When I say downloaded the challenge, I don't mean looked at the abstract or whatever -- that's certainly a larger number. This is the detailed description, with all the IP provisions and a whole bunch of stuff like that. So this indicates some reasonably serious level of interest. Twenty-seven of those people submitted solutions.
And the winner was a software engineer from Texas named Zacary Brown. He had a few interesting hobbies and talents. One was that he was certainly -- as part of his day job -- an expert in Linux and other sorts of open source and low-cost software. But perhaps more relevant was that at home, in his spare time, as a hobby, he built homemade wireless radio networks, and he was working towards a goal of making contact with every country in the world. That was number one. The second interesting talent he had -- well, he told me in an e-mail, actually, that when he was growing up, he was watching television one day and saw solar panels being installed at the Carter White House. He asked his parents what they were, and he was enthralled -- that was his word -- to discover you could take sunlight and convert it into electricity. So as an adult, he was working on converting his entire home office, including his wireless radio networks, to solar power. So if there was one person in the world that you wanted to get working on this problem for ASSET, he would certainly be on your short list, at least I think.
And all Innocentive did was provide a way of making that connection. So in some sense, what's in common between the Polymath Project and this ASSET-Innocentive story is that the human race -- if you want to look really broadly -- already had all the expertise needed to solve a challenging and important problem. And what these two online tools did is provide a way of activating that latent expertise. It was all already there; the tools simply activated it. In other words, they connected the right expert to the right problem at the right time.
And that's very obviously true in this story I told you about Innocentive. It's also true in the Polymath Project. If you actually read through the archives of the Polymath Project, you see an interesting dynamic. Somebody says, "Oh, here's my half-baked suggestion, but I can't go any further." And somebody else says, "Oh, that makes me think of this." And somebody else says, "That makes me think of that." And so on. That's exactly the kind of thing that goes on in any good creative conversation. What's interesting is that it's being done at a much larger scale, right? And that's valuable, because it means you can bring in more expertise, from more people, from all over the world -- in fact, people who would have had a hard time getting together all in one room. In some sense, what was going on was a restructuring of expert attention, or a redirecting of it. Instead of Zacary Brown or the Polymath participants sitting at home doing whatever they would ordinarily do -- working on their wireless radio networks, say -- they were able to use their expertise in much higher-leverage ways.
So there's kind of an interesting broad general question here, which is: how can we design tools to allocate expert attention optimally? I'm not going to say very much about this, but there are a few interesting remarks. One is a technical problem, which is how to allocate attention where people have maximal comparative advantage, so that they're working on the right things. In addition to that very challenging technical problem, there's also an incentive problem: even if you can direct people's attention in that way, if you have the technical tools, there's the question of whether they're actually going to want to do it. So you want to align their individual incentives with this optimal allocation. And at some level, you can look at lots of different tools -- certainly Polymath and Innocentive, but also things like GitHub, particularly the issue tracker on GitHub, IRC channels for open source software projects, and many other things like this -- as all being partial attempts to solve this problem. You can view them all through that kind of lens, and I think it's interesting to do so. So I'm going to switch away from this topic; it's something I talk about more in my book, but I don't have time to talk about today. I do think it's an interesting general way of thinking: treating expert attention, and diverse types of expertise, as a scarce resource, as though you have a resource allocation problem. I'll actually just mention something I've often wondered: you're really talking about designing markets here, and I wonder whether there are connections to ideas which have been developed around things like AdWords over the last many years.
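To make this "resource allocation" framing concrete: if you could score how well each expert matches each problem, choosing who works on what to maximize the total match is the classic assignment problem. Here's a minimal, hypothetical sketch -- the scores are invented, and the brute-force search is only for illustration; a real matching system would use a polynomial-time method such as the Hungarian algorithm:

```python
from itertools import permutations

def best_assignment(scores):
    """Brute-force optimal one-to-one assignment of experts to problems.

    scores[i][j] = how well expert i matches problem j (hypothetical).
    Returns (best_total, mapping), where mapping[i] is the problem
    assigned to expert i. Only feasible for tiny inputs; real systems
    use polynomial-time algorithms like the Hungarian algorithm.
    """
    n = len(scores)
    best_total, best_map = float("-inf"), None
    for perm in permutations(range(n)):
        total = sum(scores[i][perm[i]] for i in range(n))
        if total > best_total:
            best_total, best_map = total, perm
    return best_total, best_map

# Hypothetical example: 3 experts x 3 problems.
scores = [
    [9, 2, 1],  # expert 0: strong on problem 0
    [8, 7, 3],  # expert 1: also good on problem 0, decent on 1
    [1, 4, 6],  # expert 2: strong on problem 2
]
total, mapping = best_assignment(scores)
print(total, mapping)  # -> 22 (0, 1, 2)
```

Note the comparative-advantage point in the toy data: expert 1 is good at problem 0 (score 8), but expert 0 is better (9), so the optimal assignment sends expert 1 to problem 1 instead, where the group as a whole gains more.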
I don't know the answer to that question, but I'd be interested in people's comments.

Okay, a slight change of topic. What I want to talk about now is why scientists have been so inhibited in adopting many of these tools.
And they certainly have been, if you look broadly across the sciences. So I'm going to tell the story of a few failures -- of ideas that in my opinion should have succeeded, certainly more than they have. The first one is a site called the Qwiki, started by a grad student at Caltech named John Stockton in 2005. It stands for Quantum Wiki. It was an attempt to start a research-level Wikipedia for the quantum computing community. His idea was to recruit people to write, essentially, a super-textbook for the field. It would be very rapidly updated with information about the latest discoveries in the field, discussions about the big open problems, people's speculation about how to attack those problems, and so on. Good idea.
I happened to be present at the workshop where it was announced, at Caltech in 2005, and it was very interesting to chat with people at that workshop. Some of them were extremely skeptical: "This is a total waste of time. Why would you ever want to do that?" But actually, a large fraction of people were very friendly to the idea and extremely excited: "You could have conversations. Oh, you could use it to do this. You could use it to do that. You could share experimental protocols." Et cetera, et cetera. You'd have one of these five-minute conversations and then say, "So what are you planning to do?" "Oh, no no no -- I don't have time to contribute. But, jeez, I hope someone else will take the time to implement these brilliant ideas."
And of course, if enough people think that way, it's inevitable that ultimately it will fail. I should say, by the way, that I've picked this example partially because Stockton did a lot of things very right. He seeded it with lots of very good material. He certainly tried to twist various people's arms so that they would contribute material. He did a reasonable job of marketing it, and so on. It did not fail just because he messed up.
And you can repeat this story across many, many similar science wikis. They often contain lots of great content, but it's almost always contributed by just one person, or maybe one person with one or two buddies. That's a very typical kind of scenario. You look at the Knot Atlas -- kind of similar. The String Theory Wiki -- a similar story. I should say, when I say they failed, I'm just talking about in terms of recruiting a large number of contributors to collaboratively build the thing. They're actually often big successes in terms of lots of people wanting to access these sites; the download statistics are often pretty impressive. Lots of people want to read the content. They just don't want to contribute anything.

Okay. A similar sort of idea -- I've just picked a couple at random -- is scientific social networks. This is kind of the Facebook-for-scientists idea, the notion being to connect scientists to other scientists with complementary interests, so they can share code, share data, share ideas, and so on. And there's been a lot of money put into developing these -- I don't know, maybe several dozen of them. I just randomly picked a few.
You create an account on such a site, you log in, and typically what you find -- at least what I found -- is that they're often virtual ghost towns. There's just nobody there, as far as I can tell. And again, it's not because it's a bad idea a priori. Often they've been well done, well executed, but people don't want to use them.

Okay. So let me just get a drink. What's going on here? As I said, those are just two of many examples. I'm going to move away from the slides for a second. What's going on is kind of obvious, but it bears spelling out. Particularly if you're a young scientist -- although there are variations on this that are true for all scientists -- you want to get a job doing what you love. And you know what that entails: it entails working 60, 70, 80 hours a week on the kinds of things that are going to get you such a job. And that does not entail making contributions to the Qwiki. If you have to trade off spending two or three hundred hours writing one or two really mediocre scientific papers that nobody's ever going to read, versus spending the same time making a long slew of brilliant contributions to the Qwiki or a scientific social network or whatever -- there's no comparison from the point of view of your tenure committee or the grant agency. Clearly you should spend your time on the mediocre scientific papers. So, you know, that's the straightforward career calculus. And it is certainly the case that even if you believe wholeheartedly in the power of something like the Qwiki -- if you believe it's the best invention since sliced bread -- it's still a very hard thing to ask someone to spend a whole lot of time on it. It's kind of tantamount to career suicide. So that's a really strong incentive for people to stay away from things like this.
What I'm talking about here are the reward structures. What types of contribution are going to be rewarded? Is it simply writing papers and getting grants, or will the scientific community value other types of contribution as well? So I'm going to talk a little bit about changing the culture, so that people start taking seriously the idea of using some of these tools -- so that they're incentivized and rewarded. But first I want to address a little side issue. You might ask: how does something like Innocentive or the Polymath Project fit into this story? Well, actually, it fits in just right. If you look at something like the Polymath Project, yes, it's certainly an unconventional means that's being used, but at the end of the day, they've written scientific papers. That's the output. So it's an unconventional means to a conventional end; in that sense it's a conservative project. Contrast that with something like the Qwiki, which is not building towards a paper in any sense -- you are simply contributing to it for its own sake. And with Innocentive, there's an even more basic motivation: you're looking for a cash prize. A really direct reward. Okay.
So I want to talk about two cases where in
fact the culture has changed in
really significant ways. So the first example
concerns the Human Genome Project.
So, actually, does anybody here have a background doing genetic sequencing? Are you familiar with the Bermuda Principles, that kind of thing? It's kind of ancient history at some level now. If you go back to the early 1990s, it's very clear -- certainly to people doing genome sequencing -- that the human genome is going to be sequenced sooner rather than later. It's just a matter of time at this point. But if you look at online databases like GenBank, where you can upload the data, there's a real question: why would a scientist actually choose to upload their data to such a database? At some level it's kind of like contributing to a site like the Qwiki. It's not something, at this point in time, that your tenure committee is necessarily going to care about. It's not a paper. You're simply helping your competitors. It was very clear to everybody involved that there was not a whole lot of incentive to do this data sharing. But nonetheless, it's clearly in the community's best collective interest -- certainly in humanity's collective interest. So there was a meeting in Bermuda in 1996, convened by the Wellcome Trust, and amongst the people present were the leaders of the Human Genome Project. Craig Venter, who would later lead the private effort to sequence the genome, was present. Representatives from the Wellcome Trust were there, and representatives from the US NIH were there. They sat and talked about this problem for several days. Basically, what people were willing to agree was that yes, they wanted people to share the data, but they weren't willing to unilaterally go first and do it themselves. So they wrote what are now called the Bermuda Principles, which basically stated two things. One was that if you sequenced more than a thousand base pairs of data, that data should be uploaded to GenBank or a similar site within 24 hours. And second, that the data would be put into the public domain.
And it wasn't just an empty agreement. What happened was that the representatives from the NIH and the Wellcome Trust went back to the grant agencies, and within 12 months the principles were baked into policy at those agencies -- which meant that if you wanted to get money to work on this, you needed to agree to abide by the Bermuda Principles. So it was a great way of producing a collective change in the behavior of that community. There's actually a lot more that's happened since then to ensure that human genetic data is broadly shared; there have been several follow-up policies from several major grant agencies. Another notable thing that happened was in April of 2000, when Clinton and Blair issued a joint statement essentially praising the Bermuda Principles. They don't actually name them, but they describe them broadly, urging every country in the world to adopt similar principles.

This is a nice story, but of course human genetic data is a tiny fraction of all human scientific knowledge. Even if you just look at other types of genetic data, the situation is much more patchy. A biologist of my acquaintance who's done a lot of open source work approached me after a talk, and his comment was that he'd been sitting on a genome for an entire species for more than a year. So that's a whole species of life whose genome is undergoing bit rot on his hard disk somewhere, purely because his collaborators haven't gotten around to running the appropriate analyses. And of course, there's lots more knowledge of that kind broadly across science. When I give talks to scientists, sometimes I will ask at the start, "How many people systematically share their data?" Not how many people will respond to an e-mail request to share scientific data or whatnot, but how many people systematically share it. And except in a couple of fields, invariably two or three people in the audience will raise their hands, or sometimes none.
So, a very small fraction. And more broadly, in most parts of science, not just with data but also with the sharing of code, the situation is all over the map, and it's not particularly good. And then there's a whole lot of other knowledge -- all sorts of tacit knowledge: people's half-baked ideas, questions, and observations that are the raw input, oftentimes the most important raw input, of scientific papers. These remain locked up inside people's heads and laboratories, and are not actually being shared broadly where they can be used by other people and built on in the manner we saw in the Polymath Project.

So, I told you I'd describe two stories of a cultural shift towards a more open scientific culture. The second story is much bigger than the Human Genome Project, but I have to go back all the way to the dawn of modern science. So I go back to 1609.
It's December. Galileo has built his first astronomical telescope. Seven months pass; for whatever reason, he doesn't look at Saturn during that time, as far as we know. But on the morning of July 25, 1610, he points his telescope at Saturn for the first time. What he's expecting to see is a little disk -- that's what he's seen when he's looked at other planets. But what he sees isn't just a little disk. It's a little disk with two noticeable bumps on either side. What he's seeing is, of course, the first hint of the rings of Saturn. His telescope isn't quite good enough to resolve the rings; that would have to wait for Christiaan Huygens some years later.
But he knows straight away this is a monumental discovery. It's kind of hard to imagine, at this point, how that could be a big discovery. But, of course, our knowledge of the heavens had barely changed since prehistoric times, so it really was a big thing to have learned. Does he announce this to the world? No.
What he does is write down a description of the discovery in his notes. Then he scrambles the letters into an anagram, and he mails that anagram off to four of his astronomer colleagues -- straight away, within a day -- Kepler among them.
And what he's ensuring is that if Kepler later makes the same discovery, Galileo can reveal the anagram and claim the credit. But in the meantime, he hasn't disclosed anything at all. And of course, it's not just that Galileo was a bad guy or whatnot -- this was common at the time. Leonardo did it. Galileo did it. Huygens did it. Newton did it. Robert Hooke revealed Hooke's Law in this way, which you may remember from high school.
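As an aside for the modern reader: Galileo's anagram is essentially what cryptographers now call a commitment scheme -- publish something today that reveals nothing, but proves priority later. A minimal sketch using a salted hash (the claim text is invented for illustration):

```python
import hashlib
import os

def commit(claim: str):
    """Publish the digest now; it reveals nothing about the claim."""
    salt = os.urandom(16)
    digest = hashlib.sha256(salt + claim.encode()).hexdigest()
    return digest, salt

def reveal(digest: str, salt: bytes, claim: str) -> bool:
    """Later, disclose (salt, claim); anyone can verify the commitment."""
    return hashlib.sha256(salt + claim.encode()).hexdigest() == digest

# Mail `digest` to your colleagues today; reveal the claim whenever you like.
digest, salt = commit("Saturn shows two bumps on either side")
assert reveal(digest, salt, "Saturn shows two bumps on either side")
assert not reveal(digest, salt, "Jupiter has four moons")
```

The salt plays the role the scrambling played for Galileo: without it, a rival could simply guess likely claims and check them against the published digest.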
There was no incentive at the time to share information, and so people kept it secret as long as they could -- or at least the incentives were much less clear than they are today. At least today it's in your best interest to publish a paper. So the standard story is that science solved this in 1665 with the invention of the scientific paper. But actually the story is much more complicated than that. In fact, the editors of the first scientific journals had tremendous difficulty convincing people to reveal information. It took many decades. Here's Marie Boas Hall, the biographer of Henry Oldenburg, who was the first editor of the Philosophical Transactions of the Royal Society. She describes how Oldenburg would "beg for information," sometimes writing simultaneously to two competing scientists on the grounds that it would be "best to tell A what B was doing and vice versa, in the hope of stimulating both men" -- it was men at the time -- "to more work and more openness." She gives these nice examples.
Two people would be working on essentially the same problem, and Oldenburg would bounce letters backwards and forwards, sort of insinuating that the other one was further ahead than they really were, in the hope that each person would then feel they had to say, "Oh, I can do that too." He would then publish distillations of these correspondences in the Philosophical Transactions. A funny kind of way to be doing science.
Another similar quote. This is Elizabeth Eisenstein, one of the scholars of the printing press, talking about the late 1600s -- so exactly the same time period. She writes: "Exploitation of the mass medium [books] was more common among pseudoscientists and quacks than among Latin-writing professional scientists, who often withheld their work from the press." Well, this is 220 years after Gutenberg, right? It's not next week or next year or even next decade. In fact, it's not even next century. It's 220 years -- the time gap that separates us from the French Revolution. And it really did take a major transition over many decades.

So this is an interesting question: what caused the transition? Well, the immediate thing you can say is that what caused the transition was somehow establishing a link between publication in journals and career success -- the link that academics today take completely for granted, as though it's somehow holy writ. It was a social process that went on there.
But this raises the question: what actually caused it? What underlay this transition? How was that link established? First of all, I should say that the transition certainly took decades -- arguably longer than a century. It's sometimes called the open science revolution. There's a very nice paper by Paul David at Stanford about how this happened. To boil his 120-page paper down to two words: patron pressure. To explain what I mean by that, let me give you an example of patron pressure more or less in action. This is my example, not David's, but I think it illustrates his point very well.
The rings of Saturn was not Galileo's
first big discovery. Arguably it was the four
Galilean moons of Jupiter which I
think was made in January of 1610. So what
happened? Well, you know, Galileo
sees these moons. And actually he published
that discovery pretty quickly. Not
at all similar to the story of Saturn, right?
He published a pamphlet. But
before he established the pamphlet, first
of all he contacted several
potentially rich patrons including the Medici
family and said, "I will name
these moons after you publicly if you agree
to become my patrons." So here's
Galileo's announcement. Notice anything about
what gets the most prominence? There
are two things. The title, "Sidereus Nuncius",
which apparently means "starry messenger".
And Galileo's own name, which is considerably
smaller than
the announcement of the Medicean moons. It's
too bad for the Medici that we now
call them the Galilean moons. They kind of
lost out there. But the point is that the
funders have different incentives. In particular,
the funders have much more
incentive for openness than the scientists
do. The Medici wanted the
kudos that went with having these moons named for
them. And David's contention -- and I think
he's basically correct -- is that in fact it
was this fact -- this pressure from
the patrons -- that ultimately caused this
transition to the more open system
that we do have today. Now it's interesting
to think back to my genome story --
the Bermuda Principles story. There's actually
a really strong parallel. The
genome is open in part not because of patron
pressure, but because of funder pressure,
which is essentially the same thing. It's not
just funder pressure -- the scientists involved
certainly worked
together. But the ultimate governance mechanism
was the fact that these funders
had a lot of money and the ability to withhold
it if people weren't willing to
share data for the greater good. So I would
say that certainly today the grant
agencies -- the NSF, the NIH, and other large grant
agencies -- have a lot of potential
to work towards much stronger open
data policies than they currently have.
It's not just human genome data they have
policies for; they have policies for a number of data types.
But there's a real
possibility to greatly broaden those policies
and also to encourage people to
share data much earlier in the discovery process.
In a similar kind of
fashion, of course, there's a lot of valuable
scientific information tied up in
code which is sitting on hard disks inside
laboratories, and grant agencies could
potentially work towards open code policies,
particularly for anything critical to
an analysis.
Another thing which grant agencies
could do today potentially is they could start
to legitimize tools simply by
encouraging scientists to submit nonstandard
evidence of impact. For example, writing a
job
description -- actually, here I'm switching from
grant agencies to academics. If you
write a job advertisement and you just add one
line that encourages people to submit
nonstandard evidence of impact, that
will have a tremendous amount of impact. Then, you
know, switching back to the grant agencies
again: it should be possible to use
contributions to sites like Qwiki or uploads
to GitHub as evidence of impact. I
would go so far as to say I certainly believe
that in fact publicly funded
science should really be open science. Now,
there are a lot of caveats to this
statement; I'll mention just a couple. There certainly
need to be exceptions for
confidential and proprietary knowledge. But
in a broad, general sense, we could
certainly work towards a world in which publicly
funded science is open science.
You know, there's a tremendous amount of room
there for people to move. And
we're only moving very slowly in that direction.
The reason I wrote this book
is at least in part to make open science a
public issue. And that I believe
will serve two ends. One is to help encourage
the scientific community
internally to have a serious discussion about
what types of contributions it
values. Why is it the case that uploading
scientific data to an online
database is, in most fields, not
presently seen as part of people's jobs? That
kind of
question. But at the same time it's also possible
to have a public discussion
of what type of scientific culture we want
to support with public money. At the end of
the day, the public is, by definition, paying
for publicly funded science, so they
should have some sort of say. Just a couple
of remarks about my own personal
experience and what sorts of things I found
useful to do as a working scientist.
These are -- they're all very small steps
I should say, but I feel like these
are things which many people could do. I certainly
have made a point through my
career -- it's maybe not so unusual for a physicist
-- to make sure that my papers
are freely available online. About 80 percent
or so of my scientific output
is available freely online in some form or
another. A little bit more
adventurously -- I've certainly explored using
my blog as a way of conducting
not just sort of science outreach or high-level
discussions but in some cases
very detailed technical discussions at sort
of a research level entirely in the
open. With mixed success. I mean, something
that continues to amaze me is that
several posts at a very technical level -- things
like this series of posts on
fermions and the Jordan-Wigner transform
-- have tens of thousands of downloads.
I'm not quite
sure who's reading it, but there's apparently
a market somewhat to my surprise.
A little bit less successfully, something
I recently started doing that hasn't
been all that successful is uploading some things
-- so for example, this is the
version of that series of posts which I've
uploaded to GitHub so people could
potentially contribute pull requests and
[inaudible], and so on. There's been a bit
of interest in doing that, but
not a huge amount at this point.
I'm also involved in a very
minor way with the Polymath Project, hosting the
Polymath Wiki. But probably in
some sense my largest single open research
project -- it's not exactly open
science -- was in fact the construction of my
book over a period of four years. I
used GitHub to bookmark all
of the research which I did for it,
which was then piped through to FriendFeed,
which enabled discussion around
it. And so, if you look in the background
to my book, there's actually
thousands of discussions with hundreds of
people there all done in a sort of
super lightweight sort of a way. It totally transformed
my experience of
writing. It was very useful. And parts of
the book were also written in the
open in so far as my publisher was willing
and able to allow that. Looking much
more broadly at some of the organizations
which are involved in doing open
science work, one organization which has done
really tremendous work and is
almost unknown is the Alliance for Taxpayer
Access. So this is an organization
which has done a lot of work lobbying, particularly
in the United States for
open access policies and open data policies.
At the moment they're trying to
get the Federal Research Public Access Act passed.
What this will mean is that if you receive
funding to do scientific research from a U.S.
agency with a budget of more than
100 million dollars a year, then within 12
months of publishing your results they will
need to be openly available online. If that
act is passed, that will be a big
deal. The Alliance For Taxpayer Access played
a major role in lobbying for a
precursor policy, the NIH Public Access Policy,
which does the same thing for NIH-funded research.
So over the next few years, in Google
search results, when you search for
biomedical things, you'll start to see a lot
more papers. And that's really an
outcome of lobbying done by them and other
organizations. So I think that's a
tremendous thing. Creative Commons which many
of you will know but not
necessarily for their scientific work, have
also done similar lobbying work.
And they've also worked in particular with
a number of companies to urge them to
share pre-competitive data, and have had
some major releases and done a lot of good
work. And there are dozens of other companies
and organizations; I just
mentioned a few here, in case people are
interested. And I'm very happy to
chat afterwards in more detail. Just one final
note. Something I would love to
see, and something which I think Google
could reasonably easily do, is in
fact with Google Scholar -- it's just an idea;
I don't know how feasible it is.
So many decisions are made in terms of hiring
-- you know, I've sat in hiring
meetings where people have
searched through Google Scholar for
candidates' citation records and so on. There
are some non-traditional forms of
contribution which show up in Scholar results.
I guess within physics,
preprints tend to be the one that does show
up in Google Scholar results. But
there's the potential, I think, to start to
include more non-traditional forms of
contribution. I've seen a few blog posts show
up in Google Scholar results. It
certainly would be more interesting if that
was done more systematically and
potentially start a conversation within the
scientific community about how
seriously those kinds of contributions should
be taken. Anyway, it's just a
suggestion. I'd certainly be happy to chat
about it more and thank you all very
much for your attention.
[Applause]
