I'm Jerry Sabloff, the president of SFI, and
it's a really great pleasure to welcome
all of you. Thank you for coming
despite the much-needed rain. It's great
to have all of you here.
You're missing the president tonight, but
I think you're in for a much better
treat than that. So thank you for
being here for our annual Ulam Memorial
Lectures. The series was inaugurated
in 1994, so this is our 20th annual Ulam
Memorial Lecture, and we're really
pleased that these lectures were
named in honor of the great
mathematician, the late Stanislaw Ulam,
who famously worked on the Manhattan
Project and then subsequently lived here
in Santa Fe. As part of his legacy,
his scientific library now forms the
core and base of the library at the
Santa Fe Institute. Over the years the
lectures have been given by a host of
scientific luminaries with ties to the
Institute.
The first Ulam lecture was given by John
Holland on "Hidden Order: How Adaptation
Builds Complexity," and in subsequent
years the scientists (this was great fun
for me going back; Ginger Richardson
prepared this list) include Alan Perelson,
sitting here in the first row,
Simon Levin, Melanie Mitchell, Brian
Arthur, Murray Gell-Mann, Chuck Stevens,
Geoffrey West, Doyne Farmer, Dick Lewontin,
Henry Wright, Marcus Feldman,
Nina Fedoroff, Sam Bowles, then in a
special session honoring Murray
Gell-Mann's 80th birthday we had Sir Chris
Llewellyn Smith, Dan Schrag, Mark Pagel
(Dan Schrag also then gave one on his own
another year), Mark Newman, David Krakauer,
and last year
Lord Robert May of Oxford. So our
speaker tonight joins, I think, a very
impressive list indeed. As always, we
gratefully acknowledge the support of
Los Alamos National Bank, which supports
our whole community public lecture
program, for which we're grateful. It's
now my pleasure to introduce Jennifer
Dunne. Jennifer is the SFI chair of
faculty and vice president
for science, and she'll introduce our 2013
Ulam Memorial Lecturer, Stephanie Forrest.
Jen?
It's my great honor to introduce
Stephanie Forrest for her first of three
Ulam lectures.
Stephanie's primary affiliation is as a
distinguished professor in the
Department of Computer Science at the
University of New Mexico, a department she
joined in 1990. However, it was also in
1990 that she started her long-standing
association with the Santa Fe Institute,
and Stephanie has been at the core of
SFI's intellectual life and governance
for much of the institution's history. At
SFI she has variously served as external
professor resident professor interim
vice president of academic affairs
member of the science steering committee
member of the science board and in July
of this year she finished up a
three-year term as co-chair of the
science board she has been a valued
advisor and mentor to many including me
as I take on the responsibilities of
chair of faculty and vice president for
science it's always a pleasure when
Stephanie pops into my office
a welcome opportunity to talk about
various aspects of SFI life with someone
who has seen and done it all stephanie
is now taking her advisory capabilities
to DC for the next year as a Jefferson
science fellow with the state department
Beyond all the titles, Stephanie has
played a special intellectual role at
SFI, as she was one of the first people
to apply complexity science to problems
related to human-built engineered
systems. Having received a master's and
PhD from the University of Michigan in
the mid-1980s in one of the first
computer science departments, she quickly
embraced yet another emerging field, that
of complex adaptive systems. The
application of complexity science
perspectives to engineered systems, which
she helped pioneer, is only now coming
to fruition at SFI and elsewhere. For
example at SFI there are a variety of
research projects and meetings related
to topics such as the power grid the
structure and dynamics of cities and
slums and analyses of social networks
such as Wikipedia. Stephanie's research
has taken her deep into the world of
biology, where she and her colleagues
have moved far beyond superficial
analogy between computer science and
biology to develop deep interchanges
between the two disciplines. She has
published extensively on topics such as
immunology epidemiology evolution cancer
dynamics and scaling theory resulting in
advances in both computer science and
biological understanding and indeed she
has held a secondary appointment in the
Department of Biology at UNM since 2001
over the course of her three talks
Stephanie will touch on a variety of
aspects of the important interplay
between complex systems science
engineered software and hardware systems
and biology as Stephanie wrote in a
recent article for the New Mexican the
level of complexity and the challenges
our computers and networks face have
much in common with those faced by
organisms and even ecosystems. She
examines the fertile intersection of
biology and computer science and
searches for the common secret sauce
that makes both computers and organisms
tick. Tonight her topic is software
engineering: evolving computer programs.
Tomorrow night she will discuss the
complex science of cyber defenses,
computer immunology, and Thursday night
will focus on modeling computer networks
from chips to the Internet. Before I hand
it over to Stephanie I wanted to mention
something else about her background
Before moving into the world of computer
science, complex systems, and biology,
Stephanie began with a foundation in the
liberal arts, with an undergraduate
degree in the great books curriculum of
our very own St. John's College. As
someone with an undergraduate degree in
philosophy who similarly moved into
science later, I think this is a highly
underutilized academic pathway, which can
provide a more nuanced and global
understanding of science's intellectual
and societal roles. Please join me in
welcoming Stephanie Forrest.
Thank you. It's great to be here. This is
just such a tremendous honor, and
SFI and this community mean so
much to me that I've been unusually
stressed out about putting together
these lectures on my own research
that I should know so well by heart. But
I just want to thank all of you for
coming tonight, and I'm very happy to
have this opportunity just to
share what I've been up to in the lab
all those times when I have been hidden
away. I've spent most of my academic
career here in New Mexico, and I'm deeply
grateful to the four institutions that
have shaped the ideas that I will
present tonight, so I just wanted to take
a couple of minutes to acknowledge these
very special places. The first one, as
Jen mentioned, is St. John's College,
which I have to say started me down the
path to interdisciplinary ruin, but
it also introduced me
to mathematical logic, which in kind of a
circuitous route led me into computer
science. This is not exactly
chronological, and there have been some
spots in my life when I was not in New
Mexico, but my next tour of duty was at the
Center for Nonlinear Studies, where I was
a postdoc for two years, and I met Alan
Perelson, and that started me down the
path to working on immunology. We'll
be hearing more about that tomorrow.
and then I somehow ended up at the
University of New Mexico which turned
out to be the absolute best career move
I could have made and I've just had
fabulous colleagues down there and
wonderful students and I just have to
say I think my university is
underappreciated in the state of New
Mexico we really just have very talented
faculty there. And finally, of course, SFI:
I've sort of used every trick in
the book to have some association with
SFI over the years, as Jen mentioned, and
it's really been my intellectual home.
Everything I will talk to you about
in these three lectures wouldn't
have happened if it wasn't for SFI. But
of course it's the people who make
these institutions so wonderful, and I
don't have time to acknowledge all of
you tonight, but I just wanted to mention
in particular my students. I've had
really unusually good students over the
years, and they've provided most of my
good ideas, and they've done all of the
hard work on those ideas, and they keep
me on my toes, and they in particular
helped me put these lectures together, so
they really deserve a big thank
you. Okay, so now we will start the
talk. Computers are human-designed systems
that have grown in complexity to the
point that we can no longer comprehend
or manage them using the traditional
methods of engineering, and I believe,
and my work is dedicated to the
idea, that biology has already discovered
solutions to many of these problems, and
so in these lectures we will consider
several examples attempting to
illustrate how SFI style science can
contribute new approaches to engineering
And I just love this quote from Einstein;
I think my approach is sort of in
that same spirit: that we can't solve
problems by using the same kind of
thinking we used when we created them.
And that's of course been a little bit
of a tension in my career as a computer
scientist. So here are some of the
aspects of biology that have fascinated
me over the years both because they're
interesting and because they illustrate
properties that we would like computers
to have biology achieves these
properties by using very different
design strategies than we started out
with in computer science. And so the
first of these properties is
resilience, and I should say this is like
my current list. I always have a list of
these design principles, and it changes a
little over the years. But basically
biological systems are very resilient,
and they use many different strategies
to achieve that resilience. The ones that
I've been the most interested in are
homeostasis, the idea of a system
continually monitoring itself and making
small adjustments to keep it within
normal operating tolerances; the idea of
disposable components (this is something
that we have really not done a lot of in
computing, but all living systems
are built from cells, and those cells
turn over more rapidly in some places
than others, but they all turn over, and
that provides an important
sense of resilience, where no one cell
is crucial to the functioning of the
whole); and finally diversity. I'll talk
about that more in the next lecture but
biological systems are diverse and that
provides a lot of protection, sort of
species-level protection, that we
would like to have in computing.
Biological systems also adapt very
naturally and gracefully to new
circumstances. They can repair themselves
when they're injured. And
this next item, the optimized
networks, is something that other SFI
scientists, Geoffrey West and Jim Brown,
have worked on a lot. It seems that
biological systems have these transport
networks, like vascular systems, that
transport resources to all the
components of the system, the individual
cells, and it seems that these networks
are highly optimized if not optimal, that
is, they are very efficiently
designed, and so that's something that we
would like to have in computers. I'll
come back to that more in the third
lecture. And finally, the thing that we'll
focus on tonight is that all of this
happens through this sort of distributed
bottom-up evolutionary design process so
our lectures are going to build off the
idea that biological systems are really
at heart information processing systems
and that's not the only way to think
about a biological system a lot of other
models of biological systems look at
them as mechanical systems, but I base my
work on the idea that they are also
processing information.
Okay, I will
focus my lectures around three
problems that affect all of us.
So the first one is software,
and as I will explain in a couple of
minutes, we all use software every day,
and that software has a lot of flaws and
errors in it, known as bugs, which is
kind of the euphemism, and they have
become a significant economic cost and a
significant risk in some cases. So that
will be lecture 1. Lecture 2 is going to
be focused on the problem of security
and safety in the online world, and
that's important because so much of our
lives have moved online from social
networking and dating to embedded
medical devices and so it's really I
think a problem for all of us it's not
just an esoteric academic topic to
figure out how to secure our networks
and our computers and finally these
applications are all mediated by
power-hungry amazingly power-hungry
computers and increasingly dictated by
high-level governmental policies and
corporate marketing strategies and so
in lecture three we're going to explore how
we can start to get a handle on these
runaway technologies through the use of
modeling tools. And in honor of Stan Ulam,
for whom these lectures are named, I
want to say that many of these modeling
tools are derived in some sense
from advances that he made. So in each
of these lectures I'm going to start
with a short tutorial and then I'm going
to introduce some current problems in
the field and just give you some scary
statistics about how bad things are and
then I'm going to get back to the
biology and try to show you how we can
use biology and complexity science to
address them. Okay, so now is the tutorial.
Oh no, not the tutorial, not quite
yet. I first want to say that in this
work I do, I'm not the
only person who does it, and there's a
field that people refer to as
biologically inspired computing and I
never have liked that term very much I
really like to think of it as the
biological perspective on computing but
fields like artificial intelligence
neural networks genetic algorithms which
we will talk about ant colony
optimization artificial immune systems
which we will talk about and even quorum
sensing are all examples of this kind of
family of related algorithms that are
inspired by some aspect of biological
systems. And I should say that the
inspiration is sometimes more direct
than others, and I'm kind of agnostic
about how directly we should take
these analogies, so it ranges all
the way from very specific mechanisms to
very abstract ideas like the idea of
diversity. Okay, so now we get to lecture
one which is going to be on evolution
using evolution for software repair and
so the tutorial is going to be what is
software and talking a little bit about
the problem of bugs and why it's so
important then I will talk briefly about
how to implement Darwinian evolution in
a computer and then how we can use that
to automatically repair bugs and then
depending on how much time I have left
at the end, I want to get back to kind of
the bigger picture about
software ecosystems. Okay, so I guess all of
you know you use software every day
every time you make a phone call your
kids if not you use social networking
every day when you fly on an airplane
there's a lot of software involved with
that and increasingly the airplanes
don't actually need the people to fly
them every time you do a financial
transaction and more and more we're
starting to see robots in our
environments, and I think that's a
trend that will increase over the next
decade. And so all of these examples use
software, whatever that is. So what is
software? What does it
actually look like? This is an
example of a very famous program written
in the programming language C. People
write programs in many different
dialects or languages; you may have heard
of Java, or for the people from Los Alamos,
Fortran, and Python is a popular one
right now.
Anyway, this program is very famous
because it's the first program that
people typically write when they're
learning a new computer
language, and so it's very simple. The
first line (let's see, maybe that's not
working; robustness
is a good principle of biology), okay,
the first line is a comment and it
says what the program is. The next line
is going to load in a lot of other
programs that I'm not going to tell you
about, that handle I/O: they
read data into the computer, and
they get data from the central
processor and main memory of the computer
out to a computer screen or out on a
network. And then we get to the main part
of the program. This main routine is
called main, and all this program does is
print out the phrase "hello world", so
that's why the program is called hello
world, and truly it's a very famous
program even though it's very short. Okay,
so how does this program actually
communicate with the computer we all
sort of know one way or another that the
computer deals in ones and zeros known
as bits and this high-level language
that I showed you the C program doesn't
actually have any bits doesn't look like
it has bits. So there's this great
invention, I think it's one of the great
inventions in the 20th century, called a
compiler, and the compiler is itself a
computer program, but we won't worry
about that. It takes this program
and translates it into another language
known as assembly code, and this also
doesn't have bits, the ones and zeros,
explicitly, but it's
much closer to the kinds of instructions
that computers can understand and
much further away from what people can
easily understand.
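As a rough analogue of that translation step, here is a minimal sketch in Python rather than C. That choice is an assumption of this example: the slide shows a C program, and a real C compiler emits assembly code for the hardware, whereas Python's built-in `compile` produces bytecode for a virtual machine. The idea it illustrates is the same: readable source goes in, a list of low-level instructions comes out.

```python
import contextlib
import dis
import io

SOURCE = 'print("hello world")'  # the high-level program a person writes

# Translate the source text into a code object that holds low-level bytecode.
code = compile(SOURCE, "<hello>", "exec")

# The translated program is stored as raw bytes, not as readable text.
raw_bytes = code.co_code

# List the instruction names, loosely comparable to an assembly listing.
opnames = [instruction.opname for instruction in dis.get_instructions(code)]

# Running the translated program produces the familiar output.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(code)
output = buffer.getvalue()

print(opnames)
print(output, end="")
```

The exact instruction names printed depend on the Python version, which is why none are listed here.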
Okay, so we take one step,
we get to the assembly code, and then
there's one more step, which is referred
to as linking, which is when all these
other things like standard libraries get
loaded in, and loading, which actually
translates the program into the lowest
level ones and zeros and copies it
into the main memory of your computer.
And once it's been loaded, that is, copied
into the main memory, then if you run the
program it will do whatever it's
supposed to do; in this case it prints
out on the screen "hello world". So this
basic process of specifying instructions
at a high level, translating the
instructions into bits, loading the bits
onto the computer, and then running them
to see what happens is what we refer to
as computer programming. So for those of
you who have kids who do computer
programming that's basically what they
do. And unfortunately, just like life with
a dog or a child, what the computer hears
after your high-level language is
translated into bits is not always what
the programmer intended, and so
there are sometimes these communication
glitches, or bugs, and as we see here they
can be extremely frustrating, but
they are also ridiculously plentiful.
And so that brings us to the first of
these problems that we're going to
orient our lectures around, which is the
problem of software bugs. It turns
out that most of the cost of software is
spent on maintenance so that's this
little thing right here and maintenance
there's a lot of things that go into
maintenance but a huge piece of it is
fixing bugs and so we have the situation
where we have too many bugs
This is just one example; I love this
quote. For anyone who uses the Firefox
browser, the software that
implements that browser is produced
by the Mozilla project, and one of their
developers said one time, "Every day
almost 300 new bugs appear," and that's
way too many for them to handle. So
there are too many bugs. They take too long
to fix, so even security-critical bugs,
which as we've learned in the past few
months reading the newspaper are very
important, even security-critical bugs
take on average 28 days to be repaired
and to have a patch distributed. And
the cost is enormous. These are old
figures, but I don't know that
the fraction has changed that much: the
annual cost of software errors, just the
errors, has been estimated to be as high
as 0.6 percent of GDP, and I'll
actually have a worse figure for that
that I'll give tomorrow. So this is a
really this is a serious problem and in
fact it's gotten so bad that companies
have begun paying strangers to fix their
bugs. These are actually three
examples (I maybe should
have separated them): this is from
Google, this is from Mozilla, and this is
a smaller company called Tarsnap, a
cloud services company. All of these
companies and many others have started
these bug bounty programs, where
essentially they invite strangers
to find bugs and propose fixes for the
bugs. This started a couple of years
ago and is still going strong. These
values, I think, are a couple of years
old; I don't exactly know what the price
per patch is right now. But
as a person who's
been in the computer world for a long
time, it's really a remarkable
development that these big, highly
respected companies are first of all
admitting that they have this many bugs
and secondly actually just opening it up
to the world to fix them for them.
So how do we actually repair bugs now?
The first thing is we try to ignore
them. Many of the software products
you use are shipped with known
bugs, and actually the state of the
technology is that we have much more
ability to find bugs automatically than
we do to fix them, and that was part of
what motivated this project that we'll
eventually get to. So the first thing we
do is try to ignore them and have
our users be the beta testers, and
fix the bugs that the users
really notice. The second thing we do is
we pay expensive programmers to fix them
manually, and people have noticed
that that's expensive, so we have
developed tools to help the programmers
things like debuggers and profilers and
type checkers all of those are tools
that programmers can use to make
themselves a little more efficient at
manually fixing the bug. And then finally
there's a field of research called
formal verification, or formal methods,
that tries to develop
a mathematical statement of what
the program's supposed to do, and so the
idea is if you can write down this
set of mathematical
expressions, then you could prove
that a program is correct, which would be
a really nice thing. This
technology's made a lot of progress in
the past 10 years, but it still can't
scale up to web browsers and really
large-scale production software systems.
So now we're going to get back to
biology, and our idea was to address
automatic repair. There
is a lot of technology out there for
automatically finding bugs, but no one
had really jumped into the water and
said let's just try to fix them,
or they had tried to do that
but in very
constrained situations where they had
some of these mathematical models. And so
our goal was
to just take software off the shelf,
legacy software, and have a generic
method that could fix bugs in that
software, and to do it without requiring
this mathematical model. And this project,
I've been working on this for about five
years, and I guess one of the things (you
know, I'm a little nervous) I just want
everyone to know is I've had a lot of fun
doing these projects I'm talking to you
about in these lectures, and this project
in particular has been a lot of fun, in
part because of my collaborator Westley
Weimer at the University of Virginia. He
is an expert in software engineering, and
his student Claire Le Goues, who wrote a
lot of the original programs, and then
two of my students, Vu Nguyen and Eric
Schulte, and by now we have quite a few
other students and postdocs involved, but
they were kind of the original four. So I
said that we were going to try to use
evolution to fix these bugs, and now what
I need to do is tell you how we
could actually do evolution in a
computer. This idea was invented by John
Holland, believe it or not, in about
1960. John was my thesis
advisor, but aside from that great
thing he can claim, he was also a founder
of the Santa Fe Institute and has had a
big impact on the intellectual
development of the Institute. John's
observation was that there are three
basic ideas in the Darwinian
account of evolution.
First of all, individuals have random
variations, and some of those variations
make them more fit, and they especially
make them more fit in the sense that
they can have more offspring and they
can have offspring sooner. And the
third thing is that those variations (it
might be that the animal can
run faster and catch more food, or might be
more attractive to a mate; there's a
whole lot of things that might go into
this differential reproduction), the
key thing is that these variations
are passed on to the next generation
of offspring. So John took these basic
three ideas and thought about how to put
them into a computer and the story goes
like this
we are going to have a population of
individuals and the individuals are
going to have variations. So instead of
being like animals in an ecosystem or
cells or something, these little
organisms are going to be bit strings,
just sequences of zeros and ones, and so
I've got three examples right here
here's one here's two here's three and
when we actually do these as genetic
algorithms we typically have much longer
strings like up to thousands of bits
sometimes more and we typically have
much larger population sizes but that
won't really fit on this slide so we
just have a little population here of
three and we generate those randomly
just using a random number generator so
that's how we get our random variations
And then we have a function called a
fitness function that can assess
how good each
of these bit strings is. I always
think of that fitness function as kind
of like a judge at a dog show: you know,
you have all your different dogs coming
into the ring, and the judge is looking
at them and going this one's 0.5 and
this one's 0.8 and sort of ranking them.
We do that in mathematics; we don't
actually do it in the dog show ring. But
you could imagine a function, for example,
that just counted up how many ones there
were and used that as the fitness
measure. Anyway, whatever the fitness
measure is, and there's lots of them
(we'll talk about one particular one in a
second), we use that fitness to decide
which of these individuals make it to
the next generation.
So the more fit ones get lots
of copies made, the less fit ones get
deleted from the population, and just
like in biology they don't get copied
exactly but they get copied with
mutations so like a 1 might be changed
to a 0 or vice versa and they even
sometimes have crossover where two
individuals get mixed up and exchange
information so that we end up with two
offspring that are recombinations of the
original so once we go through that
process we then have the next generation
and we just can reiterate the process
apply the fitness function to those
individuals and keep going around that
loop, and the idea is that with just a
very few trips around the loop we can
get high-fitness individuals according
to whatever fitness metric we
choose. Okay, so that's all I have to say
about Darwin, and that's all I really
have to say about genetic algorithms.
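The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Holland's formulation or any tool from the talk: the fitness function is the count-the-ones example, and the population size, string length, mutation rate, keep-the-top-half selection rule, and the choice to carry the best individual over unchanged are all hypothetical settings picked for brevity.

```python
import random

def count_ones(bits):
    """Fitness function from the example above: the number of ones."""
    return sum(bits)

def mutate(bits, rate, rng):
    """Copy a bit string, flipping each bit with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def crossover(a, b, rng):
    """One-point crossover: two parents exchange tails, giving two offspring."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def evolve(length=20, pop_size=10, generations=30, rate=0.05, seed=0):
    rng = random.Random(seed)
    # Random initial population: this is where the random variation comes from.
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        ranked = sorted(population, key=count_ones, reverse=True)
        history.append(count_ones(ranked[0]))
        survivors = ranked[: pop_size // 2]  # the less fit half is deleted
        next_gen = [survivors[0][:]]  # assumption: keep the best one unchanged
        while len(next_gen) < pop_size:
            mom, dad = rng.choice(survivors), rng.choice(survivors)
            for child in crossover(mom, dad, rng):
                if len(next_gen) < pop_size:
                    next_gen.append(mutate(child, rate, rng))
        population = next_gen
    best = max(population, key=count_ones)
    history.append(count_ones(best))
    return best, history

best, history = evolve()
print("best fitness by generation:", history)
```

Because the best individual is always carried over, the best fitness in `history` never decreases from one generation to the next; without that choice it could occasionally dip.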
We're now going to talk about how to use
this to repair software bugs. So instead
of those little bit strings I told you
about, we're going to start with a C
program. By
the way, the name of our tool is GenProg;
it's kind of like genetic
program repair, something like that. So
I'm GenProg, and you're gonna
give me one of those little C programs
like that hello world program, except it's
going to be a program that doesn't
behave properly and the way that you're
going to know that it doesn't behave
properly is because you have a set of
test cases and in software engineering
this is referred to as the regression
test suite but you have a bunch of
input-output pairs that you can use and
this is what programmers do all the time
to figure out if their programs actually
doing the right thing and for big
programs you might instead of having
four test cases you might have thousands
or tens of thousands or sometimes
probably even hundreds of thousands of
test cases but in our case we have four
test cases, and this little C program
passes the first three of them, but it
gives the wrong answer or it chokes
(we'll just assume it gives the wrong
answer) on number four. So you come to me,
I'm the repair tool you come to me with
your program in the source code and with
this set of test cases and you tell me
the one that it doesn't pass and what I
do is
take that program and make 40 copies of
it this is our standard thing we've of
course tried all sorts of other things
but we make 40 copies each of which has
a single mutation random mutation and I
have to tell you what those look like in
a couple slides but they are all all the
40 programs are going to be sort of like
the original program but they're all
going to have little variations and then
after I've done that I take one program
at a time and I run it on all of the
test cases
that's the fitness evaluation and the
ones that do poorly get thrown out the
ones that do well that score have a high
fitness score get recirculated into the
evolutionary process and so when they
get put back in the population they have
more mutations sometimes they have
crossovers and so we just kind of go
around this loop a few times, and
amazingly, with high probability
the system produces a program
that can pass all of the test
cases. And so I should just tell you that
is my definition of repairing a bug: that
the program can pass all the test cases.
You can question me about that at
the end of the hour, but I think that's a
pretty good definition. Okay.
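That definition of repair (a candidate counts as fixed when it passes every test case) can be sketched as a fitness function. Everything in this example is hypothetical and chosen for illustration: the tool in the talk works on C programs, while here a program is a flat list of Python statements, the four input/output pairs stand in for a regression test suite, and the bug is invented so that the program passes three tests and fails the fourth.

```python
# Four input/output pairs standing in for a regression test suite.
TESTS = [(5, 5), (-2, 0), (0, 0), (250, 250)]

# A program is a flat list of statements. It should clamp negatives to zero.
BUGGY = [
    "result = x",
    "if x < 0: result = 0",
    "if x > 100: result = 0",  # the bug: large inputs are wiped out
]

def run(program, x):
    """Execute the statements on input x; return `result`, or None on a crash."""
    env = {"x": x}
    try:
        exec("\n".join(program), {}, env)
    except Exception:
        return None
    return env.get("result")

def fitness(program, tests=TESTS):
    """Fitness is the number of test cases the program passes."""
    return sum(1 for x, expected in tests if run(program, x) == expected)

def is_repaired(program, tests=TESTS):
    """The working definition of a repair: every test case passes."""
    return fitness(program, tests) == len(tests)

print(fitness(BUGGY), "of", len(TESTS), "tests pass")
print("repaired after deleting the last statement:", is_repaired(BUGGY[:2]))
```

A candidate that crashes simply scores zero on that test, so broken variants are ranked low rather than stopping the search.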
so I just need to tell you a couple of
little details just getting back to that
picture that we had at the beginning
about the compiler. I actually skipped
a step. This here is the original C
program, and on its way to getting
compiled into assembly code
it actually goes through an intermediate
step, and the intermediate step takes the
sequential list of
programming statements and turns them
into a tree, a hierarchical
representation of the program, and that's
known as an abstract syntax tree, which
you don't really have to remember. It's
just that we've done most of our
experiments on this
representation. We've also done some at
the assembly code level and the object
code level, but I won't be showing you
those results tonight.
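To make the tree idea concrete, here is a small sketch using Python's standard `ast` module. It is only an analogue: the experiments described above use abstract syntax trees of C programs, and the five-line source below is a made-up example. It shows a sequential list of statements becoming a hierarchy in which some statements are nested inside others.

```python
import ast

SOURCE = """\
x = 3
if x > 0:
    print(x)
while x > 0:
    x -= 1
"""

# Parse the flat text into a tree of nodes.
tree = ast.parse(SOURCE)

# Statements at the top level of the program.
top_level = [type(node).__name__ for node in tree.body]

# Every statement anywhere in the tree, including those nested in if/while.
all_statements = [node for node in ast.walk(tree) if isinstance(node, ast.stmt)]

print(top_level)            # ['Assign', 'If', 'While']
print(len(all_statements))  # 5
```

The two nested statements (the `print` call and the `x -= 1`) are children of the `If` and `While` nodes, which is the hierarchy the statement-level operators work on.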
okay so I kind of skipped the question I
told you about the fitness I told you
about the representation but now I have
to tell you about the mutation and
crossover operators and so imagine that
this is one of these abstract syntax
trees so each of these little dots is a
statement like print or if-then-else or
while those are examples of statements
in this language and we have three
operations three mutation operations
The first one is copy, so we just take
this statement here marked in yellow and
make a copy of it and insert it at
some other random place in the program.
This is all randomized. The next
one is delete where we take again a
randomly selected statement and we just
delete it from the program and the third
one is swap where we just take two
statements randomly chosen and exchange
them so those are our three operations
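The three operators just described can be sketched on a program held as a flat list of statements. This is a simplified illustration, not the tool's implementation: the real operators act on nodes of a C abstract syntax tree, and the three-statement program, the function names, and the seed are hypothetical. Note that no operator writes new code; each one only copies, removes, or reorders statements that are already there.

```python
import random

def copy_statement(program, rng):
    """Copy one randomly chosen statement to a random position."""
    variant = list(program)
    statement = rng.choice(program)
    variant.insert(rng.randrange(len(variant) + 1), statement)
    return variant

def delete_statement(program, rng):
    """Remove one randomly chosen statement."""
    variant = list(program)
    del variant[rng.randrange(len(variant))]
    return variant

def swap_statements(program, rng):
    """Exchange two randomly chosen statements (needs at least two)."""
    variant = list(program)
    i, j = rng.sample(range(len(variant)), 2)
    variant[i], variant[j] = variant[j], variant[i]
    return variant

def mutate(program, rng):
    """Apply one randomly selected operator, leaving the original untouched."""
    operator = rng.choice([copy_statement, delete_statement, swap_statements])
    return operator(program, rng)

PROGRAM = ["result = x", "if x < 0: result = 0", "if x > 100: result = 0"]
rng = random.Random(0)

# Forty variants, each one mutation away from the original.
variants = [mutate(PROGRAM, rng) for _ in range(40)]
print(len(variants), "variants; first:", variants[0])
```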
and I have to say it's really surprising
that they work so well I'm hoping that
most of you in the audience are
skeptical about this. These
operations are notable for two reasons.
First of all, we don't actually try to
synthesize new code. You think of
programmers as, you know, typing in new
statements all the time, but actually
these
operators don't do that; they
just move things around in the program
or delete them. Deletion is surprisingly
effective, by the way;
we'll get back to that later. And then the
other thing that's important for any
software people, computer programmers in
the audience, is that these mutations we
do at the level of the statement, and
that's a very coarse level. I mean, you
can do a lot
of damage to a program just by moving
statements around randomly, and so it's
kind of surprising that that works. Okay,
so I don't expect you to read this but
this is a little bit bigger program than
the one I showed you but it's written in
the same way
in C so this is just like
the next step hello world was
programming assignment number one and
this is programming assignment number
two and so our operators this is the
delete we just delete a line out of the
program we might swap two lines out of
the program or we might copy one line to
another in the program and we also do
crossover I don't have a picture for it
but it's sort of the obvious thing where
we swap two parts of the program okay so
that is our mechanism and it's really
pretty simple so how well does it work
well I don't know how many of you
remember the Zune player this was
Microsoft's answer to the iPod and it
was marketed in 2008 they sold several
million of these little devices so it's
a music player basically and New Year's
Eve of that year
these Zune players started freezing up
and this was a big issue the help
desk of course got lots of phone calls
and it turned out that the reason it was
freezing up was because of a software
bug and the bug was in this section
of code which is 18 lines long not that
much longer than that HelloWorld program
I showed you but it's a slightly
more complex piece of software
because it has a loop so this while loop
it executes this set of statements over
and over and over again until this
condition is no longer true
and so that's an internal variable to
the program and when that variable gets
bigger than 366 then the program's
supposed to fall out of the loop and
print out the current year so what this
program was supposed to do is take as
input the integer number of days that it
had been since 1980 that was the
internal representation of date but it
was supposed to print out on the display
the current year like 2008
and this was the code that was supposed
to make that translation and guess what
2008 was a leap year and this program
although it had some code in there to
handle leap year didn't actually work
correctly and this loop just kept
going around and around that's known as
an infinite loop and that was
experienced by the users as freezing up
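The leap-year loop described above can be sketched as follows. This is a Python rendering based on the description in the talk and on the widely circulated C snippet, assuming day 1 is January 1, 1980. The iteration cap in `year_buggy` is an addition so the demonstration terminates, and `year_fixed` is one hand-written repair for comparison, not necessarily the patch the repair tool produced.

```python
def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def year_buggy(days, max_iterations=10_000):
    """Convert days since 1980 to a year, reproducing the freeze."""
    year = 1980
    iterations = 0
    while days > 365:
        iterations += 1
        if iterations > max_iterations:
            # Stand-in for the device freezing up.
            raise RuntimeError("loop did not terminate")
        if is_leap_year(year):
            if days > 366:
                days -= 366
                year += 1
            # When days == 366 in a leap year nothing changes,
            # so the loop condition stays true forever.
        else:
            days -= 365
            year += 1
    return year


def year_fixed(days):
    """Same loop with an explicit exit for day 366 of a leap year."""
    year = 1980
    while days > 365:
        if is_leap_year(year):
            if days > 366:
                days -= 366
                year += 1
            else:
                break  # day 366 of a leap year is December 31 of that year
        else:
            days -= 365
            year += 1
    return year


if __name__ == "__main__":
    print(year_fixed(10593))
```

Under the stated assumption, day 10593 is December 31, 2008, which is the input on which the buggy version spins.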
okay so we got hold of this
code it was posted on a website and
we decided this would be a good test for
our system so we took as our
negative test case the date expressed as
the number of days since
1980 we made up one other date to
be a second negative test case and then we
defined five positive test cases which
was a small number and so then we made
our forty copies we did a few mutations
just like those ones I told you about
and after a couple of generations we
ended up with a version of the program
that could pass one of the negative test
cases and one of the positive test cases
but it had forgotten how to pass the
other four positive test cases and it
still had one negative test case to go
so this is kind of typical how these
evolutionary runs seem to go and so we
ran it for another couple of generations
and we ended up with this program which
had deleted this line right here this
little sign is a comment sign so it had
deleted that and it had actually
inserted in a separate operation because
we don't have a move operator it had
inserted the statement right down here
and it turned out that passed all the
test cases this was a small program we
could understand it we convinced
ourselves that it's actually a correct
repair and the amazing thing is I don't
know if you can see this down on the
bottom but it did that in 42 seconds and
I claim that there's not very many
programmers if any that could fix that
code in 42 seconds and for sure it
wouldn't have been me who could do it in
42 seconds okay so one point I want to
make about this is that the algorithm
GenProg did this
without knowing anything about dates or
Zune players it had no particular
knowledge it was just randomly moving
this code around trial and error until
it happened to pass the test cases okay
so we were pretty proud of that we
actually had one other little program
greatest common divisor that we got to
work and so we were thinking and I've
had this experience not many times
but a few good times in my
research life we have this phrase in my
research group nothing says it won't
work that is we have no
contrary evidence so maybe we'll just
take the next step and that's kind of
where we were with this project and so
the obvious next question was well can
we scale this up to larger programs and
more complex bugs
those citations there are the
papers we published on all these topics
how does the time how does that
42 seconds grow as the size of the
program gets bigger can we do it in
other languages in particular we've done
it in assembly code and object code how
good are the repairs so we make these
random repairs but you know what if they
just introduce new errors and are there
things that we can't fix and I'll just
say yes there's a few things we can't
fix why does it work that was the
question that I became interested in and
of course the theoretical types want me
to prove to them that the approach works
and I absolutely cannot do that at least
not yet okay and I should just go back
and say
that's five years of my life right there
on that slide just in case all of you
wonder what I've been doing so in the
end we've done a lot of experiments but
this is the experiment that we
refer to as the many bugs experiment and
actually that student Claire did
this she went to the open source
community and found eight large
programs many of them you've used most
of you
who've gone to websites have used PHP
those of you who program have probably
used Python anyway these are big
programs that are being used all the
time and she went through the records
of these source code repositories
and found distributed across those
eight programs a hundred and five bugs
that we could reproduce which I should
tell you is non-trivial and bugs that
were serious enough that a human
programmer had decided to fix them
we had a couple of other criteria but
those were our main criteria and we went
through systematically and did this and
then we took these programs and
packaged them up they all came with
test cases that was another requirement
so LOC is the lines of code and so these
altogether totaled over five million
lines of code and totaled about 10,000
of these test cases and so we packaged
each of these programs up with our
little GenProg repair tool and then we
sent it off to the Amazon Cloud
Computing service and paid real money to
see how well GenProg could do and I
should say that we were optimistically
thinking if it could repair 20% of those
bugs in a completely blind systematic
study we thought we would be doing
really well and we could write a paper
and of course writing the next paper is
always the goal in academia and so we
did it and lo and behold on
the first try it repaired 55 out of 105
that's 52 percent and each of those
successful bug repairs even accounting
for the bugs that it couldn't fix and
factoring in that cost came to seven
dollars and 32 cents yes we thought that
was really good we wrote a paper about
it but of course there's
that 48 percent that it didn't fix and
so that is the kind of thing we obsess
about we did a few tune ups to our
algorithm some fairly straightforward
tune ups and we managed to knock off
five more bugs that got us to 57 percent
and then we just decided to spend more
money
make bigger populations run them for
more generations and on this data set we
actually got up to fixing 69% of the
bugs
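The percentages quoted above can be checked with a few lines of arithmetic. The total cost below is an inference, assuming the seven dollars and 32 cents figure is the full bill (failed attempts included) divided by the 55 successful repairs; it is not a number stated in the talk.

```python
TOTAL_BUGS = 105
REPAIRED_FIRST_RUN = 55
REPAIRED_AFTER_TUNING = REPAIRED_FIRST_RUN + 5  # "five more bugs"
COST_PER_SUCCESSFUL_REPAIR = 7.32  # US dollars, as quoted


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


first_run_rate = percent(REPAIRED_FIRST_RUN, TOTAL_BUGS)
tuned_rate = percent(REPAIRED_AFTER_TUNING, TOTAL_BUGS)
# Inferred total bill under the assumption stated in the lead-in.
implied_total_cost = round(REPAIRED_FIRST_RUN * COST_PER_SUCCESSFUL_REPAIR, 2)

if __name__ == "__main__":
    print(f"first run: {first_run_rate}% repaired")
    print(f"after tuning: {tuned_rate}% repaired")
    print(f"implied total cost: ${implied_total_cost:.2f}")
```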
I think 50% is a more you know I feel
pretty confident telling people yeah
this thing fixes about 50 percent of the
bugs it fixes but on this particular
data set we got almost 70 percent okay
so that was great I didn't really tell
you what those bugs were but I will just
assert that they and some of their
friends that were not included in that
particular experiment cover many of the
common kinds of programming errors
infinite loops like what we saw for the
Zune bug segmentation faults that's when
the program just crashes and buffer
overflows we're going to talk about that
and a whole series of other security
flaws and we'll talk about some of those
next time but anyway it has fixed a wide
variety of different programming errors
including just the good old fashioned
logic error where it gives you the wrong
answer okay so um how can this be how
can such a simple method do so well and
that actually I'm the most interested in
that question and really trying to chase
it down and then of course the dual is
what does it not do well so one of the
reasons it does so well is something
that I guess a lot of programmers know
but I had not really appreciated which
is that most bugs are small and so to
convince you of this I have a good
old-fashioned Santa Fe Institute style
power-law to show you and I'll explain
what it is but the x-axis is the
number of lines that had to be modified
to repair a bug and the y-axis is how
many programs had that many
lines or fewer and so where did we get
those bugs from well my collaborator Wes
took the software package Eclipse
which is used widely by programmers it's
what's called a development tool and
he went through the whole source
repository of all the changes and
found 20,000 patches where the comment
claimed this patch fixes this bug so
that was our definition of a bug and he
took all those patches and he just
looked at them to see how many lines
they changed in the in the program and
it turns out that 10% of the patches
were two lines or less 20% of the
patches were five lines or less and
those are pretty much easily within
scope of this GenProg tool and 50% of
the patches were 25 lines or less and so
this plot right here is just that data
and the big takeaway message is that
most of the bugs have really very few
changes that you have to make to fix
them so if you're in the business of
doing a systematic study and just
randomly picking bugs to fix guess what
a lot of them are gonna be really small
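The plot described above is a cumulative distribution: for each patch size, the share of patches that changed that many lines or fewer. A minimal sketch of that calculation follows. The `toy_patch_sizes` list is invented purely for illustration and is not the Eclipse data from the talk.

```python
def fraction_at_most(patch_sizes, max_lines):
    """Share of patches that changed max_lines lines or fewer."""
    if not patch_sizes:
        raise ValueError("need at least one patch")
    return sum(1 for size in patch_sizes if size <= max_lines) / len(patch_sizes)


def cumulative_distribution(patch_sizes):
    """(lines_changed, fraction of patches at or below it) pairs, sorted."""
    return [(size, fraction_at_most(patch_sizes, size))
            for size in sorted(set(patch_sizes))]


# Hypothetical numbers, made up for this example only.
toy_patch_sizes = [1, 2, 3, 5, 5, 8, 13, 25, 40, 120]

if __name__ == "__main__":
    for lines_changed, fraction in cumulative_distribution(toy_patch_sizes):
        print(lines_changed, fraction)
```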
so that I think is one reason that this
method works and another reason that it
works is that if I do say so myself we're
pretty clever and especially Wes is
pretty clever my collaborator because he
knows all these tricks from software
engineering and so some of the
cleverness is that we started with a
working program so unlike the standard
genetic algorithm we didn't start from a
completely random set of bits we started
with a program that was almost working
it just has this one little flaw and so
that makes the problem a lot easier the
second thing we did and I kind of skipped
over this is that we don't just
make random changes anyplace in the
program but our mutation operators and
our crossover operators are applied only
to the statements that are executed
during the failing test case so we're
focusing our operations on the part of
the program that is broken and then
there's a couple of other tricks that I
won't talk about
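The second trick, restricting mutations to statements that run during the failing test case, can be sketched like this. It assumes the set of executed statement indices has already been recorded by some tracing step that is not shown; the names `candidate_sites` and `pick_mutation_site` are hypothetical.

```python
import random


def candidate_sites(executed_by_failing_test, num_statements):
    """Statement indices that are eligible for mutation.

    Only statements the failing test actually executed are kept, so the
    search concentrates on the part of the program that is broken.
    """
    return sorted(i for i in set(executed_by_failing_test)
                  if 0 <= i < num_statements)


def pick_mutation_site(executed_by_failing_test, num_statements, rng):
    """Choose uniformly at random among the eligible statements."""
    sites = candidate_sites(executed_by_failing_test, num_statements)
    if not sites:
        raise ValueError("the failing test executed no known statement")
    return rng.choice(sites)


if __name__ == "__main__":
    # A 10-statement program in which the failing test touched 3 statements.
    executed = {2, 3, 7}
    print(pick_mutation_site(executed, 10, random.Random(0)))
```

A uniform choice is the simplest option; weighting statements differently is also possible but is not something the talk specifies.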
so that that's sort of an explanation
but it's still I think for most people
and certainly for me very improbable
that this completely randomized blind
method can actually fix these bugs and
it's not unlike the monkey sitting at
the typewriter with us sitting
there waiting for the monkey to produce
King Lear or something that was
my little St. John's plug you know
there's just something that doesn't
really add up and so that has troubled
me a lot but fortunately I go to the
Santa Fe Institute from time to time I
go to the Santa Fe Institute a lot and
one of the things I've learned at the
Santa Fe Institute is about this
thing that's called the neutral theory
of evolution and a lot of people at SFI
have been interested in this in the
context of biological evolution so the
idea the observation is that many
biological mutations don't actually
change the fitness of the organism and
so there's a lot of
evidence that suggests that this plays
an important role in evolution and sort
of enables biological evolution and
this is my reading of the
literature as I understand it there's
two reasons one is just this kind of
buffering idea that if every time you
make a mutation you kill off the
organism then mutation's not going to be
a very good search strategy but if
a reasonable amount of the time you
can do some exploration without killing
the system then you have a better
chance of finding something good so
that's the buffering idea the genetic
potential idea is a little bit more
sophisticated and so for that I'm gonna
go to this picture that I took from a
paper written by some SFI people and the
idea is that this blob here is a
collection of all the genotypes in a
system it could be biological genotypes
that have a particular phenotype that is
they have identical fitness and the
argument is
that if you have this so-called neutral
landscape this sort of landscape of all
these genotypes that have the same
fitness then there will be no pressure
on the population and so over time
through drift it's going to
spread out around this area and it might
eventually end up with one individual
here that is a single mutation away from
a higher fitness a higher fitness
individual with phenotype B and so
that's my understanding of the theory
it's a lot richer than the way
I've explained it I know I've simplified
a lot but we took that basic
idea and wondered if that might be going
on with our software programs so we
defined this idea of mutational
robustness which is the probability that
if I apply one of those mutations
the copy the delete or the swap
randomly the program will
still have the same fitness and remember
for me fitness just means it can pass
the same set of test cases so we define
this idea of mutational robustness and
then we ran some experiments and much to
my surprise and actually the first set
of experiments was run by an REU
SFI has this nice summer program for
undergraduates who come from all over
the country and I'd managed to snag one
of those REUs one summer for this
kind of wild crazy idea and so he
did the first set of experiments since
then we've done a lot more and I can
tell you very consistently in all kinds
of programs from very small programs to
large production things with zillions of
test cases about 30% of the time when we
do those mutations it
doesn't change the behavior of the
program on the test cases and I actually
well aside from the fact that I think
this somehow has to be the key to
getting software evolution to go it's
just a remarkable fact about software
and in fact it's so remarkable that
a lot of the reviewers of our papers
have not believed it so it's just a
killer I mean I've had a lot of good
fortune in my career I've gotten some
papers published on the first try that
turned out to be important so I
can't complain but really this is such a
great observation and the reviewers have
really been hard on it but we did
eventually get it published okay so
how could that be I just want to
show you how it
could be that you could make a random
mutation to a program and it might not
change its behavior so here's a little
example this is a program called
quicksort it's a very famous algorithm
in computer science and the central part
of the program basically
it's going to take in a set of
unsorted numbers and it wants to
write out the same set of numbers
but in sorted order either ascending or
descending that's what the program is
supposed to do and the way it does it is
by picking a spot in the middle of the
list making a cut in the
array of numbers and then sorting the
left half recursively you know
doing another cut and then
sorting the right half but it doesn't
actually matter whether you do the left
half first or the right half first so
this is one neutral mutation and our
algorithm found it instead of
doing the left first and then the
right you could just switch it and
do the right first and then the left
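The neutral mutation just described, exchanging the two recursive calls, can be sketched as follows. This is a generic in-place quicksort written for illustration, assuming the last element as the pivot rather than a spot in the middle; the `right_first` flag is a hypothetical switch that stands in for the swap mutation.

```python
def quicksort(values, low=0, high=None, right_first=False):
    """Sort values in place in ascending order.

    right_first=True sorts the right part before the left part, which is
    the effect of swapping the two recursive-call statements.
    """
    if high is None:
        high = len(values) - 1
    if low >= high:
        return
    # Partition around the last element (an assumption of this sketch).
    pivot = values[high]
    boundary = low
    for index in range(low, high):
        if values[index] < pivot:
            values[boundary], values[index] = values[index], values[boundary]
            boundary += 1
    values[boundary], values[high] = values[high], values[boundary]
    if right_first:
        quicksort(values, boundary + 1, high, right_first)
        quicksort(values, low, boundary - 1, right_first)
    else:
        quicksort(values, low, boundary - 1, right_first)
        quicksort(values, boundary + 1, high, right_first)


if __name__ == "__main__":
    original = [5, 2, 9, 1, 5, 6]
    mutant = list(original)
    quicksort(original)
    quicksort(mutant, right_first=True)
    print(original == mutant)
```

The two halves do not overlap after partitioning, so the order in which they are sorted cannot affect the result.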
there's a lot of other things like that
I don't know how much time I want to
take on this but
the sorting programs to me are
really interesting because they're very
minimal they're very small programs and
so it's even more amazing that you could
make these random mutations and not
break the program it's much less
amazing to me in the big programs anyway
so we took this program that's just like
a few lines longer than that hello world
it's called bubble sort it's not a very
good sorting algorithm but it's really
easy to write and it's really short
so we took that program bubble sort
and got some neutral mutations and we
took 35 of those neutral mutations and
sat down and looked at what they were
really doing and this is just the kinds
of things they did sometimes it just
makes little changes in the output and
so if you were really fussy about your
output you might say that's not neutral
sometimes it makes little changes to
the internal variables that don't appear
in the output so they're invisible
and it can still do the sorting job
sometimes it adds in extra computations
it doesn't really need etc etc anyway
that just gives you an idea of what
these neutral mutations might look like
okay so in case you can't tell I'm
pretty excited about this and why do I
think it's such a big deal because mostly
we have this idea that programs are
fragile they're like this
old-fashioned watch over here
we have this idea that if you
touch one little piece of the watch
it won't work anymore it'll fall
apart and unless you're a watchmaker
you'll never be able to get that part
back in so that's why I think this is
such an interesting result and it also
along the way I would say we have some
evidence but I haven't definitively
proved that this is why our algorithm
works as well as it does and then
finally I think it supports this what I
like to call the strong biology
hypothesis of computing so I never liked
this bio-inspired term and I actually
think software has acquired these
biological properties like mutational
robustness through inadvertent evolution
that is the actions of many many
programmers distributed all over the
world over what is by now many
years okay so I'm going to talk about
that last idea for just a few minutes
before I stop so just to set the stage
until now we've been talking about what
I would call micro evolution that is
fixing single bugs in individual
programs or packages and
what happens if we step back and
take a look at macro-evolution
that is what would happen if we did our
little algorithm over and over and over
again say a hundred times or
what do large-scale software systems
look like okay so this is what I think
large-scale software systems look like I
didn't draw this picture but I think it
sort of represents well the state of the
world and our current software
infrastructure now is you know in the
true sense of the word a complex
network with self-interested players
lots of interactions and lots of
diversity and we don't really have good
ways to think about it or manage it I
mean it's a mess out there
and so on Thursday I'm going to talk a
little bit about some things we're
trying to do to understand it but for
the moment I just want to talk about
this system as an evolving system
and so one way we might do that is to
take the tack that Brian Arthur took in
his 2009 book The Nature of Technology
and basically what he argued in that
book I thought was pretty interesting he
said all of technological progress
is just like Darwinian evolution but
the primary driver is the crossover
operator it's a recombination of
existing technologies and he argues that
you know that accounts for all of
technological progress and so I
say hmm software is a
kind of technology and maybe it's
an evolutionary system
as well and actually when the
announcement for this talk went out I
don't know if John's in the audience but
we immediately got someone writing back
challenging one of the abstracts
they were challenging this idea
that software is the result of evolution
and so I think that's a good challenge
and it's sort of incumbent on us to
figure out how to prove it and so to be
able to prove that
means we have to have a
measure how do we actually measure
evolutionary progress and I'm just going
to hint at some of these answers we're
now getting to the hand wavy part of the
talk and we're almost done
so one answer is people believe that
systems that are more evolved have more
hierarchy in them so food webs are an
example of that economic production
networks and guess what the UNIX
operating system and so this is a study
that some people at MIT did they
measured I think I'm not going to take
time to go through this they measured
these software call networks so they
went through the UNIX operating system
and took all the little components the
little functions and made a graph just
by looking at the code they made
a graph of which functions called which
other functions and then they took that
graph and they measured this thing that
they call flow hierarchy which is the
percentage of links that don't occur in
a loop you know where
A calls B B calls C and then C calls A
again that would be a loop and so they
wanted to know how many of the links
were actually not in any of those cycles
and they took that as their measure of
how hierarchically structured the code
base was and I just want to point out
that this is an idea that one of
the great grandfathers of our
field the
great Herb Simon pointed out okay so
let me show you so what
they did is they took the UNIX operating
system they got the code they got the
code as it was in 1991 and every year
they got a snapshot of what the programs
looked like they did the same analysis
and then they plotted their little
metric they were trying to prove their
metric was better than someone else's
metric so these are the other metrics
but their metric shows that the
hierarchy actually increases over time
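The flow hierarchy measure described above, the share of call links that do not sit on any loop, can be sketched as follows. This follows the description in the talk; the exact definition used in the MIT study may differ, and the function names and the toy call graph are assumptions for illustration.

```python
def reachable(graph, start, target):
    """True if target can be reached from start by following call links."""
    stack = [start]
    seen = set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False


def flow_hierarchy(call_edges):
    """Fraction of (caller, callee) links that are not part of any cycle.

    A link caller -> callee lies on a cycle exactly when the callee can
    get back to the caller. A function calling itself counts as a cycle.
    """
    if not call_edges:
        raise ValueError("need at least one call link")
    graph = {}
    for caller, callee in call_edges:
        graph.setdefault(caller, []).append(callee)
    acyclic = sum(1 for caller, callee in call_edges
                  if not reachable(graph, callee, caller))
    return acyclic / len(call_edges)


if __name__ == "__main__":
    # A calls B, B calls C, C calls A again (a loop); main and D sit outside it.
    toy_calls = [("main", "A"), ("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
    print(flow_hierarchy(toy_calls))
```

Checking reachability once per link is fine for a toy graph; for a code base the size of an operating system one would compute strongly connected components instead.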
so that's one idea for how we might
measure you know the evolution of these
larger scale software systems another
idea is to look at diversity
diversity is another property that is
closely connected with evolution and
ecology and there's a group in France
that's very interested in diversity and
they're studying and this is
going to be a little technical how
in object-oriented programming
like Java classes which are sort of
groupings of the program like
functions how classes are used
so they take these classes that are
defined in the Java language or common
libraries and then they study how
they're used in lots and lots and lots
of other programs and what they find is
kind of surprising what they find is
that their use is very diverse
these classes are called in all kinds of
different ways the functions that are
associated with the class are called
methods and the software
engineering principle I forget the
name of it basically says you
know a good class is one
that is always used in the same way and
the only thing that really varies is the
datatype that's associated with the
class and what they've demonstrated is
that's not true
even for really basic classes like
strings and I apologize for the non
programmers in the audience but I just
wanted to throw that in okay so although
this work on hierarchy and diversity is
in its early stages I believe that
in the next few
years maybe a decade we're gonna see
much more serious attempts to understand
this complex system of interacting
programs from an evolutionary and
ecological perspective and I
actually think SFI should have a program
on software complexity so that's my
pitch for the next direction that SFI
should go okay so just to wrap up what
have I told you I've told you about this
generic approach called GenProg for
repairing software bugs it doesn't
use a formal specification and it
doesn't need to know ahead of time what
the bug is it's trying to fix or
anything about it
except it has to have that test case
but I also want to point out like
obviously I'm pretty proud about this
work and excited about it but it's
actually just a down payment on the real
goal of you know true software evolution
and true automated programming so we
still have a long ways to go
which if you're in the research business
that's a good thing you don't want to
run out of problems and then the second
thing I talked about a little more
briefly is this idea of software
evolution and I actually think it's all
around us and I think one piece of evidence of
it is this mutational robustness
property that I told you about ok so
what's next I apologize such a cute
picture I know it's a
little blurry but so what's next is that
once we have this complex soup of
software and self-interested actors and
all of this the next thing that's bound
to show up are malicious agents and so
for every time we find a lion that
learns how to eat a zebra sooner or
later there's going to be a zebra that
learns how to get away on a motorcycle
and that's going to be the topic of
lecture two so thank you very much
okay Jen I can't see so you're
gonna have to call on people
ah right okay so
the question is I didn't tell you how
many of those mutations it takes to
actually find a repair and indeed we
have more bad mutations than we have
good mutations and I would
have to think about it before I gave you
an exact number but it's way less than
what you would expect just from first
principles but yes I admit that most
of the mutations are deleterious just
like in nature but I guess the
remarkable thing is that the percentage
is high enough that we can afford to do
it this way
yeah so the question is I
didn't say anything about what
probability we use to pick the mutations
we have and I certainly didn't say
anything about any other mutations that
we might have tried that I didn't expose
to the light of day or the light of
night so in fact one of the algorithmic
tuneups we did on the
many bugs data set was to go back and
analyze which mutations were helping us
and the probabilities were wildly
different from what we expected but we
then took those probabilities and
applied them and
that was one of the ways we got the
extra five bugs I feel like the answers we
got might be pretty specific to that
data set so we've done a little bit of
playing with it but we don't have a
theoretical way to do that and maybe you
can help me with that
yeah I definitely think there's an
analogy and I'll be talking about that
tomorrow night I guess that's
the basic answer I guess the
other thing that I would say is that
once you enter the world of engineering
I mean one difference biology has to be
efficient in some sense to survive but
you know it doesn't have these engineers
breathing down its neck saying how
much did it cost and how
long did it take and so it's really you
know incumbent on us to be able to show
that our method can find these repairs
faster and less expensively than people
or other competing methods
yeah
yeah
yeah okay so the question is well
first of all let me just change it into
a statement that I should have said we
are only as good as our test cases and
if we have really bad test cases then
indeed we might evolve some unusual
things one of the studies that I just
skipped over my collaborator Wes has
the approvals to do human studies and so
he's done some human studies and
our little GenProg
repairs looked pretty good to people
that's the bottom line but since you
asked for an anecdote I'll tell you an
anecdote we've also done
some experimentation with the
assembly level programs and when my
student Eric was doing those experiments
well programmers are lazy and you
know you
want to be doing these
experiments in the equivalent of a petri
dish so you need to kind of protect your
computer from some of these mutations
and he maybe didn't do as much of that
as he might have anyway so
he had a number of these programs
that would mutate and crash his
program and things like that so he
slowly built up more and more of what's
called a sandbox but then one
of the programs that evolved figured out a
way to delete all the test cases so it
got perfect fitness so that's my
best anecdote that I can think of off the
top of my head yep
oh dear
ah okay so just in case you didn't hear
that
yeah one of the things that I'm
assuming is that you give me the bug you
give me the program that has the bug
and I guess I would say the mutational
robustness studies we did were on
software without bugs so that's one
example overall we have not tackled that
problem of bug finding mostly because
there are other ways of doing it
that can currently find more bugs
than we have time to fix so
you could imagine using something like
this to find bugs
yeah
right so
people have been interested in that and
that's certainly an area that we've
thought about going into so the idea and I
think I had a little item there on my
slides the idea is to use this genetic
algorithm technique to evolve test
cases that can break the program and we
have not done as much of that I mean
I always put it into my grant proposals
I always put it into my future work
somehow we never get around to it and
since Chris raised his hand I just
want to point out that Chris this is
Chris Moore he's a resident faculty and
he was one of my colleagues down at UNM
for many years and he's the one who
actually sent around that Zune bug code
I don't know if you remember that but
you sent that around the mailing list
and so it's all your fault that I'm
still talking about the Zune bug okay
thank you very much
