So Computerphile's done quite a bit on
recursion, but so far we haven't talked
about the idea of "tail recursion", which
is a simple trick for making recursive
programs more efficient. And really, it's
something that every programmer should
know about. So that's what we're going to
look at today - tail recursion. So as we
often do with these things, we're going
to start off with a simple example, and
the example that we're going to look at
is the factorial function. So what the
factorial function does, is it takes a
number, like four, and it's simply going
to multiply all the numbers from four
down to one. So we'll do four, times three,
times two, times one, which hopefully
equals 24. And this is a classic example
of a function that you can define
recursively, with two cases. So we have a
simple base case, which says the factorial
of one is going to be one, because
if you multiply the numbers from one down
to one, there's really nothing to do -
you just simply return the number one
straight away. And then we have a
recursive case, which says, if you have
any other positive number n, what you're
going to do is you'll take the number n,
and you multiply it by the factorial of
n minus one. So it's recursive, because
we're defining the factorial function in
terms of itself - we're saying the
factorial of a number n, like four, is going
to be n times the factorial of its
predecessor. So it's a recursive function,
that's defined in terms of itself. Let's
make sure we all understand how this
function is actually operating, with a
simple example. And we'll find it's
actually very inefficient, and this is
where tail recursion is going to come in,
to help us to make this function more
efficient. So let's look at our simple
example - so we do factorial of four, and
we simply go to our definition, and that
tells us that factorial of four is going
to be four times the factorial of its
predecessor, which is three. And then we think,
well what do we do now? Well, we're simply
going to take the factorial of three, and
we're going to expand that as well. So we
copy down what we had before. We've got
four times, and then we've got factorial
of three, and I'm going to use brackets here
to make the structure, or the grouping,
explicit because that's going to be
quite important. So we're doing factorial
of three, which is three times the factorial
of its predecessor, which is two. And then we
simply do the same again. We unwind the
recursion one more time. We copy down
everything we had before - we have four times
three, and then we get in brackets, two times
the factorial of one. And now we finally
reach the point where the recursion is
going to stop. We've got a factorial of one
here, and by definition that was just one.
And then finally now we can start to
actually perform the multiplications. So
we'll do two times one. So I'm not going to
skip any steps here. And then we do the three
times two to get six. And then finally we do
the four times six to get 24. You can see by
kind of expanding the definition here,
that we've been able to see that the
factorial of four is 24, using our simple
recursive definition. But unfortunately,
this definition is actually quite
inefficient, and it's inefficient in
terms of the amount of memory it uses.
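To make this concrete, here is the recursive definition just described, sketched in Python (the video works on paper rather than in any particular language, so the syntax here is an assumption):

```python
def factorial(n):
    # Base case: the factorial of one is just one.
    if n == 1:
        return 1
    # Recursive case: n times the factorial of its predecessor.
    # Note the multiplication happens AFTER the recursive call
    # returns - that's what builds up the growing expression.
    return n * factorial(n - 1)
```

So `factorial(4)` expands to `4 * (3 * (2 * 1))` and gives 24.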
And you can see this just by looking at
the structure of what I've written down
here. We start off with factorial of four,
which is small, and then we get a bigger
expression here, and a bigger expression
here, and a bigger expression here. And
this is because we're unwinding the
recursion, and we need to unwind, or apply,
all the recursion before we can actually
get to the point where we do any of the
multiplications. And then we can do the
multiplications, and things shrink down
again at the end. And it's got kind of a
triangular shape - you start off with a
small term at the top, or a small
expression, and as you go down, the
expression is getting bigger and bigger,
until you finally reach the end of the
recursion, and then you can start doing
the multiplies, and it all shrinks down
to the end. So it's like a triangular
shape. So this kind of shows us, with the
simple example, that this is potentially
inefficient, in terms of how much memory
it uses. And you can imagine, for example,
if you calculated the factorial of a
large number, like a million, you'd have
to build up this very large intermediate
expression, counting down from a million
down to one, building up all the
multiplications in the middle, before you
actually get to the end, and you can
start collapsing this whole thing down.
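In a language like Python, which keeps every pending multiplication on the call stack, this cost is easy to observe: with the default recursion limit (around 1000 frames in CPython), the naive definition fails long before a million.

```python
import sys

def factorial(n):
    if n == 1:
        return 1
    # The multiply happens after the recursive call returns,
    # so every level needs its own stack frame.
    return n * factorial(n - 1)

# CPython caps the stack depth; the default is about 1000 frames.
print("recursion limit:", sys.getrecursionlimit())

try:
    factorial(100_000)
except RecursionError:
    print("naive factorial blew the stack")
```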
So this function works - it has the
correct behavior - but it's not
satisfactory, because it uses too
much memory. So we can fix this using the
idea of what's called tail recursion. So
what I'm going to show you now, is how to
redefine this function in a way which
gives the same answer as before, but is
actually much more efficient. And we'll
see, afterwards, that this is what's
called tail recursion. So I'm going to
redefine the factorial function, and I'm
going to use a little helper function,
which I'm going to call go - and this
is often the name that people use for
these things. And go is going to take two
parameters, or two inputs. The first one,
will be the number we're trying to
calculate the factorial of - that's just n.
And the second parameter is going to be
what's called an accumulator, and this is
just a simple value that we're going to
use to build up a running multiplication
as we go along, rather than waiting until
the end, and then doing all the
multiplications backwards. We'll do the
multiplications as we go along, and we'll
use this extra argument to accumulate
all of those values for us. So how does
the go function itself actually get
defined? Well there's two cases for this,
just like we have two cases for the
factorial function. So the first case
says, if you're trying to calculate the
factorial of one, then you don't return the
value one anymore, you can just return your
accumulated value. And if you're trying
to calculate the factorial of some
number n, and you've got an accumulated value a,
what you're going to do is call the go function
- you'll decrement the value, so four will become
three for example - and then the trick here is that
you're going to take your accumulator
value, and you're going to multiply it by n.
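Written out in Python, the redefinition with the go helper looks like this (the accumulator starts at one, the identity for multiplication):

```python
def factorial(n):
    # Helper carrying an accumulator for the running product.
    def go(n, a):
        # Base case: return the accumulated value, not 1.
        if n == 1:
            return a
        # Recursive case: decrement n and fold n into the accumulator.
        # The recursive call is the LAST thing done - tail recursion.
        return go(n - 1, a * n)
    return go(n, 1)
```

So `factorial(4)` becomes `go(4, 1)`, then `go(3, 4)`, `go(2, 12)`, `go(1, 24)`, which returns 24.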
So we've got a simple recursive
definition here, very similar to the way
the factorial function works, but we're
going to see that this actually gives us
a much more efficient definition. So
let's look at the same example as before.
We're going to look at factorial of four,
and see whether this is more efficient
or not, and actually check whether it
gives us the same result as well. So
we're going to call go, and we'll have
four, and then we'll have our initial
accumulator value being one. And what do we
do? Well, we just go to the definition, and
it says we subtract one, from the first
parameter, and then we're going to
multiply the one by the four. So if
we work out the details of that,
it's very simple - we're just going to
get go of three and four. Now we just simply
repeat the process. We say what is go of
three and four? Well, we call the go
function again, and we decrement the
first parameter. And then we multiply the
two parameters. So in this case, we
multiply the three and the four. So we
get twelve. And then we go one more time.
We go. We decrement the first parameter,
so we get one. And then we multiply
the two and the twelve, and we get 24.
Now we're finally down at the base case, so
we can stop. And we simply give back the
result, 24. What you see here is that you
get the same result as we had before,
factorial of four is 24, but now it's much
more efficient in terms of memory. And we
can see this, simply by looking at the
structure of what we have. Here we have
go applied to four and one, then go applied
to three and four, and so on, and so on.
So the term here, or the expression that
we're manipulating, is simply a function
applied to two inputs. That just uses a
very tiny amount of memory - we don't need
to use enormous amounts of memory to
build up a large intermediate expression,
which then we collapse down at the end.
We're just using constant amounts of
memory, as we go along. And this is
achieved, because the go function is
what's called tail recursive. Let's have
a look at the definition of the go
function. In particular, let's look at the
right-hand side of the recursive case.
So, the right-hand side tells us how to
calculate go of n and a. And if we think
about how this is actually evaluated, we
would first of all subtract one from n,
then we would multiply a and n together,
to give our new accumulator value. And
then finally, the last thing that we do,
is do the recursion. So this is the idea
of tail recursion, when you make a single
recursive call, and the very last thing
you do is make that recursive call. So
there's nothing to do afterwards.
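One practical caveat, which goes slightly beyond the video: compilers for functional languages typically turn a tail call into a jump, which is where the constant-memory behaviour comes from, but CPython, for instance, does not eliminate tail calls. In Python you'd get the same effect by writing the accumulator logic as a loop:

```python
def factorial(n):
    # The tail-recursive go(n, a) rewritten as a loop:
    # each iteration plays the role of one recursive call,
    # with n and a updated in place instead of passed along.
    a = 1
    while n > 1:
        a = a * n
        n = n - 1
    return a
```

The loop body is exactly the right-hand side of the recursive case: decrement n, fold n into the accumulator.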
And actually, this is what's making this
definition more efficient than our
previous one, because we don't need to
remember anything after we make the
recursive call to the go function. And
this is in contrast to our original
definition of the factorial function. If
we look at this definition, and we think,
well, how does this right-hand
side here, in the recursive case,
get evaluated? Well, what we would do
is we'd first of all subtract one from n,
then we'd call the factorial function,
and once that had returned, we need to
remember that we still need to do the
multiply. So it's not tail recursive,
because the last thing that we do, is not
applying the recursive definition - after
the recursive call, we still need to
remember to multiply by n. And that's the
source of our inefficiency here. So
that's one example of tail recursion. Let
me show you another example of tail
recursion, and what we're going to look
at is what's called the Fibonacci
sequence, which is a very famous sequence
of numbers in computer
science and mathematics.
So the Fibonacci sequence begins in a very
simple way - we simply have zero and one.
These are going to end up being the base
cases for our definition. And then, the
Fibonacci sequence proceeds, by simply
adding together the two previous values.
So you add the zero and one, to give one.
You add the one and the one to give two.
You add the one and the two to give three.
And then hopefully, I don't mess it up.
And the sequence proceeds out to
infinity. So this is the Fibonacci
sequence - how do we define this as a
recursive definition? And there's a very
simple way to do this, but the very
simple way is actually going to be very
inefficient, but we can make it efficient
using tail recursion. So let me show you
the inefficient way first of all. So we
can define a function, which is going to
take a number as an input, n. And then
it's going to give us back the nth
Fibonacci number starting from zero,
because computer scientists always count
from zero. So the base case for the
definition - if you want the 0th
Fibonacci number, that's the first thing
in the sequence, that's zero. If you want
the 1st Fibonacci number, counting from
zero, that's the second thing in the
sequence, so that's going to be one. And
then if you want the nth Fibonacci
number, you take the two preceding ones,
and you add them together - so fib of n, is
fib of n minus one, plus fib of n minus
two. So that's a nice simple recursive
definition, but unfortunately it's
horrendously inefficient. For example, if
you tried calculating Fibonacci of 50,
even on quite a fast machine it won't
have terminated after a few minutes -
it's going to take an
awfully long time. And
there's a number of problems
with the definition - it uses double
recursion here, we're making two
recursive calls. That potentially could
be a source of inefficiency. We're also
not using tail recursion, because once
the two recursive calls finish,
we still need to remember that we need
to add the two resulting numbers
together. And actually, there's another
source of inefficiency here - if you think
about it, this is recomputing the same
Fibonacci numbers all of the time. So
let's see, how we can use the idea of
tail recursion, to make this recursive
definition more efficient, or actually
extremely efficient. So let's rewrite it.
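Before rewriting it, here is the slow double-recursive definition above as a Python sketch, to have something concrete to compare against:

```python
def fib(n):
    # Base cases: the 0th and 1st Fibonacci numbers.
    if n == 0:
        return 0
    if n == 1:
        return 1
    # Double recursion: two recursive calls, plus an addition that
    # must happen after both return - so it's not tail recursive,
    # and it recomputes the same Fibonacci numbers over and over.
    return fib(n - 1) + fib(n - 2)
```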
We use the same kind of trick as we did
before. So we're redefining the Fibonacci
function, and it takes a number n. And
we're going to call a helper function
called go. And it's going to take n as
its first parameter. And this time, it's
not going to take an accumulator as its
second input, it's going to take a pair
of numbers. And the pair
of numbers we're going to
take, are simply the current Fibonacci
number, and the next one. So initially
we're going to have zero and one, because
those are the two first Fibonacci
numbers. And then as we move along, we're
going to kind of shift that little
window on the Fibonacci sequence along.  So
initially it will be (0,1), then (1,1), then (1,2),
and so on. So I'm going to be shifting
that little window along. So this is what
the go function is going to do for us. So
there's three cases here. So we could go
with zero, we could go with one, and we have
a pair (a,b). Or we could go with n, and we
have a pair (a,b). So we can think how do
we define the three cases? So the two
base cases are very simple - if I want the
0th Fibonacci number, I've already got it,
it's simply a. If I want the 1st one,
it's simply b. And then if I want the
nth one, all I'm going to do is call the go
function, decrement the first parameter, so the four
would become three. And then I take my little
window (a,b), and I'm going to move it
along one. So if a is the current
Fibonacci number, and b is the next, and I
want to move along one step, then b will
now be the current Fibonacci number, and
a plus b will be the next one. So this is
the idea of just moving along a step. So
a simple example with that,
would be if I take the pair (2,3), 
and move it along, the three will
move into the first position, and we add
the two numbers together and we get five.
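That one-step shift of the window can be sketched as a tiny Python helper (the name step is hypothetical, just to illustrate the move):

```python
def step(pair):
    # Shift the window one place along the Fibonacci sequence:
    # (current, next) becomes (next, current + next).
    a, b = pair
    return (b, a + b)
```

For example, `step((2, 3))` gives `(3, 5)`.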
So that's kind of moving one step along
in the Fibonacci sequence. So here is my
definition, for the go function, and
notice that it's tail-recursive - it makes
a single recursive call, and it's the
last thing that we do. So in particular,
if we think how does the right-hand
side here actually get executed? Well
we'd subtract one from n, then we'd form
this pair a, and a plus b, and then the
very last thing we do, is we make a
single recursive call. Okay, there's no
more double recursion anymore. So let's
just check that this actually works
correctly, and that it is actually more
efficient. So let's try calculating the
4th Fibonacci number, using this, and
see what happens. So we apply the go
function. Four is the first parameter, and
we take zero and one as the current and
next Fibonacci numbers. And then we just
unwind the recursive definition. So the
four is going to decrement to become three.
And then we move our little window along
one. So one is going to become the
current Fibonacci number, and we add the
zero and one to get the next one, so we get
(1,1). Then we continue, so we go, we
decrement the three to get two. The second
one becomes the first component of the
pair here, and then we add the one and
the one to get two. Then we continue. So
the two is now going to move over - we're
shifting along one, the two becomes the
first thing, we add the one and the two
we get three. And then, finally, we reach
one of our base cases, and that will be
the second component of the pair, which
is three. And we can just double-check
that that's correct - and we go back to
the Fibonacci sequence, and we're looking
for the 4th Fibonacci number counting
from zero. So there's zero, one, two, three,
four, so we find that three is the
correct answer here. The point here is
that because the go function is tail
recursive, we get this result extremely
efficiently. If you look at the structure
of what we have over here, it's very
simple - we just simply have a function,
applied to two numbers, or three numbers
really - four, zero and one. Same here, same
here, and same here. So we don't have a
computation, or an expression, which is
growing and shrinking - it uses constant
memory here. And it can run
extremely fast because of that.
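Putting the whole tail-recursive Fibonacci definition into Python, with the pair as the second argument to go (as with the earlier sketches, the language choice is an assumption; and note CPython itself won't eliminate the tail call, but the structure is the point):

```python
def fib(n):
    # go carries a window (a, b) = (current, next) over the sequence.
    def go(n, pair):
        a, b = pair
        # Base cases: the window already holds the answer.
        if n == 0:
            return a
        if n == 1:
            return b
        # Decrement n and shift the window along; the recursive
        # call is the last thing done, so this is tail recursive.
        return go(n - 1, (b, a + b))
    return go(n, (0, 1))
```

So `fib(4)` becomes `go(4, (0, 1))`, then `go(3, (1, 1))`, `go(2, (1, 2))`, `go(1, (2, 3))`, which returns 3.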
So that's all for today. It's just been a
quick introduction to the idea of tail
recursion. And the idea is, basically, you
can sometimes make recursive functions
more efficient by making recursion the
last thing that you do. And it's a simple
trick which really every
programmer should know.
