You can get results from this where you can't get results from lasers.
Lasers get washed out in sunlight.
I was speaking to a colleague who went to Mexico to do crop scanning
and he had a laser scanner, and he had to do it at night
in a tent, because the sun wrecked the laser scanner
and there were wolves about, and it was a big problem for him.
If you had just used a camera, you might have found that you've got to work harder on your stereo matching
but there are things it will do that laser scanners can't
so there's going to be a time for one and a time for the other
The top tip for the day is: "Use a stereo pair of cameras, don't get eaten by wolves."
Yeah, that- that will be my advice.
[typing sounds]
We find corresponding points in our left and right eyes
and then we can use that to work out how far away from us something is.
When we have an individual eye on its own, we have some monocular cues
some monocular clues, as it were, that we can use
to find out depth, or at least to estimate depth
but true 3D only comes from two eyes.
In a single eye, you might have something like: the object is bigger than it was before
so it's coming towards us
or one object is passing our view faster than another
and that's parallax. And that gives us a clue that it's in front of something else.
Occlusion is an obvious one.
If something actually is in front of something else, we can make some reasoning about that.
So, our brains will take those monocular clues...cues, and
do something with them and work out what's going on.
But when we have two eyes, then we can do actual 3D depth perception.
The classic example is, of course, the Magic Eye things that were around in the 90s.
I'm not very good at seeing those.
I kind of go cross-eyed and it kind of works, but it's all a bit backwards.
But the idea there is that we trick our eyes into seeing slightly different images
and that gives us a perception of depth.
If we've got a stereo system, the main thing we have to know is where our cameras are.
Our brains know where our eyes are because they've learned it
but one's here and one's here, you know. Some people's eyes might be slightly further apart,
but your brain will account for this.
If we're going to do this mathematically using a computer, we need to know
where these cameras are.
If we've seen an object in one view
and then we go into the other view, we need to try and find corresponding points.
Without knowing where your cameras were, your search space is much larger.
You've got to look at the whole image. Maybe you get points confused.
Maybe there's a corner that appears multiple times
because it's like a book and it has four corners.
And then you've got to try and resolve which one's which.
And some of these features won't appear in both views because of occlusion.
So, if you take your left and right view of my hands,
you know, some of my left hand is going to be visible in one eye that isn't in the other eye
and that's a huge problem.
So, what we do is, we start with a process called camera calibration.
We have two cameras that are nearly next to each other
and we don't know exactly what their angles are
but we can find that out by using camera calibration.
We have to take the picture from both cameras at the exact same time
because otherwise the scene is going to have changed.
So, we'll assume we're taking pictures with the cameras at the same time,
something that isn't true of some visual reconstruction systems.
We take a picture of this board and we calibrate the positions of our cameras
and then we move the cameras and take a picture of something we're trying to reconstruct in 3D.
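To make the calibration idea concrete, here is a minimal NumPy sketch of the underlying estimation problem: recovering a camera's 3x4 projection matrix from known 3D points (like board corners) and their observed pixel positions, using the direct linear transform. This is a simplification of what real calibration toolkits do (they also estimate lens distortion and work from many board poses); the synthetic board, intrinsics, and function names here are all made up for illustration.

```python
import numpy as np

def estimate_projection_matrix(points_3d, points_2d):
    """Direct Linear Transform: recover a 3x4 projection matrix P
    from known 3D points and their 2D images (at least 6 needed).
    Each correspondence gives two linear constraints on P."""
    A = []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    # P (up to scale) is the null vector of A: smallest singular vector.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 4)

def project(P, X):
    """Project a 3D point through P and dehomogenise to pixels."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Synthetic "calibration board": known 3D corner positions.
rng = np.random.default_rng(0)
board = rng.uniform(-1, 1, size=(10, 3))

# A made-up ground-truth camera: intrinsics K, rotation R, translation t.
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1.0]])
R = np.eye(3)
t = np.array([[0.1], [0.0], [5.0]])
P_true = K @ np.hstack([R, t])

pixels = np.array([project(P_true, X) for X in board])
P_est = estimate_projection_matrix(board, pixels)

# P is only recovered up to scale, but reprojections should match.
reproj = np.array([project(P_est, X) for X in board])
print(np.max(np.abs(reproj - pixels)))  # tiny reprojection error
```

With noise-free synthetic data the recovered matrix reprojects the board essentially exactly; with real detections you would solve the same system in a least-squares sense over many noisy corners.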
So then we have a situation where we have one image here
on this side, our left view,
and we have an image here [marker sounds]
which is our right view
In our previous video on the Raytrix, we talked about the lens system in front.
We will do away with that just for simplicity's sake
and we'll say that these are pinhole cameras.
Because we're using a pinhole camera, we'll put the camera's origin somewhere behind
this image plane.
So, some object in the world projects its light down here
intersects our image plane and then goes into the camera origin like this.
We have the optical center of our camera, and any light rays
coming from this object here
are going to travel down this ray
intersect our image plane
and then go into the optical center of the camera.
And this will happen for any points in our scene that this camera can see.
We want to say, we've got a point on this image plane
Where did it come from? And the crucial problem is that it could have come from here
or it could have come from here, or here, or here, or anywhere along this ray.
And we don't know
and that's what we're trying to find out.
That's the depth problem.
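That ambiguity is easy to demonstrate in code. Here is a tiny sketch, assuming an idealised pinhole camera at the origin with focal length 1: every 3D point along the back-projected ray lands on the same pixel, so one view alone cannot tell us the depth.

```python
import numpy as np

def project(point, f=1.0):
    """Idealised pinhole camera at the origin looking down +z:
    a 3D point (X, Y, Z) projects to (f*X/Z, f*Y/Z)."""
    X, Y, Z = point
    return np.array([f * X / Z, f * Y / Z])

# Take one observed pixel (u, v) and walk along its back-projected
# ray X(lam) = lam * (u, v, 1): every candidate depth gives the
# same projection, which is exactly the depth problem.
u, v = 0.3, -0.2
candidates = [lam * np.array([u, v, 1.0]) for lam in (1.0, 2.5, 10.0, 100.0)]
for c in candidates:
    print(project(c))  # always (0.3, -0.2), whatever the depth
```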
Now, we also have an optical center for this camera,
which is here, and rays will be coming out and intersecting through these points.
So, if we knew
that this point in this image
was this point in this image
then we just project the rays. We find whether they intersect
and then we use simple triangulation, simple maths,
to work out how far away that position is.
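That triangulation step can be sketched in a few lines. This is a minimal linear-triangulation example in NumPy, assuming both cameras are already calibrated (we know their 3x4 projection matrices) and we have a matched pixel in each view; the intrinsics and baseline here are invented for the demo.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear triangulation: given a matched pixel in each view and
    the two 3x4 projection matrices, find the 3D point whose
    projections best agree with both observations (SVD least squares)."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenise

def project(P, X):
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# A made-up stereo rig: identical intrinsics, second camera
# shifted along x by a baseline of 0.5.
K = np.array([[700, 0, 320], [0, 700, 240], [0, 0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0], [0]])])

X_true = np.array([0.2, -0.1, 4.0])
x1, x2 = project(P1, X_true), project(P2, X_true)
print(triangulate(P1, P2, x1, x2))  # recovers roughly [0.2, -0.1, 4.0]
```

With perfect matches the rays intersect exactly; with noisy matches they only nearly intersect, which is why the least-squares formulation is used rather than a literal intersection.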
We don't know what point that is
because it's going to have changed, and it might not even be visible in this image.
That's one problem.
The search space is quite large.
Reliably finding the exact same point as this
in a different image when it might have rotated and changed slightly
is a lot of work in two dimensions.
And you've got to do that for every single pixel in this image.
You've got to find maybe one that tries to match in here.
That's a lot of work to do, so we don't tend to do that.
We use something
a nice observation called epipolar geometry
to try and make this a little bit easier.
If this is our intersection point
x1
and this is some object x,
all the way out there
and we're trying to find out how far away it is
We need to try and make our search in this image
a little bit easier
So what we do is, we imagine that this is part of a big triangle
coming out. So this is one corner of our triangle
This is another
or it could be this
This x is somewhere along here
and comes in through
this point. So,
Let me get a different pen
That may make things easier
We can draw a ray
that goes from this optical center
to here
and from this optical center to here
and from this optical center to here
to any of these
points
and they intersect this image
like this
and what this is is our epipolar line
so this line here
through these points is
all the possible projections
of this ray into this image
So now, we've simplified our problem
because we know where these cameras are
we can say, we're trying to find this
position x1 in this image
by knowing that it's going to be somewhere along here
we know it's going to be in this
line here
so we've got a limited set of pixels
we now have to look for
so what we need to do
is go for each of these pixels in a list
and say, "Which one of them looks most like this?"
and then we find it
and then we find our triangulation point
and we find out how far it is away
Is this because you know where the cameras are? Yes.
It's only possible because we know where the cameras are
if we don't
then we have to search through the whole of the other image
and it takes ages
one edge of our triangle's between the optical centers of the cameras
one is through this point and out into the world
and the other is some
value we don't know
which is going to be along this line
because it's just a flat triangle cutting through this image
which makes it a lot easier
to find out where these things are
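That epipolar constraint has a neat algebraic form: all of it is packed into a single 3x3 fundamental matrix F, where the line to search in the second image is just F times the point in the first. Below is a hedged NumPy sketch, assuming we know both projection matrices (the same invented rig as above) and using the standard construction of F from two camera matrices.

```python
import numpy as np

def skew(a):
    """Cross-product matrix: skew(a) @ b equals np.cross(a, b)."""
    return np.array([[0, -a[2], a[1]],
                     [a[2], 0, -a[0]],
                     [-a[1], a[0], 0]])

def fundamental_from_cameras(P1, P2):
    """F = [e2]_x P2 P1^+, where e2 is the epipole: the image in
    view 2 of the first camera's centre."""
    _, _, Vt = np.linalg.svd(P1)   # camera 1's centre spans P1's null space
    C1 = Vt[-1]
    e2 = P2 @ C1
    return skew(e2) @ P2 @ np.linalg.pinv(P1)

# Same made-up stereo rig: identical intrinsics, 0.5 baseline.
K = np.array([[700, 0, 320], [0, 700, 240], [0, 0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0], [0]])])
F = fundamental_from_cameras(P1, P2)

# For a 3D point's two projections x1, x2 we have x2^T F x1 = 0:
# F @ x1 is the epipolar line in image 2 that x2 must lie on.
X = np.append([0.2, -0.1, 4.0], 1.0)
x1, x2 = P1 @ X, P2 @ X
line = F @ x1
line = line / np.linalg.norm(line[:2])  # so x2 . line is a distance in pixels
x2h = x2 / x2[2]
print(x2h @ line)  # essentially 0: x2 sits on its epipolar line
```

So instead of searching the whole second image for a match, we only walk along the pixels near this line.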
what we will do, if we're writing a reconstruction algorithm
is that for every point in this image
and maybe we'll do it backwards
as well for completeness
for every point in this image
we will try to find a point along its particular
epipolar line
that best matches it
and then, of course, you can go much more complicated than
that. You can try and find
the global image map
between here and here
which is a combination of not only
the best feature matches
but also
you know, it needs to be nice and smooth
objects don't tend to go back and forth a lot
so you want the disparity to be smooth
so you have to bear that in mind
finding a point
in this image
based on another one from this image
is called the correspondence problem
and that's really the core of what
of what we're solving here
[typing sounds]
finding the occluded pixels is hard
and there are approaches based on this
where they not only try to find what we call the disparity map
the difference between
this x and this x
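The simplest version of that search is block matching along one scanline. Here is a toy NumPy sketch, assuming the images are already rectified so that epipolar lines are horizontal rows, using a sum-of-squared-differences window on a synthetic pair of rows; real disparity algorithms add smoothness terms, occlusion handling, and subpixel refinement on top of this.

```python
import numpy as np

def scanline_disparity(left, right, window=3, max_disp=8):
    """Block matching on one rectified scanline: for each left-row
    pixel, slide a small window along the right row and keep the
    horizontal shift (disparity) with the lowest sum of squared
    differences. Rectification makes epipolar lines horizontal,
    so this 1D search is all that's needed."""
    half = window // 2
    disp = np.zeros(len(left), dtype=int)
    for i in range(half, len(left) - half):
        patch = left[i - half:i + half + 1]
        best, best_cost = 0, np.inf
        for d in range(0, max_disp + 1):
            j = i - d  # the same point sits d pixels further left in the right view
            if j - half < 0:
                break
            cost = np.sum((patch - right[j - half:j + half + 1]) ** 2)
            if cost < best_cost:
                best, best_cost = d, cost
        disp[i] = best
    return disp

# Synthetic rectified pair: the right row is the left row shifted by 4,
# as if the whole scene sat at one depth.
rng = np.random.default_rng(1)
left = rng.uniform(0, 255, size=40)
right = np.roll(left, -4)
print(scanline_disparity(left, right)[5:-5])  # mostly 4s away from the edges
```

Larger disparities mean nearer objects, so once the cameras are calibrated this map converts directly into depth via the triangulation above.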
