[ Music ]
>> Good morning.
Today I'm going to give two lectures before we go to the workshops. The first one explains a little of the empirical modeling we used in facial reconstruction, for better prediction of markers. So, I will give a quick outline here. There is some mathematics here, but I will try to explain it as best as I can, and if you have questions, please go ahead and ask me.
So, here, basically, is the outline. I'm going to explain empirical models versus theoretical models; then modeling and machine learning, the history, the applications, the types; and then regression models, which are basically parametric and nonparametric modeling.
The reason we wanted to explain the history, and a little bit about the mathematics that goes into the next lecture, is to give you an idea of where we came up with the numbers used to help in the facial reconstruction.
So, empirical statistics is calculating probability or information about an event from past experimental data; it is determined by data from an actual experiment. Theoretical statistics, on the other hand, is calculating probability or information about an event based on a sample space of equally likely outcomes, determined by finding all the possible outcomes theoretically and calculating how likely the given outcome is. That's the difference between the two. For the empirical approach, you really need to have an experiment; you get the output from this experiment and use it in your models.
So, machine learning is a term that I want to explain here, because the kernel regression that we use in the facial reconstruction has strong roots in machine learning, and that's in computer vision and some of the work we do. Machine learning is basically learning changes in a system that are adaptive, in the sense that they enable the system to do a task, or tasks drawn from the same population, more effectively the next time.
The whole idea of machine learning, as you can see, is that it is always driven by empirical data from the real world, and the goal is to teach the computer how to use this data to predict future data or to recognize certain events or objects. When I say teach the computer, I mean algorithm- and software-driven. The computer by itself will not learn; of course, you have to write the programs for that.
So you have the world, you have the prediction, and you have a new event; then basically you can predict what the new event is. For example, if you teach your computer some patterns, as in this example here, you learn the previous patterns, and then when something new comes in, your software should predict whether it's a cat or something else. That's basically what we mean by machine learning.
We actually did this in some of my work. The slide on the top left is some of the work I have done on the gender knee, where we introduced the concept that men and women differ, in an orthopedic industry which for a while said no; they had one implant, or a number of sizes, that fits both men and women. We proved that this is wrong by machine learning, by experimental data. We proved that the medial, anterior-posterior, and lateral dimensions are different. That was a simple example.
As you can see, the flow from learning here was by model statistics, and the two peaks mean that they are different. So, the prediction model takes this data in and then predicts sizes for a new population or a new person. It's based on the machine learning, and then you predict.
So, machine learning is optimizing a performance criterion using example data or past experience. The role of statistics is inference from a sample; the role of computer science, really, is the part I'm calling the computer here: efficient algorithms to solve the optimization problem, and representing and evaluating the model for inference.
Machine learning is used when human expertise is absent, for example navigating on Mars; that's all machine learning. It is used when humans are unable to explain their expertise, as in the case of speech recognition, vision, and language. It is used when the solution changes in time, like the stock market; all these prediction models are machine learning. It is used when solutions need to be adapted to a particular case, like biometrics. And it is used when the problem size is too vast for our reasoning capability, for example calculating page ranks. These are examples of how you use machine learning.
Some of the examples a lot of you have probably seen or know before: looking at patterns of ears or fingerprints; it's used sometimes in imaging, and in ultrasound for attenuation. Here also is the case of using infrared cameras to look at veins; every person has his or her own pattern, and it's used in biometrics. These are some of the very obvious applications of machine learning.
Other applications you may not be aware of use the same science: retail, with market-basket analysis and customer relationship management; biometrics, used all the time, with voice recognition, fingerprints, and iris; finance, with credit scoring and fraud detection; manufacturing, with optimization and troubleshooting; medicine, in medical diagnosis; telecommunications, of course, optimizing service quality; bioinformatics, the famous DNA sequencing and gene expression that a lot of you know, trying to understand the patterns; and web mining for search engines, like the big Google. All the big search engines are using the same methods, actually.
The types of machine learning: it can be divided into supervised, reinforcement, and unsupervised learning. In the supervised case you have classification, where you supervise the classification as an expert by providing certain labels because you know the problem, and you have regression. Unsupervised is where you really don't have the expert, and you try to understand from the data itself.
In unsupervised learning you have clustering, association, and then dimensionality reduction, which is basically for when you have a huge dataset. For dimensionality reduction, I mentioned yesterday the principal component analysis: you take a large amount of data and you want to reduce its dimensions so you can put it in a form that you can run statistics on.
Reinforcement learning is basically situation and reward. The examples here: in a chess game, building software, the reward is winning the game at the end; in a tennis game, the same, with a reward for each point scored; and in the case of dog training, a treat with every good deed. That's what we call reinforcement learning. It's not part of the work we're doing, but those are some of the main examples.
Unsupervised learning: no labels or feedback, no expert, basically. It studies how input patterns can be represented to reflect the statistical structure of the overall collection of input patterns. No outputs are used, unlike in the case of supervised learning and reinforcement learning. The outputs here, you put them into clusters and you try to get density estimation.
The knee example I mentioned earlier was actually unsupervised learning, when we wanted to know the sizes. We did not have a preconceived idea of what the sizes are. We didn't look at what implant sizes the companies have, because I knew those were not correct. So we did some clustering here, completely unsupervised.
The clustering here can be hierarchical, and we will get to some examples: hard clustering like K-means, and soft clustering like fuzzy C-means. Hierarchical clustering is used more in gene-expression types of experiments.
In the case of hard clustering like K-means, you decide that you want the data divided into some number of clusters, and according to the distances between the data points, it will form that number of clusters for you. Soft clustering, an example of which was mentioned early yesterday, uses fuzziness, and is called fuzzy C-means. You don't know the boundaries exactly between the data, and there's some fuzzification going on, but accordingly, the algorithm will try its best to cluster the data.
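To make the hard-clustering idea concrete, here is a minimal one-dimensional K-means sketch; the data values and the two-cluster setup are made up for illustration and are not the lecture's actual implant measurements.

```python
import random

def kmeans_1d(data, k, iters=20, seed=0):
    """Minimal 1-D K-means: assign each point to its nearest
    centroid, then move each centroid to the mean of its points."""
    random.seed(seed)
    centroids = random.sample(data, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in data:
            # hard assignment: the nearest centroid wins the point
            i = min(range(k), key=lambda c: abs(x - centroids[c]))
            clusters[i].append(x)
        # empty clusters keep their old centroid
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# two well-separated groups, standing in for two implant-size clusters
sizes = [40.1, 41.0, 40.5, 39.8, 55.2, 54.8, 55.9, 56.1]
centers = kmeans_1d(sizes, 2)
```

The algorithm converges to one centroid near each group mean, which is the "optimum number of sizes" idea in miniature.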
The clustering application in our case, really, was the optimum number of implant sizes. As you can see in the plots here, this is a cross-population comparison showing that the knee differs between Black, White, and Asian populations. That's an example of K-means, where from the data we found a difference in sizes between ancestral populations, okay. [Laughter] We call them ethnic.
Another example, using fuzzy C-means here, was segmentation; this is in the other grant with NIJ.
>> We tried to segment the skull thickness, and here we used fuzzy C-means because sometimes it's hard to actually go and segment manually. With the help of fuzzy C-means, we could actually calculate, or segment, the thickness of the skull.
Supervised learning, on the other hand, needs an expert. It is divided into two phases, and a lot of you are probably aware of them; they appear in a lot of applications. The first phase is training. Because you have an expert, you need to train on the data, and for a dataset the expert provides labels.
So the expert will come in; that happens a lot, actually, in medical imaging or diagnosis, and of course with the FDA sometimes. A lot of nice work was done on cancer detection from mammography, where the expert will look at regions and label whether the suspicious regions were found. But the FDA here does not really like the computer or the software to make decisions, so that's why it's very hard to standardize this.
But this is an example where the expert will come and he or she will put the labels, and the goal is to find the most probable model that generalizes the training data. Then testing uses the inferred model to predict the label of a new point. So the whole idea is that you train, and then, if you are faced with a new problem, you are able, based on this training, to predict for the new point. Of course, a lot of problems come with this, which I'll explain later.
Examples of supervised learning: handwriting recognition, which is used a lot, with data from pen motion. Character recognition is another example, where a scanned document, as an image, is turned into words; that's optical character recognition. Disease diagnosis, as I said, where properties of patient symptoms could be used, too. Face recognition is very famous and used a lot in security and other areas, where you have a picture of a person's face and then the person's name, similar to the way some anthropologists try to use face recognition to identify a person. It's used also in the computer vision and pattern recognition area, where you can actually identify a person among thousands from a camera; that's another face recognition application, a little bit harder than having a picture. And then of course spam detection from email: all of you see spam every day, and there is software out there to detect spam. That's also another example.
Now, classification is very important, and classification, as simple as it is, is trying to separate two datasets from each other. So, in this case, the example is ancestry detection. Here we have the nasal breadth and the nasion-basion measurements. If you look at this carefully, you find that the blue comes into the red, and you need to find a model that can separate them; this is basically a model that can really go and separate the two populations.
The problem is, we run into a problem here called overfitting. That's very specific: the more complicated you get with separating the data like that, the more you can run into overfitting. Overfitting is when you tailor your model completely to your own problem. If something else comes along, the system fails. It works 99 or 100 percent on your data, but take it outside and it fails.
So, among the classification techniques, of course, there are linear classifications and then nonlinear. Examples of linear classifiers include perceptron learning, which is like the early neural networks. You are very familiar, in the anthropology or physical anthropology community, with linear discriminants. There are also nonlinear discriminants, and then the newest, support vector machines; this is the latest, and I'll give an example here of the use of classification techniques. In the nonlinear, you have the backpropagation neural network, which is where the neural network, if you've heard the term, comes in; then we have radial basis functions, and then nonlinear support vector machines and decision trees.
So, here, for example, are some of the classification applications that we have. This is some work Dr. Emam here did way back with some other people on chromosome classification, and then we did the patellar sexing that we spoke about yesterday; that's another example of classification. Kinematic classification, from my work on implant design, where we can actually classify the different types of motion according to the design of the implant. In our work with NIJ on skull sexing, all of these use classification. The density mapping we spoke about yesterday also uses some sort of classification to identify the different regions exactly. And then we also use it in image enhancement, where you have a noisy image coming in and we can basically denoise the image. All of these applications use classification.
Now, the regression portion of the empirical modeling. The goal is to find a functional description of the data in order to predict values for a new input. That's the whole idea of regression: predicting for a new input. So, given data, you find a prediction function f(x) that can predict the value of y from x; it's as simple as that. As examples of regression, all of you know linear regression, and there is also nonlinear regression, where you feed the data in.
Now, regression is also divided into parametric and nonparametric. In the parametric, which as we said requires the expert, you have neural networks, support vector machines, and linear regression. In the nonparametric, where basically there is no expert, you have nearest neighbor, weighted average, kernel regression, and locally weighted regression. Our work in the next lecture, and basically the software we're going to use, is based on kernel regression.
So the parametric category, as we said, includes linear regression, neural networks, support vector machines, and other techniques that map relationships [laughter] in data by optimizing different parameter values, using a dataset that is similar but not exact. Once the parameters for the model are identified, the training data are no longer used and the model's prediction equation is set. As I said earlier, the problems you have are overfitting, and that in the case of new data the model has to be retrained. That's the trap you can run into with parametric data fitting.
We'll give an example here of something like a neural network. We do not use neural networks, but we're giving this as an example of a kind of parametric, training-dataset approach. So, a single perceptron is linear, and the example we give here is the AND/OR problem: if you have a one ANDed with a zero, then the output is zero, okay. AND and OR are very simple logic problems. Yet as simple as this is, the single perceptron, which is linear, failed in a case like XOR, the exclusive or.
The problem is solved using a nonlinear technique like multilayer backpropagation, which is nonlinear and consists of multiple layers. I'll give the example very soon: it has an input layer, an output layer, and hidden layers of neurons, which does a few more things but solves the problem. The whole idea is to create multiple hyperplanes when you go beyond three dimensions.
You have a lot of data. If you have two dimensions, you can separate it by a straight line; if you have three dimensions, by a plane. Well, if you have a four-dimensional problem, we call it a hyperplane. It's something you cannot visualize, but we can relate it back to two and three dimensions.
The single perceptron came from neural networks, which were big from the '60s through the '80s and '90s; now the support vector machine is coming. But basically the whole idea was exactly modeling a neuron, as simple as that. You have the output coming from a number of inputs, and then you have a threshold, exactly like how a single neuron works.
And then basically, in the case of the AND problem here, the problem is separating the classes. If you have 1 AND 0, the output is 0; 1 AND 1 is 1; so basically it's solving the problem. All right, but this could not be solved in the other case down here, the exclusive or, where with two inputs you take the opposite when they match: if we have 1, 1 the output is zero, and with 1, 0 the output is 1. So it did fail the linear model.
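The AND/XOR story can be sketched directly; this toy perceptron, with made-up learning rate and epoch count, learns AND but can never fit XOR, since no single line separates those classes.

```python
def train_perceptron(samples, epochs=20, lr=0.1):
    """Single linear perceptron: output 1 if w1*x1 + w2*x2 + b > 0."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w1 * x1 + w2 * x2 + b > 0 else 0
            err = target - out
            w1 += lr * err * x1   # classic perceptron update rule
            w2 += lr * err * x2
            b += lr * err
    return lambda x1, x2: 1 if w1 * x1 + w2 * x2 + b > 0 else 0

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_fn = train_perceptron(AND)
and_ok = all(and_fn(*x) == t for x, t in AND)  # separable: learnable
xor_fn = train_perceptron(XOR)
xor_ok = all(xor_fn(*x) == t for x, t in XOR)  # not separable by one line
```

`and_ok` comes out true and `xor_ok` false, which is exactly the failure that motivated multilayer networks.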
Then they actually went to multilayer backpropagation, where they keep adjusting the weights and training the system until the problem is solved. This is a classification problem, as simple as that; what solved the complicated case was the multilayer backpropagation.
So the data is presented to an input layer and then passed to a hidden layer. The problem with neural networks, and maybe it's why I haven't seen a lot of work combining anthropology and neural networks, except one paper we did on the patella where we used a neural network plus discriminant analysis, is that the neural network requires expertise: how many hidden layers? That comes from experts. A lot of parameter-tuning work comes from the expert.
>> So that's why it's not universally used in a lot of applications, and the support vector machine, which is based on more rigorous mathematics, is actually coming to replace neural networks. Support vector machines have a strong mathematical basis, which is basically this: if you look at these two datasets, you want to find the best plane that really separates the two datasets.
So, in the linear case, you find a plane that maximizes the margin, and the support vectors are the points which the margin pushes against. So we have a plane separating these two datasets. It's easier in the linear case; in the nonlinear case we use a different method, but support vector machines are very reliable and strong in separating datasets.
Now, here's the example in a nonlinear case. In this case, how do you separate this dataset, the blue from the red, unless you draw a circle around it? But again, that's not the linear case. So there is something called the kernel trick. The kernel trick takes the problem from one domain, in which it is inseparable, and you multiply by a certain function. It takes the problem into a nonlinear dimension, but a separable one. That's the trick here. It really maps the data from here into this domain, and then you have a plane that can separate the red from the blue. That's, in simple terms, how you handle inseparable data.
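The kernel trick can be illustrated with an explicit feature map rather than a real kernel function; the mapping and the data below are hypothetical, lifting 2-D points by their squared radius so that a single threshold, a "plane" in the lifted space, separates an inner cluster from a surrounding ring.

```python
def lift(point):
    """Explicit version of the kernel-trick idea: map (x, y) to a
    third coordinate z = x^2 + y^2, the squared distance from origin."""
    x, y = point
    return x * x + y * y

# an inner "blue" cluster inside a ring of "red" points: no straight
# line in 2-D separates them, but in the lifted z-coordinate a plane does
inner = [(0.1, 0.2), (-0.3, 0.1), (0.2, -0.2), (0.0, 0.0)]
outer = [(2.0, 0.0), (0.0, -2.1), (-1.9, 0.5), (1.5, 1.5)]

threshold = 1.0  # the separating "plane" z = 1.0 in the lifted space
sep = all(lift(p) < threshold for p in inner) and \
      all(lift(p) > threshold for p in outer)
```

A real SVM never computes the lifted coordinates explicitly; the kernel function gives the inner products in that space directly, but the geometric idea is the same.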
Some of my students, including Dr. Emam here, and others have used this work: for example, we wanted to know from MRI data whether a point is cartilage or bone, which is very difficult in MRI. So, we used this to tell whether a given point is cartilage or bone.
So, that's that. Now, in nonparametric regression, the actual training data are used to make future predictions, with the training data stored in a memory matrix. Rather than modeling the whole input space with a parametric model such as a neural network or linear regression, local techniques construct a local model in the immediate region of the query. These models are constructed on the fly.
The whole idea here, compared with parametric regression, is this: the whole training takes a lot of time and experts, so what we want is a much simpler approach, where you have something like a memory. The kernel regression that we use in our work, they call it lazy training, because it really does not require a lot of training beforehand; basically, it can predict on the fly. That's the whole concept of it.
And really, the concept came from working on this NIJ grant with a number of people, including Dr. Wesley Hines, who was working on this regression modeling in nuclear engineering. They used it for fault tolerance in nuclear reactors. They wanted something quick, because you can't train; kernel regression was one of the methods used in fault tolerance, and we actually took it, with him, applied it to the problem here, and it worked very, very well.
When a query is made, the algorithm locates training input patterns in its vicinity and performs a weighted regression with the similar observations. The observations are weighted with respect to their proximity to the query point. In order to construct a robust local model, one must define a distance function to measure what is considered local to the query, implement the locally weighted regression, and consider smoothing techniques such as regularization.
Basically, that's a lot of talk, but the point is that when you're using these models, you have to define a measure, such as the Euclidean distance between two points. You have to give the model something it can work on, and then basically the model takes over from that point.
The types of nonparametric regression, as I explained, are nearest neighbor; weighted average, where we take a number of points, weigh them, and form an average; locally weighted regression; and kernel regression.
Here is an example. If you look at this, it is a linear function, y = x, and a nonlinear function, 4x minus (1/25)x squared, and basically we're going to use these as examples for the different kinds of models we explained, to actually predict these shapes. So we're not using math here; we're not solving it mathematically. We take a sample of points and see whether, if we feed these models, they can recover the curves. They are mathematically simple, but for prediction they are not so simple. Just for this example, we'll take a sample of points, like 0, 15, and 30, feed the model with that sample, and test the model with the complete dataset.
So, first we try nearest neighbor, which is one of the models. If we want to predict the value of a potentially noisy data point, we estimate it with the nearest neighbor's saved data. The nearest neighbor can be calculated using a distance measure, as I mentioned, the Euclidean distance, and we will compare the function reconstructed using the nearest neighbor.
As I said, this is a very crude method. The nearest neighbor takes the distance between the points, and this is the shape of the prediction: you can see the straight line becomes stepwise. But it did kind of capture the shape of the function.
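A minimal sketch of 1-nearest-neighbor prediction; the training sample uses a stand-in function (y = x squared), not the lecture's curves, and the stepwise behavior comes from returning the saved y of whichever stored x is closest.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """1-NN regression: return the y of the training point whose x
    is closest (Euclidean distance, which in 1-D is just |x - q|)."""
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - query))
    return train_y[i]

# hypothetical training sample from a stand-in underlying curve
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [x * x for x in train_x]

# every query between 1.5 and 2.5 returns the same stored value,
# which is what produces the stepwise prediction curve
pred = nearest_neighbor_predict(train_x, train_y, 2.4)  # nearest is x=2
```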
Now, weighted average; don't get bothered by the math, but basically the output is weighted more toward nearby points. So instead of completely independent points, the prediction is weighted by the neighbors with respect to their distances. In this case, y-hat is the prediction of the point, and you weight it with information from the neighboring points. So you get a little bit better prediction here: still rough, but it captures the overall shape of the line, even if it is still a little rough.
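The weighted-average idea can be sketched with inverse-distance weights; the data are the same hypothetical samples as in the previous sketch, and this particular weighting scheme is one common choice, not necessarily the exact one on the slide.

```python
def weighted_average_predict(train_x, train_y, query, eps=1e-9):
    """The prediction y-hat is an average of all training y values,
    weighted by inverse distance so nearby points count more."""
    weights = [1.0 / (abs(x - query) + eps) for x in train_x]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, train_y)) / total

train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.0, 1.0, 4.0, 9.0, 16.0]  # hypothetical samples of y = x^2

y_hat = weighted_average_predict(train_x, train_y, 2.5)
```

Unlike 1-NN, the prediction at 2.5 now falls between the two bracketing samples (4 and 9) instead of snapping to one of them, which is why the reconstructed curve looks smoother but still rough.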
Locally weighted regression is another way: use linear regression and solve the following linear model, where y is the vector of sampled response variables, X is the matrix of predictor variables whose rows are the sampled observations, and the vector of regression coefficients linearly combines the predictors to form the response. So again, here you solve the weighted least squares form of the regression equation for the optimal estimates; here is some of the math behind the weighted regression. The important idea is that it estimates the coefficients themselves: you have the points, you get estimates of the coefficients, and then you feed them into the equation. That basically explains the last curve and why y-hat is still so noisy.
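A hedged sketch of locally weighted linear regression in one dimension, solving the weighted least-squares normal equations around the query; the Gaussian weighting and the test data are illustrative assumptions, not the slide's exact formulation.

```python
import math

def locally_weighted_linear(train_x, train_y, query, h=1.0):
    """Fit a weighted least-squares line around the query point:
    Gaussian weights of distance, then solve the 1-D normal
    equations for slope b and intercept a, returning a + b*query."""
    w = [math.exp(-((x - query) ** 2) / (2.0 * h * h)) for x in train_x]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, train_x)) / sw  # weighted means
    my = sum(wi * yi for wi, yi in zip(w, train_y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, train_x))
    sxy = sum(wi * (xi - mx) * (yi - my)
              for wi, xi, yi in zip(w, train_x, train_y))
    b = sxy / sxx           # weighted least-squares slope
    a = my - b * mx         # weighted least-squares intercept
    return a + b * query

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]  # hypothetical, exactly linear data

y_hat = locally_weighted_linear(xs, ys, 2.5)
```

On exactly linear data the local fit recovers the true line, so the prediction at 2.5 is 6.0; on noisy data the locally fitted coefficients wobble from query to query, which is the noisiness just mentioned.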
Now we move to the kernel regression that we used. In kernel regression you basically have input examples; this is not heavy training but, as I said, local training. So you have input examples X and output examples Y, and for a query, the input point x, you calculate the distances using the Euclidean distance. You have a kernel function, which could be Gaussian-weighted, and then you use the weighted average to predict the output y. This basically generates the curve in the figure that I showed.
As I said, the most common kernel function is the Gaussian kernel, and that's what we used. D here is the distance, the Euclidean distance; y-hat is the predicted output; and the w are the weights.
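A minimal Nadaraya-Watson-style sketch of the Gaussian kernel regression just described; the sample data are made up, and the bandwidth parameter is the Gaussian's standard deviation.

```python
import math

def kernel_regression(train_x, train_y, query, bandwidth):
    """Gaussian kernel regression: weights w_i = exp(-d_i^2 / (2 h^2))
    from the Euclidean distances d_i, then the weighted average
    y-hat = sum(w_i * y_i) / sum(w_i)."""
    weights = [math.exp(-((x - query) ** 2) / (2.0 * bandwidth ** 2))
               for x in train_x]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)

train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.0, 1.0, 4.0, 9.0, 16.0]  # hypothetical samples of y = x^2

y_hat = kernel_regression(train_x, train_y, 2.5, bandwidth=0.5)
```

Because the Gaussian weights fall off smoothly with distance, the predicted curve is smooth rather than stepwise, which is the improvement over the cruder methods.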
So, here is the difference from what we've seen before: when we used the kernel regression, we had a much, much smoother and better output. But with everything I said, again, there is no free lunch here. With kernel regression you have to worry about what we call the bandwidth. The bandwidth, if you look at the Gaussian, is basically the standard deviation.
So it's again an optimization problem. If you increase the bandwidth too much, you end up with a prediction that does not follow the function; you end up with something like that. If you reduce it too much, you have the other problem of not getting good results. So it's an optimization on the standard deviation; we call it the bandwidth.
>> The bandwidth should always be chosen to be large enough to cover the neighboring points; that's what we're talking about. We found, for example, in our problem that around 2.6 was an acceptable bandwidth to use with the data. Of course you have some margin of error, which you expect, but the error is very small compared to the other methods.
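The bandwidth trade-off can be seen numerically: on a hypothetical quadratic dataset, a small bandwidth tracks the curve while a huge one over-smooths toward the global mean. The numbers here are illustrative and have nothing to do with the lecture's 2.6 figure, which belongs to its own data.

```python
import math

def gaussian_kernel_predict(train_x, train_y, query, h):
    """Gaussian-weighted average; the bandwidth h is the kernel's
    standard deviation and controls how local the fit is."""
    w = [math.exp(-((x - query) ** 2) / (2.0 * h * h)) for x in train_x]
    return sum(wi * yi for wi, yi in zip(w, train_y)) / sum(w)

xs = [float(i) for i in range(9)]   # 0..8
ys = [x * x for x in xs]            # hypothetical curve, not the lecture's

true_val = 2.5 ** 2                 # 6.25
good = gaussian_kernel_predict(xs, ys, 2.5, h=0.5)   # local: follows curve
huge = gaussian_kernel_predict(xs, ys, 2.5, h=50.0)  # near-global average
err_good = abs(good - true_val)
err_huge = abs(huge - true_val)
```

With `h=50` the weights are nearly uniform, so the prediction drifts toward the mean of all the y values and stops following the function, exactly the over-smoothing failure described above.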
Kernel regression models are also divided into what we call the heteroassociative model, where the number of inputs does not equal the number of outputs in the model; in this case you are looking at a certain number of inputs, and the number of outputs may not be the same. The other one, the inferential model, is where the number of outputs from the model is one: basically you are looking at a certain parameter, looking at one thing, with multiple inputs.
And that's basically the math behind what we're going to explain in the next lecture. Basically, we found that the kernel regression was superior to the other methods being used, and it gave us better predictions than other methods like neural networks or parametric methods. So that's basically the whole point.
The take-home message is that kernel regression is a very effective tool when introduced into a problem like predicting the soft tissue thickness, helping in the facial reconstruction.
Thank you.
Any questions?
>> A little too much scary math
for people first
thing in the morning?
[ Laughter ]
>> Sorry, we tried to minimize the math as much as we can, but taking it out completely, well, I didn't want it to look like a magic box. It's not. It uses well-known methods, but put together in a different way.
I think the new thing here is using something like kernel regression, which lives in hardcore computer vision or nuclear engineering, to estimate the soft tissue. The new, and I think novel, idea is to get away from training datasets, experts, and neural networks, which have been around for 30 years, and to use something completely different that, as you will see in the next lecture, predicts the thickness on the fly.
Now.
>> I have a question for you.
Sorry to interrupt.
I've seen in the last couple of years that physical anthropology and forensic anthropology, and I think paleoanthropology as well, have all of a sudden discovered neural networks. So, even though they've been around for so long, you would suggest that we not take that route?
>> Yes. I'll tell you why.
In anthropology, by the way, I'll give you an example, because I've been working with the guys at UT since 2003, about 7 years now. I've seen very sophisticated statistics, but it's like we're talking different languages. For example, linear discriminants are huge in anthropology. A linear discriminant is the ABCs to us, because we're beyond that; all our problems are nonlinear. So beyond linear discriminants, we use something called quadratic discriminants.
>> And we've used this too.
>> Yeah. And quadratic is more in the nonlinear. So there is around a 20-year lag between the hardcore engineering methods being applied across applications and their moving to another area like physical anthropology.
So, neural networks are a problem you're going to face, and I think I mentioned the paper on the patella. I did compare the neural network and the linear discriminant, and both of them give good results. The problem is, as I said, with neural networks, and I still use them sometimes, but I'm moving to support vector machines, there is a lot of tuning up front. You rely a lot on the expert, and the way the expert knows which parameter coefficients to use is by running a number of experiments. So you have to have that, and the common wisdom, with neural networks.
If, for example, you're going to use a neural network, it's machine learning, so you're going to train; we call it training your model. So you have to take your experimental data and divide it into, say, 25 percent training and 75 percent that the system has not seen before. But in order to do this right and get decent results, your training dataset must be huge.
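The 25/75 split mentioned here can be sketched as follows; the fractions and the stand-in data are illustrative.

```python
import random

def split_dataset(samples, train_frac=0.25, seed=0):
    """Shuffle, then split into a training portion and a held-out
    test portion the model has never seen."""
    random.seed(seed)
    shuffled = samples[:]          # copy so the caller's list survives
    random.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    return shuffled[:n_train], shuffled[n_train:]

data = list(range(100))  # stand-in for measured specimens
train, test = split_dataset(data)
```

With only a handful of specimens per group, the training portion of such a split becomes far too small for a neural network, which is the point being made.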
Now, in some cases with crania, you have 5 or 10 skulls from Hispanics. For the number of points we're using, for example, let's take your work: if you have all these points that you're collecting and you want to use neural networks and training, you must have 10 times that number of data points in your dataset, so--
>> Right. It's just unrealistic
[simultaneous talking].
>> Unrealistic, which is why, when you have that, you have to use something else, and the something else is what we call the lazy training here, the kernel. I'm fascinated by it, by the way, because it does not require this and it gives decent results, though again you expect some errors here.
So, that's my point. I mean, they are discovering neural networks because some people started talking about them, and we were among the people who used neural networks in anthropology, but I think right now, you can try them, but you should move to the current methods being used in the computer vision, pattern recognition, and machine learning communities.
[ Pause ]
>> Any other questions?
So, to the point that kernel regression is the method we use in the prediction, I'll give a small introduction here for the next lecture; I have 5 minutes. The whole idea is not just that we're going to use a different method in reconstructing soft tissue to help the identification; we're saying we can give multiple scenarios. The whole idea with the kernel regression is that we tie it to scenarios; for example, the body mass index turned out to be crucial in the soft tissue thickness prediction.
With the tables that exist, we could even use that as an input, or actually give not one but multiple scenarios. So if you find a skull and you don't have any other information, and the artist is going to render or build up the soft tissues, you help the artist, the forensic artist, by giving multiple scenarios, like saying, "Okay, what happens if this person has a certain body mass index?" Then he uses this software to get different thicknesses. So basically he can render, or build, a model under multiple scenarios, and that gives more leverage in the identification.
And we'll see this in the next lecture, okay?
[ Music ]
[ Applause ]
