Hello, and welcome!
In this video, we'll provide an overview of
Recursive Neural Tensor Networks, as well
as the natural language processing problems
that they're able to solve.
Sentiment Analysis is the task of identifying
and extracting subjective information, like
emotion or opinion, from a source material.
For example, this might involve analyzing
a Twitter feed to determine which tweets express
a positive feeling, which express a negative
feeling, and which are neutral.
In order to classify sentences into different
sentiment classes, we'll need a dataset to
use for training.
One potential dataset is the Stanford Sentiment
Treebank.
Each data point is the syntax tree of a Rotten
Tomatoes review.
The tree itself and all the subtrees are labeled
with a sentiment value from 1 to 25.
25 is the best possible review, while 1 is
the worst.
The dataset was created by Stanford researchers,
who utilized Amazon's Mechanical Turk platform
in order to assign values.
Recursive neural models can be used for the
sentiment analysis problem.
These types of models are characterized by
their use of vector representations.
Vectors are used to represent words, as well
as all the sub-sentences in an input's
syntax tree.
The word representations are trained with
the model, and the representations of sub-sentences
are calculated with a compositionality function.
To calculate the sub-sentences' representations,
we apply the compositionality function bottom-up
according to the input's parse tree.
All vectors are fed to the same softmax classifier
to determine the sentiment.
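The bottom-up recursion and the shared softmax classifier can be sketched in a few lines of NumPy. Everything here is illustrative: the dimensionality, the tiny vocabulary, and the randomly initialized parameters stand in for quantities that would actually be learned during training.

```python
import numpy as np

d = 4  # embedding size (illustrative choice)
rng = np.random.default_rng(0)

# Hypothetical parameters: in a real model these are trained, not random.
vocab = {"not": 0, "very": 1, "good": 2}
E = rng.standard_normal((len(vocab), d)) * 0.1   # word vectors
W = rng.standard_normal((d, 2 * d)) * 0.1        # composition weights
Ws = rng.standard_normal((5, d)) * 0.1           # softmax weights, shared by every node

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def node_vector(tree):
    """Compute a node's vector bottom-up: leaves look up word embeddings,
    internal nodes combine their two children with the compositionality
    function."""
    if isinstance(tree, str):                     # leaf: a single word
        return E[vocab[tree]]
    left, right = (node_vector(t) for t in tree)  # recurse on children first
    return np.tanh(W @ np.concatenate([left, right]))

# Parse tree for "(not (very good))"; every node, word or sub-sentence,
# is fed to the same softmax classifier to get a sentiment distribution.
tree = ("not", ("very", "good"))
sentiment = softmax(Ws @ node_vector(tree))
print(sentiment.shape)  # (5,)
```

The key point the sketch shows is the parameter sharing: one composition function and one classifier are reused at every node of the parse tree.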
The choice of compositionality function is
important, so we'll present three different
types of recursive models, each with a different
function.
The first model we'll look at is the basic
Recursive Neural Network.
To compute our word composition, we start
with our vectors that we want to combine,
which we'll call "b" and "c".
We form a "two d"-dimensional vector by concatenating
"b" and "c".
This new vector is multiplied by the "d" by
"two d"
weight matrix "W".
"W" is the model's main training parameter.
Then a nonlinearity is applied element-wise
to the resulting vector.
In this case, the nonlinearity is the hyperbolic
tangent function.
As a brief note, we've omitted the bias for
simplicity.
Other models use this compositionality function,
like the recursive autoencoder, and recursive
auto-associative memories.
As you can see, the words only interact implicitly
through the nonlinearity, so the compositionality
function may not be consistent with linguistic
principles.
The model also ignores reconstruction loss,
since the dataset is large enough to compensate.
Now let's move on to Matrix-Vector Recursive
Neural Networks.
This type of model is a linguistically-motivated
improvement over the basic recursive neural
network.
The big change is that now every word is represented
by both a vector and a "d" by "d" matrix.
The compositionality function that you see
here takes four objects.
Lowercase "b" and "c" are the word vectors,
while the uppercase "b" and "c" are the respective matrices.
Lowercase "p1" is the resulting vector, while
uppercase "P1" is the respective matrix.
Just like with basic recursive neural networks,
a matrix "W" is multiplied with a vector built
from the words' representations.
But in this case, that vector depends much
more directly on the relationship between
the two input words, because each word's matrix
first transforms the other word's vector.
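Here is a minimal NumPy sketch of one MV-RNN composition step, with randomly initialized stand-ins for the trained parameters. The matrix-composition weights are named `W_M` here purely for illustration.

```python
import numpy as np

d = 3
rng = np.random.default_rng(2)

# Each word carries both a vector and a d-by-d matrix (all trained).
b, c = rng.standard_normal(d), rng.standard_normal(d)
B, C = rng.standard_normal((d, d)), rng.standard_normal((d, d))
W = rng.standard_normal((d, 2 * d))    # vector-composition weights
W_M = rng.standard_normal((d, 2 * d))  # matrix-composition weights (name assumed)

# The words interact directly: each word's matrix transforms the *other*
# word's vector before the usual multiply-by-W-and-tanh composition.
p1 = np.tanh(W @ np.concatenate([C @ b, B @ c]))
# The parent's matrix is a linear mix of the children's stacked matrices.
P1 = W_M @ np.vstack([B, C])
print(p1.shape, P1.shape)  # (3,) (3, 3)
```

Because every vocabulary word needs its own d-by-d matrix, the parameter count grows with vocabulary size, which is exactly the problem raised next.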
The problem with this model is that the number
of trainable parameters becomes too large
as the vocabulary size increases.
The Recursive Neural Tensor Network, or RNTN,
uses a powerful fixed-size compositionality
function that only takes the words' vectors
as arguments.
The model is not parameterized by per-word
matrices; instead, it adds a "two d" by "two d"
by "d" tensor that is used in the function.
This tensor is also trained with the model.
Each of the "d"
slices captures a different type of composition,
so intuitively, it is more capable of learning
than the basic recursive neural network.
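The RNTN composition described above can be sketched as follows: each of the d tensor slices is a 2d-by-2d matrix that produces one bilinear interaction between the concatenated children, and the results are added to the standard W term. Parameters are random stand-ins for trained values; the tensor is stored here with the slice index first, so its shape is (d, 2d, 2d).

```python
import numpy as np

d = 3
rng = np.random.default_rng(3)
b, c = rng.standard_normal(d), rng.standard_normal(d)
W = rng.standard_normal((d, 2 * d))
V = rng.standard_normal((d, 2 * d, 2 * d))  # d slices, each 2d-by-2d

h = np.concatenate([b, c])  # 2d-dimensional stacked children
# Each slice V[i] yields one bilinear term h^T V[i] h; stacking the d
# scalars gives the tensor contribution, added to the usual W h term.
tensor_term = np.array([h @ V[i] @ h for i in range(d)])
p = np.tanh(tensor_term + W @ h)
print(p.shape)  # (3,)
```

Unlike the MV-RNN, the parameter count here does not grow with the vocabulary: the same fixed-size tensor mediates the direct word-word interactions for every pair of children.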
It turns out that RNTNs outperform the known
alternative methods.
They have achieved over eighty-seven percent
accuracy in positive/negative word classification,
and over eighty-five percent accuracy in positive/negative
sentence classification on the Stanford
Sentiment Treebank.
This sentence classification accuracy is more
than three percent higher than that of standard
recursive networks.
Recursive Neural Tensor Networks can also
be used in other applications, such as parsing
natural scenes and parsing natural language.
This is due to the recursive nature of these
problems.
If you're interested in learning more about
RNTNs, we recommend you follow the link here
to a great article by Socher and others.
By now, you should understand the intuition
behind recursive neural models, and recursive
neural tensor networks.
Thank you for watching this video.
