Another example use of recurrent neural networks is in text generation.
So we saw earlier the example where we teach
a neural network to speak like Shakespeare.
And something interesting about that network is that it managed to do that by taking input at the character level instead of the word level.
So you can, for example, feed individual letters
into the network, and then the recurrent layer
keeps a memory of the letters that came before
the current letter, and then it outputs a
sequence of what it believes should come next.
Obviously this technique can be used in other
contexts.
It can be used to create chat bots.
It can also be used to create, for example, autocorrect algorithms, et cetera.
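To make the character-level idea concrete, here's a minimal sketch, in plain Python, of how text might be turned into one-hot vectors before being fed into a recurrent network. The text and encoding here are just an illustration, not the actual Shakespeare model.

```python
# Sketch of character-level input encoding for a text-generation RNN.
# Illustrative only: the text and vocabulary are made up.

text = "to be or not to be"
chars = sorted(set(text))                        # the character vocabulary
char_to_idx = {c: i for i, c in enumerate(chars)}

def one_hot(ch):
    """Encode a single character as a one-hot vector."""
    vec = [0] * len(chars)
    vec[char_to_idx[ch]] = 1
    return vec

# The network receives one vector per time step and is trained
# to predict the next character in the sequence.
inputs = [one_hot(c) for c in text[:-1]]
targets = list(text[1:])
```

A recurrent layer would consume `inputs` one step at a time, carrying its hidden state forward as the "memory" of the letters seen so far.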
The next type of neural network we're going to talk about is convolutional neural networks.
And convolution is one of the most important
advancements in the history of machine learning.
It's the main reason that deep neural networks manage to achieve such great performance in computer vision tasks.
It's practically impossible to create a neural network for computer vision without using convolution.
It's a simple but super effective idea.
And what's the idea behind it?
So, when you look at an image, not every single pixel is important. You have an image with thousands and thousands of pixels. But if you just took, let's say, a sample of the pixels in an area, you could probably get the same kind of information. So this effectively allows you to still use information from the image, but at the same time, let's say, compress it, and ignore noise simultaneously.
So what convolution in neural networks does is apply a filter of some sort, basically, that works as a sliding window, starting, let's say, in the upper left corner and moving across and down the image. And this filter, which can be, let's say, a square, takes chunks of pixels, say nine pixels at a time. And then it either averages those pixels or it chooses only one of them. And then this process is repeated again and again.
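As a rough sketch of that sliding window, here's a toy version in plain Python. It just averages each three-by-three chunk of nine pixels; a real convolutional layer would multiply each chunk by learned filter weights instead, and the image below is made up.

```python
# Toy sliding-window filter over a grayscale image (a list of lists).
# A real convolution applies learned weights to each chunk; here we
# simply average each 3x3 chunk of nine pixels, as described above.

def slide_filter(image, size=3):
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - size + 1):
        out_row = []
        for c in range(cols - size + 1):
            chunk = [image[r + i][c + j]
                     for i in range(size) for j in range(size)]
            out_row.append(sum(chunk) / len(chunk))  # average the 9 pixels
        out.append(out_row)
    return out

image = [[0, 0, 0, 9],
         [0, 0, 0, 9],
         [0, 0, 0, 9],
         [9, 9, 9, 9]]
result = slide_filter(image)  # a smaller, smoothed 2x2 "feature map"
```

Notice how the 4x4 input shrinks to a 2x2 output, and how each output value summarizes a whole neighborhood of pixels rather than a single one.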
And this helps the neural network ignore noise
and utilise feature locality.
So, in intuitive terms, it helps the network become less confused.
Because, for example, if you look at the grass in this picture with the boat, you know that one pixel is most likely going to be similar to the pixels around it, right? So what's important is that, you know, there's green at this point.
The neural network does not really need to
look at each one of the nine or 20 or 30 pixels
in a square to understand that there is grass
there.
It can just look at one, figure out that yeah
it seems that there's grass in this location
in this image.
And this actually helps neural networks improve performance: they train much faster.
And it also helps them extract these higher
level features which we saw earlier.
As I mentioned in the previous video, the reason that neural networks work so well is because they can come up with these visual hierarchies automatically, on their own.
And convolution, in computer vision, is one
way to do that much more efficiently.
So image classification as it stands, as well as video, basically anything relating to images and video, is probably the most complicated task in machine learning right now.
And now we have all these architectures that
are using convolution and pooling in order
to solve image classification problems.
So pooling basically refers to the technique where, once you've gone through the convolution, you extract information from this window, from the square, let's say, that we saw in the previous image.
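As a minimal sketch of pooling, assuming a small made-up feature map, here's max pooling over non-overlapping two-by-two windows in plain Python:

```python
# Minimal sketch of max pooling: after the convolution, keep only the
# maximum value in each non-overlapping 2x2 window ("choose only one").

def max_pool(feature_map, size=2):
    out = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        out.append(row)
    return out

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 1, 5, 6],
        [2, 2, 7, 8]]
pooled = max_pool(fmap)  # -> [[4, 2], [2, 8]]
```

Average pooling works the same way but takes the mean of each window instead of the maximum.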
And these are some architectures that have
been used in order to classify images.
You have VGG, the Inception model, AlexNet.
And as you can see, the common thing between
them is that first they're very complicated.
They're very deep, in the sense that they have many layers.
And now we have some architectures which have
even more than 100 layers, and they're all
using convolution.
Another important application of deep learning
is in reinforcement learning.
So DeepMind, an AI company based in London which has now been acquired by Google, so it's Google DeepMind, came up with this idea of deep Q-learning.
So Q-learning is an algorithm used for reinforcement learning. And their idea was to combine this with deep neural networks.
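As a sketch of what plain Q-learning does, here's the tabular update rule in Python; deep Q-learning replaces this table with a neural network that estimates the values from raw pixels. The states, actions, and numbers below are made up for illustration.

```python
# Tabular Q-learning update, the algorithm DeepMind combined with deep
# networks. Deep Q-learning swaps this table for a neural network that
# estimates Q(state, action) directly from raw pixels.

from collections import defaultdict

Q = defaultdict(float)     # Q[(state, action)] -> expected return
alpha, gamma = 0.5, 0.9    # learning rate and discount factor
actions = ["left", "right"]

def update(state, action, reward, next_state):
    """One Q-learning step: nudge Q toward reward + discounted best future value."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

# One illustrative transition: taking "right" in state 0 earns reward 1.
update(0, "right", 1.0, 1)
```

Repeating this update over many game frames is what slowly teaches the agent which actions lead to high scores.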
So why is this important?
It's important because they managed to make
algorithms that would play video games, but
without really getting any other input other
than the raw input from the image.
So playing video games is not a big deal,
you know, creating an AI to play video games.
However, creating an AI that can take the raw input from the video and understand it, extract the higher-level features, understand that, hey, this is the character, you know, and I'm actually controlling this character, I can jump up or down, et cetera, is an incredibly difficult task.
And they actually managed to do this.
And then some evolutions of this work are networks that can play games against each other. And then they get better and better.
I'm not sure if you're following the story behind AlphaGo, but this is how Google DeepMind actually managed to create a network that beat some of the best professional players at Go.
Go is basically a tabletop game which is way more complicated than chess, and it was believed that it would be impossible for AI to crack it.
Another very important technique in neural
networks is generative adversarial networks.
So, this is a technique where you have two networks, again, pitted against each other. You have one network that tries to create fake data, and another network that tries to work out whether the data is real or fake.
And this is a very powerful technique that's
being used in order to generate images.
And now let's put all of this together and see some actual use cases.
Let's see where deep learning is being used
in practice.
So this is a very cool application of deep
learning, where many of the techniques that
we saw earlier are being used in conjunction.
So, image captioning, as well as video captioning.
So the input here is an image.
Then, this image is fed into a convolutional
neural network, which reads the image.
And then, in order to produce the caption, we have a recurrent neural network with an attention mechanism that uses long short-term memory, in order to create the sequence of words which forms the caption.
Pretty cool application, and you can see how
all of these different elements of neural
networks can fit together in this application.
Another very cool application is style transfer.
So in style transfer, you can combine an image,
which you call the content image, and then
a style image, and you can transfer this style
from the style image to the content image.
So for example you can recreate images in
the style of famous painters, et cetera.
So this is a very cool application, and in
this example you take a trained VGG network,
a network which was created for image classification,
and then you use this pre-trained network
in order to read both the style image and
the content image and then this network transfers
some of the features from the style image
to the content image.
Another very cool and advanced application
of deep learning.
And in case you hadn't realized it, I mean, now you probably realize it looking at this image: deep neural networks are complicated, but they're also very modular. So all the new applications that come up are using all these neurons and all these layers, as we say, effectively as modules that are combined with each other.
So this is another interesting technique, image colorization. And you can see, for example, how this is achieved by having two networks running in parallel, each doing some of the work, and then they're fused towards the end.
There's also image generation, which is done through generative adversarial networks.
So you saw earlier how GANs are basically a technique that uses two different networks, one to generate fake data and another to discriminate between real and fake data.
So this is like a game effectively, where
the generator network is going to win if it
tricks the discriminator network.
So the perfect outcome for the generator network is to produce fake data and then trick the discriminator network into believing that all of this data is real.
The discriminator is going to win the game
if it correctly guesses which data is fake.
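That game can be sketched in terms of the two losses. The probabilities below are made-up numbers, not real discriminator outputs; this is just to show what "winning" looks like for each network.

```python
# Toy sketch of the GAN "game" in terms of losses. d_real / d_fake are
# the discriminator's probability that an input is real; the numbers
# here are invented for illustration.

import math

def d_loss(d_real, d_fake):
    """Discriminator wins by pushing d_real toward 1 and d_fake toward 0."""
    return -(math.log(d_real) + math.log(1 - d_fake))

def g_loss(d_fake):
    """Generator wins by fooling the discriminator, pushing d_fake toward 1."""
    return -math.log(d_fake)

# Early in training: the discriminator easily spots the fakes.
early = g_loss(0.1)  # large generator loss
# Later: the generator's fakes look real to the discriminator.
late = g_loss(0.9)   # small generator loss
```

Training alternates between the two: the discriminator is updated to lower `d_loss`, then the generator is updated to lower `g_loss`, and the competition drives both to improve.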
And you can see here, actually this is from a paper, where on the left-hand side you have real images: you can see a van, you can see a dog, you can see some animals, some other vehicles, et cetera.
And on the right hand side, you have generated
images.
They're not perfect, and at first glance some of them might look a bit weird, but at the same time they look kind of real.
So you can see how powerful GANs can be.
And GANs are very important, not only because
we can generate fake data.
I mean okay, that's a pretty cool application,
but are you actually going to use it?
But they can also be used to translate any kind of image into any other kind of image.
So this technique allows us to take a photo
and translate it from day to night.
It allows us to colorize a photo. It allows us, for example, to change the kind of map from aerial to a standard map.
It's one of the most incredible advancements
that have happened in deep learning.
And obviously this is just some example applications.
I'm sure that we're going to see more and
more of them to come in the next few years.
So, as this presentation comes to a close, let's talk about tools for deep learning.
There are many different libraries out there
for deep learning.
There are open-source tools as well as commoditized tools.
So if you're a coder yourself and you want to start working in deep learning, obviously it's good if you have the required math background.
But because many of the things have been standardized,
there are many examples out there, you can
start playing around with some models without
fully understanding all the intricacies behind
them.
There are many more libraries out there, but I'd say the most popular ones are TensorFlow, PyTorch, Keras, and Caffe.
So TensorFlow was created by Google and is supported by Google. And it's designed to be very modular, so you can combine different types of neurons and layers with each other and create new networks.
PyTorch is open source.
It's supported by many big companies, Facebook
being the most prominent of them, and NVIDIA
and Twitter.
And it's very useful for research.
It's a framework that's very popular for research
or where you want to create something really
custom.
However if it's the first time you're dealing
with neural networks or machine learning,
I'm not sure if it's a good idea to go with
PyTorch.
What I would go for would be Keras.
So Keras is a high-level interface for Tensorflow
in Python, and it's very easy to use.
So if you followed any machine learning tutorials
in Python on things such as scikit-learn,
Keras is basically very, very similar.
So, if you just know how to code Python, you
can probably start coding a deep neural network
in Keras within, you know, 30 to 60 minutes.
So it's a pretty cool library, check it out.
And finally you have Caffe, which was developed by Berkeley's AI Research lab. And it's a very good library for computer vision.
It's actually a framework.
If computer vision is your thing, then it's
good to try it out.
However, my advice is that if you're not a coder yourself, if, let's say, you're a CEO, a manager, a product manager, or a startup founder, the first thing I would do if I were in your shoes and wanted to use deep learning would be to try one of the commoditized services.
So all the tech giants, Google, IBM, and Microsoft with Azure, have their own commoditized services for AI and machine learning.
And this includes deep learning functionalities
as well in things such as computer vision,
natural language processing, et cetera.
So why would I suggest this?
Well the reason is that deep learning requires
lots of data to get trained.
And it also requires loads of hyperparameter tuning. And this can take a long time.
And it can also take many resources to properly
train a deep neural network.
And let's face it, in a domain such as computer vision, getting the right amount of images is probably going to be very difficult for a new company.
So it's best to try and build a proof of concept
based on a commoditized service.
So, wrapping up: when do you use deep learning?
Let's say that, as I mentioned, you're a startup founder or you work in business development, and you're curious as to whether deep learning can actually help you and your business.
So, deep learning, as you must have realized,
has many cool applications, but it's not really
the solution for all problems.
So if you have issues relating to images,
audio, video, then it's the go-to solution
for sure.
Natural language processing, it's also very
good for natural language processing.
But sometimes you might be able to get away with a simpler model, such as standard machine learning models, like a combination of bag-of-words and naïve Bayes.
The reason is that, as I mentioned earlier, deep learning requires lots of data and lots of processing power. This can take a long time to train.
Plus, you have to hire someone, and people
who have real experience in deep learning,
they're not that cheap.
So it's a good idea to first and foremost see whether you can actually use some more standard techniques to solve your problem.
And if that's not the case, and you move on
to deep learning, maybe it's worth taking
a look at some of the commoditized services
that can help you build a proof of concept
pretty fast.
The problem with commoditized services is that, whereas they're easy to start using, you just connect to an API, you upload your image, you get back a classification, et cetera, they have a few different issues.
The first one is that they might still not cover every case, so they might not be the right solution for your case.
Secondly, performance: even if they can cover your case, let's say face detection, performance might not be as good as you want, because many of these solutions are created as more general solutions, let's say.
Does that all make sense?
So they're not necessarily specialized to
your particular domain, which could be, I
don't know, detection of particular objects
or translation of some languages which are
not popular, et cetera.
And finally, whereas it can be cheap initially
to use a commoditized service, in the long
run it's going to cost you more money.
Whereas developing your own solution is going to cost you loads of money in the beginning, but once you've done that, the solution is there, and you don't have to spend any more money.
So I would suggest that a good strategy to
try and follow in this case is to maybe start
with a commoditized service for the proof
of concept, see how well this works.
Then think whether it would make sense to
maybe invest a significant amount of money,
probably in the tens of thousands of dollars,
euros, or pounds, in order to create your
own internal solution.
So, that was it from me.
I hope you enjoyed this video, and make sure
to follow the channel of Tesseract Academy
on YouTube.
Thank you.
