Hello everyone, welcome back to this TensorFlow tutorial series
Before today's tutorial, I have to admit that I am not an expert in CNNs, so if you find any mistakes, please let me know
Because I'm not very familiar with CNNs, you can find some funny deleted scenes at the end of this video
We are going to talk about a recent achievement in neural networks, the convolutional neural network (CNN)
CNNs have brought big improvements to computer vision
CNNs are a little hard to learn
If you find this too difficult, you can skip this one and go to the Saver tutorial to learn how to save and reload your network
I don't know how many tutorials I will need to cover CNNs, because it is a really difficult topic
Google has its own tutorial about CNNs; if you are interested in that, you can also check it out
You can find the link in my description
If you just want to use a CNN, you'd better watch this video to get some basic understanding
I'll only cover the basics; if you want to know more, just go to the link in my description
Let's look at the basic structure of a CNN (screenshot of Google's tutorial)
This is a picture
If it's a color picture, every pixel has RGB values
RGB (red, green, blue)
RGB is used to represent lots of different colors
So the RGB channels become the thickness of this picture
Besides the 256x256 height and width, it also has a thickness (3, for RGB)
It has RGB values for every pixel
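To make the "height, width, thickness" idea concrete, here is a minimal sketch in plain Python. The 2x2 picture and the `shape` helper are made up for illustration; the picture in the video is 256x256.

```python
# A tiny 2x2 "picture" (hypothetical; the one in the video is 256x256).
# Each pixel holds three values: [R, G, B].
image = [
    [[255, 0, 0], [0, 255, 0]],      # row 0: a red pixel, a green pixel
    [[0, 0, 255], [255, 255, 255]],  # row 1: a blue pixel, a white pixel
]

def shape(img):
    """Return (height, width, thickness) of a nested-list image."""
    return (len(img), len(img[0]), len(img[0][0]))

print(shape(image))  # (2, 2, 3)
```

The last number, 3, is the thickness that comes from RGB.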
The CNN wants to compress the length and width of the picture, but increase its thickness
Decrease the length and width, and increase the height, or thickness
decrease here and increase here
At the end of this procedure, it can become a classifier
Like in the previous tutorial, the output is an array like [0,0,1,0,0,0...]
To sum up, the CNN gradually compresses the height and width of the picture
And increase the thickness
At the end of this, it turns into a classifier
This is also from Google's tutorial
Look at this, it has the three RGB colors
You can think about it this way
Here is a patch, or kernel: I take out a small area of this big picture
This patch has its own length and width
We take it out to analyze it
So the output becomes a patch that is taller but covers a smaller area
The stride parameter determines how many steps, or pixels, we move before taking the next patch
For example, if stride=1, I take a patch every 1 pixel
If stride=2, I take a patch every 2 pixels
We will use these two parameters, so you have to know them
This is for stride=1
This is stride=2
So this is how the height and width get compressed
This is what a CNN does
So it becomes a smaller but taller cube
It still contains the information from the picture
We have two padding methods
One is called "valid" padding; with this padding the output is smaller than the picture
With "same" padding, the output has the same width and height as the picture (when the stride is 1)
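Here is a small sketch of how the stride changes the number of patches. It is plain Python, not TensorFlow, and the numbers (a row 7 pixels wide, a patch 3 pixels wide) as well as the `patch_positions` helper are assumptions for illustration only.

```python
def patch_positions(size, patch, stride):
    """Left edges of every patch that fits fully inside a row of `size` pixels."""
    return list(range(0, size - patch + 1, stride))

# Hypothetical numbers: a row 7 pixels wide, a patch 3 pixels wide.
print(patch_positions(7, 3, 1))  # [0, 1, 2, 3, 4]  -> 5 patches
print(patch_positions(7, 3, 2))  # [0, 2, 4]        -> 3 patches
```

With the longer stride there are fewer patches, so the output gets smaller along that side.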
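As a rough sketch, the output size for the two padding methods can be computed with the size rule TensorFlow describes for 'VALID' and 'SAME'. The `output_size` helper and the patch size of 5 are assumptions for illustration; only the 256 comes from the picture above.

```python
import math

def output_size(size, patch, stride, padding):
    """Output length along one side (height or width) of the picture."""
    if padding == "VALID":
        # No padding: only patches that fit fully inside the picture are used.
        return math.ceil((size - patch + 1) / stride)
    if padding == "SAME":
        # Zeros are padded around the border so the patch can hang over the edge.
        return math.ceil(size / stride)
    raise ValueError("padding must be 'VALID' or 'SAME'")

# Hypothetical patch size of 5 on a 256-pixel side.
print(output_size(256, 5, 1, "VALID"))  # 252
print(output_size(256, 5, 1, "SAME"))   # 256
print(output_size(256, 5, 2, "SAME"))   # 128
```

Note that "same" padding only keeps the size unchanged when the stride is 1; with stride=2 the side is halved.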
Besides padding, there is another step called pooling
Here is what pooling is for, with an example
If I use stride=2, some of the information may be lost
So the information in the original picture may be destroyed because the stride is too long
To handle this issue, we add one more step called pooling
We keep a stride of 1 to keep more information from the picture
Then we use pooling to decrease the size
The pooling methods we have now in TensorFlow are max pooling and average pooling
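To show what max pooling and average pooling do, here is a minimal plain-Python sketch. It is not the TensorFlow implementation; the 2x2 window, the `pool2x2` and `mean` helpers, and the 4x4 grid of numbers are all assumptions for illustration.

```python
def pool2x2(grid, reduce_fn):
    """Apply reduce_fn to each non-overlapping 2x2 window of a 2-D list."""
    out = []
    for r in range(0, len(grid) - 1, 2):
        row = []
        for c in range(0, len(grid[0]) - 1, 2):
            window = [grid[r][c], grid[r][c + 1],
                      grid[r + 1][c], grid[r + 1][c + 1]]
            row.append(reduce_fn(window))
        out.append(row)
    return out

def mean(values):
    return sum(values) / len(values)

# Hypothetical 4x4 single-channel picture.
grid = [
    [1, 3, 2, 0],
    [5, 7, 4, 6],
    [9, 1, 8, 2],
    [3, 5, 0, 6],
]
print(pool2x2(grid, max))   # [[7, 6], [9, 8]]
print(pool2x2(grid, mean))  # [[4.0, 3.0], [4.5, 4.0]]
```

Both methods turn the 4x4 grid into a 2x2 grid: max pooling keeps the largest value in each window, and average pooling keeps the mean.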
Let's sum up the structure of CNN
First, take one image
Then pass it into a convolutional layer
Then apply max pooling to it
Convolution + pooling keeps more of the valuable information from the original image
Then add another convolution and max pooling
Then comes a fully connected layer, which is like the normal layers we added before
and another hidden layer
and lastly it becomes a classifier
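The structure above can be followed as a chain of shapes. This is only a sketch under assumed numbers: the 28x28x1 input, the 32 and 64 filters, the stride-1 "same" convolutions and the 2x2 pooling with stride 2 are hypothetical choices, and the two helper functions are made up for illustration.

```python
import math

def conv_same(shape, filters):
    """Stride-1 convolution with 'same' padding: height and width are kept,
    and the thickness becomes the number of filters."""
    h, w, _ = shape
    return (h, w, filters)

def pool(shape, stride=2):
    """Pooling with 'same' padding: height and width shrink, thickness is kept."""
    h, w, d = shape
    return (math.ceil(h / stride), math.ceil(w / stride), d)

shape = (28, 28, 1)            # hypothetical input image
shape = conv_same(shape, 32)   # (28, 28, 32): thicker
shape = pool(shape)            # (14, 14, 32): smaller
shape = conv_same(shape, 64)   # (14, 14, 64): thicker again
shape = pool(shape)            # (7, 7, 64): smaller again

# Flatten the cube so it can go into the fully connected layers.
flat = shape[0] * shape[1] * shape[2]
print(shape, flat)  # (7, 7, 64) 3136
```

After flattening, the 3136 values would go through the fully connected layer, the hidden layer, and finally the classifier.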
I'll show you the details in this demo
This image may have a thickness of 3
Then we start to compress it and increase the thickness
Like this, it increased the thickness
And the width and height keep decreasing
Then connect these to a fully connected layer
Then use the classifier
This is the whole structure of a CNN
I hope you now have a better understanding of CNNs
I recommend you try the normal network structure first. If that is OK for your project, then you don't need to use a CNN
Because CNNs are not that easy
In the next tutorial we are going to code it
In fact, I spent lots of time recording this...
Now here are some deleted scenes
