What's going on everybody! And welcome to part 9 of our self driving car with Python and it just so happens to be Grand Theft Auto 5 tutorial series.
Where I left off I was gonna explain the next steps we are going to take since some people might have skipped those videos 'cause they don't like to listen to me talk.
Just make sure you clone the directory. Basically, most
likely, if you're watching this right
away, you might be able to clone the
main directory... well, you could always clone the main directory,
but probably what you
need is actually going to be in Tutorial Codes by then.
Just make sure you
have main.py, grabscreen,
draw_lanes, directkeys and all that, and
main.py should look like this: it
should have the "roi" and process_img (process image)
and all that stuff. Once you have
that, you're ready to rumble. I've got it
already put into here. So basically
I'm going to first start by pulling up
main.py. Basically like what
we're going to be trying to do here is
training a neural network. Now in the
neural network we're going to be working
with image data so immediately that
should probably signal to you we're going
to be using a convolutional neural
network. If you're not familiar with deep
learning with Python you can come to the deep
learning tutorial series, which is part of
this huge machine learning series, but
if you are already familiar
with machine learning you can also kind of just jump
into the deep learning part, which is
somewhere ...
...there...
Okay, and this is where we first start with TensorFlow, and
then we eventually get into TFLearn,
which is kind of towards the end, which
is here. So anyway, if you need to learn
more about all of these things there's
plenty of information on my website.
Sometimes people get angry when I'm like
copy and pasting code, but I don't really
see the point of reteaching
something when I've already taught
exactly that thing, so I just don't like
to waste time. So anyway... yeah, so what
you start with is just
this kind of main.py here, and we
can actually probably clean this one up
quite a bit but the first thing that we
need to do, or like the first step if
we're going to have a self-driving car
that's going to learn from a neural
network, is we need training data. So the
input data to the neural network, whether
we're training or we're actually using
the neural network, is going
to be the frames, right, it's going to be
the pixel data. With a
convolutional neural network we could
have other input data too, but
that's what we are going to do: we want
to throw in the pixel data, and then the
output should be the actions. It should
be what we hope the vehicle will do. So to
train it initially there's a few things
we could do. We could use the previous
code that sort of did okay, but that
would probably not work the best. I mean, we
could start with that I suppose, but
instead I think probably the best thing
to do would be to manually drive the
scooter a bunch and teach the neural
network this is how it's done. So that's
kind of the way I'm going
to do it. Now the problem is we need some
way to record the keys that we press, and
I didn't know of any way to do that in
Python without actually having the
Python window that was tracking the keys
in focus. So I took to Twitter, asked
around, and "Box of hats" answered me
and provided some code. So the first
thing we're going to do is we're going
to make a new script, and this one will
be in the github, so actually
when you clone the github you probably
already have this
one, but I'm going to make it because I
don't have it up yet. But actually when
you do clone it, unless I make a mistake,
this will be there. So anyways, that's
this script. Let me make it bigger-ish.
Let's do 16. This is not the code yet;
I'm just going to copy and paste in
the code. Okay. So this is get_keys.py.
We're going to be using pyWin32 again,
and in the last video I didn't specify
what to do to get that, so it may not totally
be clear what you should download. So
what you want is pyWin32, and where you
get it is ... am I blind ... here we go, the
unofficial Windows binaries for Python.
This is probably the best place to go to
get pyWin32. Whoops, I clicked on the
wrong freaking thing. There we go,
we're still here. All right. Oh my
goodness, back, okay, oh there we go, sorry.
Now download the wheel that applies to you,
so Python 3.6 amd64, that's 64-bit Python,
so make sure you get the one that
applies to you. Anyway, this is actually a
pretty simple script. I'm embarrassed
that I couldn't figure it out, but again
this is another thing I tried to figure
out and I just couldn't do it. But anyway,
yeah, so this just goes through all the
characters that one could plausibly
press, appends them to the key list,
and then the key_check function just
simply checks the key state and sees what
keys are being pressed. I have no idea
why that's there, I must have just over-copied. Anyway, so
now what we can do is we can import, you know,
from get_keys import key_check, and now
we can get the key presses. Cool! I'm
going to close out of that. There's get_keys,
and so yeah, let's go ahead and
do that: from get_keys import key_press.
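If you don't have the script in front of you, here is a minimal sketch of what a key_check-style helper can look like. This is not the original get_keys.py: the KEY_LIST contents and the is_pressed parameter are assumptions made so the function can be tried without a keyboard hook. On Windows with pyWin32 installed, win32api.GetAsyncKeyState can be passed in as is_pressed.

```python
# Sketch of a key-polling helper in the spirit of get_keys.py (assumed layout).
# KEY_LIST and the is_pressed parameter are illustrative, not the original code.
import string

# Characters we care about: backspace, A-Z, space and digits.
KEY_LIST = ["\b"] + list(string.ascii_uppercase + " " + string.digits)


def key_check(is_pressed):
    """Return the characters in KEY_LIST whose key is currently down.

    is_pressed takes a virtual-key code (here ord(char)) and returns a
    truthy value when that key is held.
    """
    return [key for key in KEY_LIST if is_pressed(ord(key))]


# On Windows with pyWin32 installed, usage would look like:
#   import win32api
#   keys = key_check(win32api.GetAsyncKeyState)

# Stand-in key state for trying the function without a keyboard hook.
fake_state = {ord("A"), ord("W")}
print(key_check(lambda code: code in fake_state))
```

Passing the state function in as an argument is just a choice for this sketch, so the same loop can be exercised on any machine.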
All right! Now.
What we're going to do is basically, we need
to have a function that will convert the
pressed keys to a one-hot array. Again,
this is a pretty fundamental thing to
neural networks, so if you're not
familiar you can check out the "neurons" part
of the tutorial series there, and in fact
let me just
make this big. Configure IDLE and ... 18 ... cool.
So now what we're going to do, or what I'm
going to do, is define keys_to_output, and
then we're just going to say "keys".
We're going to pass in the keys that are
currently pressed, and then basically
what we want to do is convert this to a one-hot
array that will cover
the following keys. So it'll be either
an "A"... I have CAPS LOCK? A, W & D. So we're
not really going to worry about slowing
down for now. It's an unnecessary
move. So basically AWD, so this is take
a left, just go straight, take a right.
Okay, and then later on we can kind of... so
first of all, we are obviously ignoring combinations like "A"
and "W", go straight and turn, because like
I said we're just going to assume we're
always going to be going straight. Well,
of course in reality you'd also have an
S, you'd have "SA", you'd have probably not
SW, but you could if you wanted to do a
burnout, you could have a "WD", you could
have "DS" and so on. So there's a lot of
other combinations, but for now we're
going to keep it super simple, because
again right now I'm just trying to see
mostly if it's just even going to work,
you know, so right now we're just kind
of testing things. So we're going to
start with output, it's not a constant, it's
going to change: [0,0,0]. So right now we have
nothing, and if we have a 1 here that
means, it's a boolean value, right, it's "A",
we're going to take a left.
If we had a 1 here it'd be forward, and
so on. So now what we're going to do is
basically just a bunch of if statements.
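For reference, the function being built over the next minute can be sketched like this. It follows the [A, W, D] order described above; the comments and the example calls at the bottom are additions for illustration.

```python
def keys_to_output(keys):
    """Convert the pressed keys to a one-hot list in [A, W, D] order."""
    output = [0, 0, 0]
    if "A" in keys:
        output[0] = 1  # take a left
    elif "D" in keys:
        output[2] = 1  # take a right
    else:
        output[1] = 1  # default: go straight
    return output


print(keys_to_output(["A", "W"]))  # [1, 0, 0]
print(keys_to_output(["D"]))       # [0, 0, 1]
print(keys_to_output([]))          # [0, 1, 0]
```

Note that with this layout "no key pressed" and "W pressed" both come out as forward, which matches the assumption that we are always going straight unless turning.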
So if capital "A" is in keys then we're
going to say output[0] = 1, so
this will now be a 1, and then we're just
going to copy this and then just do an "else if",
so elif: elif "D" is in keys, that
would be 0 1 2, so output[2] = 1. Else
if... else, actually we don't need an if
statement at all, we're just going to say
else output[1] = 1. Now this is
crucial: make sure you've got this set up
correctly, make sure you've got 0, 1 and 2
accounted for in this exact order. If you
mess this up you're going to really be
kicking yourself down the line. Finally
what we're going to say is return
output. And I'm just going to confirm,
because I don't want to mess this up:
zero is "A", which is AWD, so 0 1 2, 2 is
take a right, 1 forward, good, I think that looks
good. So we return the output and we're
good to go. Now what we're going to do
here is, we don't need roi, we don't
need process_img, we don't even need
these turns. We can use the countdown. So
I just deleted everything; basically we
just have keys_to_output. We've got this
sleep, which is kind of poorly
placed, let me just move it over into
here, like so. Hopefully I'm not
screwing up something. All right ... now
before we enter the main function let's
go ahead and say file_name equals
training_data.npy and
then we're going to say ... and I
need os, "import os". Also let me see here,
we don't need directkeys
or draw_lanes. Part of me wants to get
rid of pyautogui; someone pointed
out that just my importing of
pyautogui might be slowing things down, and
in fact if that's in the github, bummed, I
was intentionally trying to leave out
the import pyautogui because I
think I might be the only person who
actually needs it, but no, it's there. Darn
it! Since I'm thinking about it I'm just
going to go ahead and edit that real
quick. Not necessary, okay, so now, as I
was saying I'm gonna leave you there
just so it stays. Actually, no! I don't need that
we're not even we're not going to put up
a new screen we're actually totally done
putting up new opencv screens probably for a
long time so yeah we definitely don't
need it so that will help frame rate a
lot. And in fact let me get rid of this and
we're not going to be needing this and
we don't need any image_show we don't
need to process image we just need to
grab the screen and then like we can
look at the times, so just to kind of
show you how quickly we can process now
let me run this, also to give me a kind
of confirmation that we don't have typos.
"cannot import name key_press", really? Do "you
want to fight!" Let's see, grab_screen ...
grab_screen, oh, have I called it literally
grab_screen? Wait, what, key_press? Oh no,
we're okay, hooh, dude, anyway, wrong thing:
from getkeys import key_press,
getkeys .... key_check.
key_check, and we haven't used that actually
yet anyways. Now that was a big mistake, I don't know, I'm
tripping out right now. Anyway, okay, so
the countdown has begun, yeah. Wow, check
this out, that's how fast we're
processing frames now, like we've got a
really good frame rate. So anyway,
a lot of that is because we're not showing
the opencv frames, but a lot of it's
actually also just because of using
grab_screen instead, so that made a huge huge
huge difference, with pyWin32 that is.
Anyway, so we are importing os. Now back
to what we were doing. Anyway, having some
serious brain farts. This probably
doesn't end well. Anyway, if
os.path.isfile, let's check for the file_name, so
let's just see if it exists because:
Here's the deal guys, we're going to need
to make... uh, almost knocked over my water...
we're going to need to make a lot of
training data, and chances are you are going to
get really bored of doing this. It's not
fun, it's not fun to drive a scooter in
between lanes. It's just not enjoyable,
but we really need to do this
first before we do anything serious, I
think, because we want to check, we want
to see, can this actually be done? So
anyway, what's nice in the way that I've
kind of coded things, or that I'm
going to code things, is we're going to
see if this file exists. If it does, we'll
load that data and we'll just keep
appending to it, and then every, let's say,
so many iterations we'll save that data.
That way we can kind of come
and go as we please. So yeah, so
if that file exists, great, let's just say
print "File exists, loading previous data",
else print "File does not exist,
starting fresh", and then also I guess we
should probably define this, so let's say
training_data equals the list version of
np.load(file_name). I want to do this so
I can append to it. I'm pretty bad at
appending to NumPy arrays, right, I just don't
know the proper protocol. I don't think...
you can't just do
.append, I don't think. Anyway, training_data
equals that. If someone wants to
improve that, because it'll be faster
to just use straight-up numpy, pull
request, ok, or just let me know and I'll
throw it in there. Like half the people
that have improved the code haven't done
it officially, they just sent me a
gist or something anyway. So. Ok. So now
we've got the training data, now
we've got a countdown, that's fine. Now
basically what we want to do is every
time we grab the screen, first of all we
need to... we grab the screen, we've got the
region, we want to convert the screen to
grayscale. We could keep it in color, but
if we can avoid the color thing
it would be nice, because think of it
this way: can you play Grand Theft Auto 5
in grayscale? Yeah. So if you could
do it then the neural network should figure
it out, right, that's kind of the whole
point. And if we can trim down that
data, because like grayscale is
one-third the size of full RGB data, so
we want to scale it down if we can. So we
are going to do that, so we're gonna say
now screen equals cv2.cvtColor,
we're going to convert the color of our
screen, and we're going to do
cv2. ("all caps") COLOR_BGR2GRAY.
Now we also finally want to
resize this. To start, let's keep it nice
and small, we're just going to say screen
equals cv2.resize, let's do
the screen, and we'll do 80 x 60.
Eventually I wouldn't mind getting that
a little bigger, but I think for now,
just seeing the lanes, 80 x 60 should be
more than enough. Feel free to visualize
that yourself, the 80 x 60, like when we
save the data we can pull it up and look
at it. I'll probably forget to show you,
but you can do that and you should be
able to see that, okay, I could see how
you could play with that size. Like you
might want to blow up an 80 x 60, but
it'd be, you know, truly of that
resolution, but you can still stay within
lanes at 80 x 60. Also, for a neural
network, a convolutional neural network,
80 x 60 is actually kind of large. So
anyway, so yeah, so we've resized, and then
now let's go ahead and check for the keys.
So I'm going to say keys equals key_check(),
and then we're going to save the
output for the actual training output
that we want, we're going to say output equals keys_to_output
and that'll be the keys, and then
finally we're going to just do
training_data.append([screen,
output]) and that's it. So now what we want
to do is we will say... so this is just
basically appending constantly. Sorry if
you could hear the jets flying over my
house, by the way. I called the Navy and I
told them that it would be nice if they
would just like, you know, give me a
warning every time they fly over, but
they don't seem to care. Anyway,
what I'm going to do now is... I suppose
it's the Air Force, the Air Force and Navy are
buddies, but... anyway, moving along. If
len(training_data) modulo (%) 500 equals (==)
zero, so if you were to divide by five
hundred, would the remainder be zero, if
that's the case let's go ahead and print
len(training_data), and then let's go
ahead and np.save(file_name, training_data),
so save to the file name.
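Putting the load, append and save-every-500 pieces together, a minimal sketch could look like the following. A few things here are assumptions rather than the code from the video: the helper names load_training_data and save_training_data are made up, the frames are blank stand-ins, and a temporary directory is used so the example doesn't touch a real training_data.npy. Also, recent NumPy versions refuse to load pickled object arrays unless allow_pickle=True is passed, and mixed [screen, output] pairs need an explicit object array when saving, so the sketch does both.

```python
import os
import tempfile

import numpy as np


def load_training_data(file_name):
    """Load earlier samples as a list, or start fresh if the file is missing."""
    if os.path.isfile(file_name):
        print("File exists, loading previous data!")
        # Object arrays are stored pickled, so allow_pickle=True is required.
        return list(np.load(file_name, allow_pickle=True))
    print("File does not exist, starting fresh!")
    return []


def save_training_data(file_name, training_data):
    """Save [screen, output] pairs as an (N, 2) object array."""
    packed = np.empty((len(training_data), 2), dtype=object)
    for i, (screen, output) in enumerate(training_data):
        packed[i, 0] = screen
        packed[i, 1] = output
    np.save(file_name, packed)


# Temporary location so this sketch leaves no file behind in the project.
file_name = os.path.join(tempfile.mkdtemp(), "training_data.npy")
training_data = load_training_data(file_name)

for _ in range(1000):
    screen = np.zeros((60, 80), dtype=np.uint8)  # stand-in for a resized frame
    output = [0, 1, 0]                           # stand-in for keys_to_output(keys)
    training_data.append([screen, output])

    # Checkpoint every 500 samples so the session can be stopped and resumed.
    if len(training_data) % 500 == 0:
        print(len(training_data))
        save_training_data(file_name, training_data)

reloaded = load_training_data(file_name)
print(len(reloaded), reloaded[0][0].shape, list(reloaded[0][1]))
```

Because the file is only written on multiples of 500, anything captured after the last checkpoint is lost when the script is stopped, which is why stopping right after an update is the safe moment.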
Save what? Training data. I mean, all right,
that should do it for creating training
data. Let's go, let's try, let's see how it
does. So hopefully it'll get some files
up in there every 500 to let us know
that we've done something. Also let me
make sure we're in first person view.
We'll have the countdown, and then
basically if you want to stop at any
time... later on I'll add a fancy way we
could just like pause or something, got
to keep that YouTube ad revenue. So what
we'll do is, like, for now you can just
kind of tab out and just Control-C. So
right after you get a 500 frames update,
go ahead and Control-C, or tab out,
Control-C, and that will stop the script
and you won't have any screwed up frames.
But later I'll add some way to pause or
something probably, but anyway, for now,
again, we're just so far from doing any
official testing. But anyway, four, three,
two, one, go. Got fricking frames out the
wazoo now for this. So I'm just going to
try to stay kind of in the middle of the
screen as best as possible. This scooter
is fast, I think I might have a speed mod
or something on the scooter. Anyway, so
yeah, just keep doing this. We probably
should get rid of the whole "frame took" thing,
it's really hard to see when we get an
update. Next time I see one I'm going to
pause it. There you go, okay, tab
out, Control-C, please stop, okay, Control-C
didn't work, but that's fine, I'm
pretty sure I stopped it in time. So I'll
save that, and sure enough here's our
training data here, so we actually have
some stuff in there. So anyway,
what I suggest you do is do this till
you have, you know, ideally a hundred
thousand, I mean actually ideally a
million,
but probably about a hundred thousand
frames, because the other thing that
we're going to be talking about in the
next tutorial is balancing this data,
because it's going to be mostly go
straight, and if we send that through a
neural network, the neural network is
going to really quickly find out: go
straight, except in all these outlier
circumstances. And it's going to fit to
that, it's going to overfit to that, and
now when you go to actually use it your
neural network is going to be like
"always straight". That's what it's going to do
to you, so we have to balance the data. So
anyways, that's what we're going to do in
the next tutorial. If you're having any
problems or whatever up to this point...
I will also try in the next tutorial
to look at this data, just to show you
that yes, this data is working, here it is,
it's beautiful. But yeah, questions,
comments, concerns, issues, whatever, feel
free to leave them below. Otherwise I
will see you in the next tutorial.
