A s AI becomes more powerful it's likely
to solve some of humanity's most
difficult challenges because we're too
slow in processing we don't have the
massive scales of data or just areas
where we shouldn't venture, but what if
AI's goals are different let's talk AI
alignment. The reason that we should care
about artificial intelligence value
alignment is pretty simple the greatest
chance at a bright future of coexisting
with AI whose intelligence is growing at
an exponential rate is if our values are
aligned value alignment in artificial
intelligence is where the AI attempts to
do what we would want it to do and this
what we wanted to do may be slightly
different from how the AI is trained
stay tuned later for some wild examples
where this is not aligned ultimately
intelligent machines are trained to seek
and optimize a given value so designing
the reward or utility function or values
are obviously important if you combine
misalignment of values with a super
intelligent machine that's very capable
then you have a really serious problem
for the human race so the point is that
machines can and will make better
decisions than humans but only if their
values are aligned with those of the
human race I love some of these
alignment problems the first one that
you'll probably know is King Midas as
Midas found joy and wealth in the power
that he could touch and turn anything to
gold
he soon beheld his food grow rigid and
his drink hardened into gold ice then he
understood the value of alignment
problem King Midas's example may sound
far-fetched and trivial but we're
already seeing countless examples of
this and AI already some hilarious some
just catastrophicly bad in my previous
video of AI learning to play
hide-and-seek one discovered that it
could get on top of a block and surf it
over the barricades to seek the hiders
this ability is not envisioned by the
programmers but that didn't stop the AI
from optimising through that loophole
check out this genius robot this AI was
tasked to learn how to walk with a
minimum amount of contact between its
feet in the ground yep walking on your
back with your feet in the air is one
solution I guess this isn't cheating per se
but you can obviously tell that this
is not what the programmers envisioned
or we're seeking this example is one of
my favorites this is a simple game the
blue humanoid is trying to get across
the line and the red humanoid is trying
to stop it they seem pretty evenly
matched until the red AI discovers a
novel technique of just collapsing on
itself and playing dead and surprisingly
this yields outstanding results in no
real-world scenario with this tactic
work but the AI discovers optimal
strategy sometimes AI behaves in very
interesting ways even when the human
programmers think the incentives and
goals are very well aligned this my
friends is AI alignment or value
alignment how do we ensure that AI will
make decisions in line with what we want
instead of perhaps the easiest way for
it to train moreover how do we control
something smarter than ourselves
you can't check every decision that a AI
is making and even if you could you
probably wouldn't understand them but
you may be able to check the core set of
principles that the AI relies on and
make sure they're in line with our own
one famous example of these core
principles are Asimov's law of robotics
these laws prevent the robot from
injuring a person or itself but what if
a robot needs to make a difficult
decision that humans have spent many
thousands of years attempting to grasp
for example autonomous cars are
barreling down the road at 100 miles an
hour when one car sees four people
standing directly in its path it can
swerve likely destroying itself and
potentially the driver in it (you) or
maybe it ought to maintain its course
while prioritizing its life and the
human occupant. Is one life worth more
than those
or lives? Is there a difference between
passively taking life and making a
decision to actively take life? What if
four people in the road were breaking
the law by jaywalking in the first place?
What if those four people were 85 years
old? this dilemma is commonly referred to as the trolley problem and it begs some
very interesting questions if we can't
even align humanity on what one ought do
how do we propose aligning robots check
out the moral machine dot net for
examples of this self-driving trolley
problem it's pretty crazy there are a
lot of wicked smart thinkers working on
value alignment right now and this is a
wicked problem odds are some future
danger of AI doesn't come from some
Terminator style robot that wants to
exterminate humans it's more likely
they're just trying to optimize whatever
we originally asked them to do perhaps
they take it just a little bit too
literally
just like King Midas that's why we'd
rather have them optimized for what we
want them to do and if we get value
alignment right their robot may even
want us to turn them off in dangerous
cases here's two examples. In example
#1 Mickey is training an AI
broomstick to carry buckets of water and
then dump them in a cauldron form him this scenario goes horribly awry when Mickey
is drowning in the water that they
continue to dump in the cauldron even
after the broomsticks are under water
these broomsticks are doing a great job
at maximizing the reward function, not to
Mickey's benefit. Example #2
Mickey may generally train the
broomsticks to help him do chores they
helped him sweep carry water whatever
Mickey WANTS them to do moreover and
this may be even more important than
screwing up Mickey's chores
if the broomsticks did or was going to
do something wrong broom stick #1 would
actively fight Mickey if he was trying
to turn off the burn stick if Mickey
turned off broomstick #1 he
wouldn't be able to keep filling that
cauldron and optimize his reward an
example #2
If Mickey turns off the broomstick
it's because he doesn't want it to do
something dangerous if Mickey went to
turn the broomstick off because it was
doing something wrong or dangerous it
would know that this is aligned with its
value the broomstick is trying to do
what Mickey wants it to do and therefore
it may even stop knowing that Mickey's
trying to turn it off because it knows
that it's doing something wrong or about
to do something wrong that's AI
alignment! there are obvious challenges
and issues in AI alignment, and hopefully you
have somewhat of an overview now this is applicable now and not just in the
remote future when you are analyzing AI
behavior think about what it is trained
to do was it trained to solve some
complex puzzle or was it taught to
maximize a score if it was taught to
maximize that score it probably found
some way to optimize for that score
that's a little bit different than
actually solving the puzzle this isn't
cheating this is optimizing a reward and
that's the way we train most of the advanced
AIs today. If you're still here that
means you like these videos definitely
hit the thumbs up button, hit that like,
and subscribe so you can see more
awesome content like this. This is a
difficult problem and very important to
solve it will definitely be interesting
to see how this work and research go in
the future
this is Jerry... SEEYA!
