The Holy Grail of artificial intelligence
is to be able to point a machine at a big
pile of data and have it figure out everything
that's going on and all the relationships
between all the different elements with no
human involvement at all. We are clearly a
long way away from that. Instead, we use pretty
orderly data sets to train our artificial
intelligences. Computers are pretty good at
recognizing handwriting because there's a
large corpus of handwritten documents that
have been transcribed. Similarly, machines
are good at translating from one language
to another because there are a lot of documents
that have been translated already. The recent
advances we've made in being able to identify
images are also because recently there's been
a data set of labeled images on which artificial
intelligence can train. Video, however, presents
new challenges, mainly because, although there's
an armload of video publicly available on
YouTube and Facebook and places like that,
it is poorly tagged - especially the action
that is going on inside of the video. That's
something that Google DeepMind is starting
to tackle. They took a bunch of video, and
they put it on Amazon Mechanical Turk and got people
to label 400 different activities that are
occurring in those videos, like playing tennis.
They're using that to train an artificial
intelligence. In a way, this is a really interesting
problem because every frame of a video is,
in essence, a place for a computer
to make a prediction about the next frame.
And so I think we can expect computers to
make a big advance in this even though it
is seemingly an incredibly difficult area
in which to train artificial intelligence.
If you're interested in artificial intelligence,
visit GigaOM.com or check out my new book
"The Fourth Age: Smart Robots, Conscious Computers,
and the Future of Humanity".
