Hello.
My name is Jordan Santell, from the Google Daydream
WebXR team.
I'm here to share our work on augmented reality
on the web.
We're exploring how AR and VR will work on
the web.
What is augmented reality?
Pokemon Go, filters, you know, augmented reality.
But it's the compositing of the digital
world onto the real world.
It's similar to VR.
If you take AR and everything is in the virtual
world, all opaque, then that becomes VR.
So, immersive computing is technology that
operates on human immersion.
When that's on the web, we call it the immersive
web.
So, we have been dreaming a lot about the
future of the immersive web five to ten years
from now.
And I'm excited to share with you the story
of our foundational block of work for this.
Bringing augmented reality to the web.
And I'll also talk a little bit about what
could be coming up in the future, what could
be going on ten years from now, the challenges
it brings, and why this is an exciting time
to jump into a new medium.
Last summer Apple released ARKit and Google
released ARCore. These are for iOS and Android devices,
enabling mobile developers to create augmented
reality applications, and they provide real-world
understanding so developers can create content
that interacts with the real world.
So, developers having easy access is only
one half of the equation.
The other half is do users actually have access
to AR capable devices?
Previously this was limited to dedicated headsets,
expensive dedicated headsets and less stable
platforms.
But these new platforms, ARCore and ARKit, only
require sensors like a gyroscope, an accelerometer,
and an RGB camera, things you can find on most
modern smartphones.
No special hardware is needed.
There are around 600 million ARCore and ARKit
devices.
It's the perfect time to start bringing augmented
reality to the web.
And there's like 200 megabytes of animated
GIFs here, so, it might be a little slow.
We built some prototype browsers, WebARonARKit
and WebARonARCore.
Great names.
And these are hacky WebViews we used to expose
features from ARKit and ARCore to web developers.
First, we needed to identify which features
we needed to create these experiences.
We identified three high level requirements
we would need to bring to the web.
First off, we need to know where the device
is in 3D space.
Its position and orientation in order to sync
our virtual world on top of the real world.
VR has similar needs; you have to track the
headset there.
We were able to leverage a lot of the WebVR
API here.
Sometimes this is referred to as six DoF,
or six degrees of freedom.
And that's referring to tracking three axes
of position and three axes of orientation.
We need to expose the camera stream and intrinsics.
Intrinsics in this case are things like the
field of view and perspective of a camera.
Again, in order to sync our virtual world
with the real world.
Mobile platforms only allow one process to
access the camera at a time.
Since our AR platform is using the camera,
we can't use getUserMedia, for example.
We have to have another way to expose this
information.
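To make "intrinsics" a little more concrete, here is a rough, purely illustrative sketch (not part of any API) of how a camera's vertical field of view and aspect ratio become the kind of perspective projection matrix used to draw virtual content over the camera image. In practice the AR platform computes this from the physical camera and hands it to the page, since the page can't open the camera itself.

```javascript
// Hypothetical illustration only: build a column-major perspective matrix
// (WebGL convention) from camera intrinsics like the vertical field of view.
// Real WebXR hands the page a ready-made projection matrix per view.
function projectionFromIntrinsics(fovYRadians, aspect, near, far) {
  const f = 1 / Math.tan(fovYRadians / 2);
  return new Float32Array([
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (far + near) / (near - far), -1,
    0, 0, (2 * far * near) / (near - far), 0,
  ]);
}

// e.g. a phone camera with roughly a 60-degree vertical field of view:
const proj = projectionFromIntrinsics(Math.PI / 3, 16 / 9, 0.1, 100);
```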
And finally, we need to be able to understand
the world around us in order to interact with
the real world.
So, there's this broad, catch-all term, world
understanding, for the ability to interpret the
real world.
So, for example, finding surfaces to place
an object on, or estimating the light in an
environment, so you could have characters that
get scared when the lights go out.
So, these three components are our MVP feature
set.
And we also, of course, had some goals for
this experiment.
First off, general API exploration.
We were going to look into the ergonomics
and feasibility of the different features that
we expose.
So, for example, earlier prototypes used Tango
smartphones with infrared depth cameras on
them.
Similar to a Kinect.
So, this gave us around 30,000 3D points
per frame, positioned in the real world, to
use in our experiments.
So, needless to say, pushing 30,000 3D points
from the platform to web content, and sometimes
in this case to the GPU, was not ideal.
So, we moved our initial API away from point
clouds, which were difficult to use in a performant way.
We also wanted to discover what's possible,
what kinds of things we can make.
With pose tracking and the camera feed, we
can do things like build portals.
And since this is the web, we can use physics
libraries and surface detection to make an
AR mic drop.
We had to build tools as we were developing
these experiences to debug what we're doing.
It was very difficult before we had these
tools to do things like visualize the surfaces.
So, we know: are we finding a plane?
Are we missing something? Did I break something?
What's going on here?
And we also want to get developers excited
about these features.
We wanted developers to bug browser vendors
and say, we want this.
Please implement this.
And convince organizations to check it out.
Maybe this is something we can use.
And see what cool ideas people came up with.
So, concurrently, while working on these prototypes,
folks from a handful of browsers that you
see up there were working on WebVR 2.0.
And it was renamed to WebXR to support both
virtual reality and augmented reality experiences.
There's now an implementation of WebXR inside
of Chrome Canary, I think the latest one,
behind a flag.
And all of these browsers that currently support
VR have pledged to support the WebXR spec as
well.
After about nine months of prototyping and
working with standards, we now have a working
version in a real browser.
So, kind of, asterisk.
So, when I wrote this proposal for JSConf
EU months ago, I had no idea
what state this would be in by now.
But I got pretty close.
It's about, like, two weeks away.
We're going through the review process right
now.
And who knew that shipping code to a billion
plus users was difficult?
So, it will be behind multiple flags.
And the API will change, for sure, over the
next several versions.
It's very in flux.
So, now that we have these features enabled
in a real browser, we were able to show this
off at Google IO a few weeks back.
We have this educational site showcasing a
statue from Mexico.
It's kind of like a Wikipedia-type article
with a model on there.
And we wanted to show off the progressive
enhancement concept.
Something very webby.
So, if you're viewing this on a desktop or mobile
browser, you can move the statue around with
mouse and keyboard and see the WebGL content
in the viewer.
But on a browser that supports AR features
with WebXR, you can place it in the world
around you.
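As a rough sketch of that progressive-enhancement flow (not the demo's actual code, and written against the early, still-changing API), feature detection might look something like this, where arButton is just a hypothetical button element in the page:

```javascript
// A minimal sketch of the progressive-enhancement idea. Everyone gets the
// WebGL viewer with mouse/touch controls; the "place it in your space"
// button only appears when the (early, in-flux) XR API is available.
async function maybeShowARButton(arButton) {
  if (!navigator.xr) return;              // no XR API: stay with the plain 3D viewer
  try {
    await navigator.xr.requestDevice();   // rejects if no XR-capable device is found
    arButton.hidden = false;              // device found: offer the AR path
  } catch (err) {
    // No device (or the flag isn't enabled): keep the desktop/mobile fallback.
  }
}
```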
I swear the performance is better than this
GIF looks.
You can place this life-size statue, and there
are these annotations you can interact with.
Click on them.
And since it's just the web, they render with
HTML and CSS.
All the same web tools you're used to.
When we were showing this off in the web booth
(there's a web booth and an Android booth at
Google IO), we were right next to the Android
booth, so we had a lot of Android developers
check it out.
And their reaction was they didn't quite believe
this was the web.
So, it was a lot of great feedback and confirmation
that, yes, the web can performantly run WebGL
experiences with AR.
So, I mentioned earlier, world understanding.
The initial implementation of augmented
reality in Chrome only implements one piece
of scene understanding: a hit test.
You cast a ray out from the device and see where
it intersects the real world, some surface, a collision.
And this returns a 3D point if one is found.
So, commonly in applications, you have something
that traces real-world surfaces and displays
an indicator to the user, indicating there's
a surface found here or no surface,
and where that collision occurred.
So, how we implement this reticle behavior
is: on every frame we cast out a ray from the
device and see if we get a hit.
If it hits a surface, we change the reticle
to provide feedback.
In this use case, we say, hey, we found a
surface.
Tap again to place a model.
So, I'm going to show off some example code
of implementing this reticle.
And I don't want to get too deep into this
at all for several reasons.
One, it's not in Canary yet.
And there will be better resources to come
than me talking about it.
Like a Code Lab and other documentation.
This code is the same as if you're doing a
VR experience with WebXR.
We can request a device from navigator.xr.
And we need to create an XR presentation context
from a Canvas element.
This is similar to getting a WebGL context
from Canvas.
And this is visible to the user and injected
into the DOM.
And as a big caveat, there's absolutely no
error handling here, which you should definitely add.
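Roughly, and with the caveat that this is the early draft API as it existed in Canary at the time (names have changed since), that setup looks something like this:

```javascript
// Early-draft WebXR sketch; no error handling, as noted above.
async function initXR() {
  const device = await navigator.xr.requestDevice();

  // The output canvas is what the user actually sees: it goes into the DOM,
  // and its 'xrpresent' context is the XR presentation context we hand to
  // the session in the next step.
  const outputCanvas = document.createElement('canvas');
  const xrContext = outputCanvas.getContext('xrpresent');
  document.body.appendChild(outputCanvas);

  // ...session setup continues below.
}
```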
So, then you also request a session on that
XR device, passing in our XR presentation context.
This requestSession is the part that's
not fully spec'd out for augmented reality
yet.
Ideally in the future this is where you would
request specific augmented reality features.
So, you can kind of react to different environments,
different platforms can support different
features.
And right now, once this lands, if you enable
the augmented reality flag, everything's always
AR.
The camera feed will always be running, and
the underlying AR platform is always running.
There's battery implications and privacy implications
for using these features.
So, that will change very soon.
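The session request itself is small; again this is a sketch against the in-flux draft, continuing inside the same async setup function from before, and how a page would explicitly ask for AR here is exactly the part that isn't spec'd yet:

```javascript
// Continuing inside initXR() from above (same early-draft caveats): request a
// session on the XR device, passing the presentation context we created.
// There's no AR-specific option here yet; with the flag enabled, every
// session behaves as AR.
const session = await device.requestSession({ outputContext: xrContext });
```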
So, this is a lot of code right here, I know.
So, it's a lot of boilerplate.
This will all be abstracted away behind libraries.
But really quickly, we need to set up another
canvas that we write the WebGL commands to.
This gets moved over to the XR presentation
context we injected into the DOM.
This is the weirdest part of WebXR.
You have two canvases.
One that you write to that is not in the DOM,
and one that displays that content that's
in the DOM.
So, it's very different than other kinds of
3D development.
So, things of note here: we create a frame
of reference for our session, and then we
start a request animation frame loop using
our session's requestAnimationFrame.
You have to hook into the native device in
order to sync everything properly.
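That boilerplate, sketched against the same early draft, looks roughly like the following; the exact way you mark the WebGL context as compatible with the XR device has shifted between revisions:

```javascript
// Still inside initXR(): the second canvas never enters the DOM. We issue
// WebGL commands against it, and the session composites the result into the
// presentation context that is in the DOM.
const glCanvas = document.createElement('canvas');
const gl = glCanvas.getContext('webgl', { compatibleXRDevice: device });

// Tie our GL output to the session.
session.baseLayer = new XRWebGLLayer(session, gl);

// 'eye-level' was one of the frame-of-reference types in the draft spec.
const frameOfRef = await session.requestFrameOfReference('eye-level');

// Kick off the render loop on the session's own requestAnimationFrame so
// rendering stays in sync with the device's pose updates and camera feed.
session.requestAnimationFrame(onXRFrame);
```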
So, similar to this is, again, we're working
with 3D.
So, this is similar to other kind of WebGL
3, JS development where we have a render loop.
So, a render loop we get our devices and that
contains like a matrix representing our device's
position and orientation.
And then we render every frame.
Our virtual content that matches perfectly
with our real world.
And then we call this radical update function.
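A sketch of that render loop under the same caveats, where renderScene and reticle are assumed stand-ins for your own renderer and the reticle class described next:

```javascript
// Early-draft sketch of the per-frame loop.
function onXRFrame(time, frame) {
  const session = frame.session;
  const pose = frame.getDevicePose(frameOfRef);   // device position + orientation

  if (pose) {
    for (const view of frame.views) {
      // view.projectionMatrix carries the camera intrinsics; the pose gives
      // the matching view matrix for the device. Feed both to the renderer.
      renderScene(view, pose);                    // assumed helper, your renderer
    }
    reticle.update(frameOfRef);                   // the custom reticle class, below
  }

  session.requestAnimationFrame(onXRFrame);       // keep the loop going
}
```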
So, this reticle is a custom JS class and
is like a disk on the ground that can trace
surfaces.
And in every frame, we call its update function.
I'm glossing over the details here.
We have an origin point and a direction
vector representing this ray coming out of
our device.
And we have a variable, hits, that's an array
of all the hits that we've found.
And it's easy to not find any hits if we're
in low lighting, or there's no kind of discernible
feature on the surfaces we're looking at.
If we found a hit, great: we take the 3D position
and orientation and use that to position our reticle.
If not, maybe we fade it out or hide it completely.
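The hit-testing part of that update, sketched against the early draft (where requestHitTest took an origin and a direction as Float32Arrays and resolved with an array of hits, each carrying a hitMatrix); reticleMesh is assumed to be something like a three.js mesh with matrixAutoUpdate turned off:

```javascript
// Early-draft sketch of the reticle's per-frame hit test.
async function updateReticle(session, frameOfRef, reticleMesh) {
  const origin = new Float32Array([0, 0, 0]);      // ray starts at the device...
  const direction = new Float32Array([0, 0, -1]);  // ...and points out of the screen

  const hits = await session.requestHitTest(origin, direction, frameOfRef);

  if (hits.length) {
    // Found a surface: snap the reticle to the hit's position and orientation.
    reticleMesh.visible = true;
    reticleMesh.matrix.fromArray(hits[0].hitMatrix);
  } else {
    // Low light or featureless surfaces often mean no hit: hide the reticle.
    reticleMesh.visible = false;
  }
}
```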
So, no worries if that didn't completely make
sense.
The WebXR Spec is difficult to use, especially
if you're not familiar with 3D on the web.
I wanted to share the high level concepts
and show what it looks like.
And there will definitely be libraries that
will abstract this away.
So, what's next in the near future?
Like I said, sometime in the next few weeks
AR will land in Canary behind a flag.
And with this flag, like I said, augmented
reality will always be on.
And we want this to be more fine-tuned, so
the requestSession API will have to be fleshed
out.
And right now, we're only supporting hit tests,
that abstraction over the underlying surface
or mesh data that the AR platform provides.
But in the future, we would like to have point
clouds, surfaces, meshes, and light estimation,
as well as things like anchors, which are abstract
3D points that you can place around the world.
Over time, as the system understands the scene
better, that will update and become more accurate.
And right now, we're hand-coding all this
XR stuff.
There are a lot of rough edges.
The spec is still under development.
And we're making tools internally as needed.
So, libraries like three.js currently support
some XR use cases.
And in the future, we imagine they would also
support AR.
And if you're familiar with Mozilla's A-Frame
web library: after we released our prototype
browsers, someone made an AR component a week
later that worked great.
So, that community is very proactive.
We imagine all these libraries will support
creating AR for a better experience, rather
than writing against the API directly.
So, what I'm talking about right now is mostly
mobile AR.
And this provides a lot of immediate solutions
for things like shopping, education and entertainment.
And this is only the beginning as the future
gets much more interesting and weird.
So, that's, you know, the future in the next
few months or a year.
What about the next five years?
A little further out we can imagine computer
vision and machine learning being used with
augmented reality.
Mozilla has been working on their own prototype
browsers using AR.
And they have been doing some research with
computer vision within WebXR.
And something that's been talked about a lot
in the AR industry is this concept of the
AR cloud, or cloud anchors.
Which lets multiple users share their virtual
world.
This feature currently exists in ARCore and
works for iOS and Android.
But this will be critical to have also on
the web.
Unclear if it will be different services or
part of the native API.
No idea.
But this is necessary and critical for multi-user
immersive content.
And hardware is making a lot of progress.
So, with the latest generation of VR headsets,
these all-in-one headsets, tracking for VR is
much more accessible now, rather than needing
a whole room set up where you're installing,
I'm not sure what they call it, like lasers
to track your movement, and also an expensive
gaming computer, essentially.
So, we're getting closer to these smaller,
more accessible, rather affordable, comparatively,
headsets that people can use.
AR headsets like HoloLens and others are out,
and they're quite expensive, specialized hardware.
And in this head-mounted display AR world,
those will support browsers too.
So, the specification has to work for mobile,
but it also has to work for head-mounted displays.
And the head-mounted displays will have different interaction paradigms.
The HoloLens has pretty good support for hand
gestures to interact with content as well
as voice commands.
So, we seem to be trending towards that:
a limited physical interface, using more
computer vision and machine learning to interact
with this virtual environment.
So, further out in the future again, this
is looking at trends, making informed guesses.
But at the end of the day, they're still guesses.
We have no idea.
Head mounted displays will evolve into hip
glasses with these AR capabilities.
Years out.
And once we have these socially acceptable headsets
and we can anchor content in the real world,
we can reframe what immersive computing will
look like in the future.
Contextualized information that is highly
scalable.
Discovered serendipitously.
Instantly accessible, secure and interoperable
across devices.
And that sounds a lot like the web.
Imagine walking down the street as contextually
relevant content is appearing.
Executing content on demand, that's what the
web is for.
And in this future our immersive user agents
have even more responsibility.
It's necessary for users to be able to control
and filter that content.
Especially since content can be obstructing
our vision when we're crossing the street,
for example.
So, you know, that's pretty important.
We need to get that right.
So, these are just some trends based off of
technology.
But in the future, society will drive immersive
computing more so than technological possibilities.
Cultures will define their relationships with
immersive technology based off of their values
around social intimacy and privacy.
Usage will be influenced by states' legislation,
or indifference, on data ownership, net neutrality,
and corporate regulation.
A society's economic equality will determine
access to these immersive technologies and
therefore its creators, directions and applications.
And the outcome of each group's sociopolitical
decisions will define their relationship with
this technology.
We could have some societies comfortable with
ubiquitous always-on, always-recording head-mounted
displays in public areas, and others using them
only as needed, for education and training.
And they are all intertwined and it's important
to consider all of this within the context
of society.
So, there are a lot of unknowns.
Some are exciting, some are terrifying.
And within these unknowns, that's where we
must exert our influence.
We must design a future that puts humans first,
not existing power structures.
And it's important to listen to tech critics,
sci-fi writers, and groups underrepresented
in tech, and ask: how can this be used for evil?
This ubiquitous always on technology has massive
privacy implications.
And this trend of cloud computing could contribute
to the surveillance state.
Maybe we can steer towards on-device computing.
This is being shopped around to police departments
around the world for facial recognition purposes.
This could threaten civil liberties until
or unless our laws catch up with technology.
Unchecked capitalism could result in hyper
consumerism unless we have the ability to
control this content.
And if this technology isn't accessible to
all, then we can further increase the digital
divide between those who have and those who
have not.
So, we must constantly be vigilant as technologists
and steer this towards good.
Patricia had a great talk yesterday on ethics.
All that applies here as well.
But on to the fun parts: there's a lot to be
excited about.
So, we are in uncharted lands.
We're learning about this medium as it's being
created and as we go.
So, Apple, Microsoft, Google, have all published
these UX guidelines and they're just scratching
the surface of what's possible now and in
the future.
You could invent the next gesture that is
as ubiquitous as scrolling and clicking for
interacting with these virtual objects.
And when we're talking about the web, we have
access to hundreds of thousands of npm packages
that we can use and mix and match in our products.
And there are no immersive technology libraries
currently.
So, you could create the next jQuery or Lodash
or React in these environments.
I was going to say there are no killer apps
for VR or AR, but I played Beat Saber a couple
weeks ago.
It was great.
But what's getting people to jump into virtual
and augmented reality?
To get over the initial friction?
Since immersive computing is a multidisciplinary
field, I think we will see a lot of new collaborations.
Like a fashion designer and a historian or
a machine learning expert and a social worker.
Or in this case, an audio engineer and a Jedi.
So, it's a super exciting time to start these
new collaborations as engineers and designers.
Matt has a great article on AR first applications.
And something that stuck with me here is with
new mediums we mostly just port the previous
or older mediums into it.
When mobile was the new thing, we mostly just
had tiny websites on our phones.
It wasn't until we started leveraging location
or constant access and always connected devices
that we truly leveraged the platform with
things like Twitter and Lyft.
And same with augmented reality.
A lot of augmented reality applications are
just putting 2D content around the world.
A lot of it's like, this could have been a
website.
This would have been easier.
So, we're still discovering what are the applications
that truly leverage this new medium?
And that's it's still important to play
around which explore and have fun with this
medium.
So, before I'm done, I'll leave a few links.
The Immersive Web Community Group is on GitHub;
you can follow along with the spec.
And Mozilla's A-Frame is a great technology
that uses web components to create VR and
augmented reality scenes.
You don't have to be familiar with 3D or WebGL
or any of that stuff.
And the Immersive Web Weekly is a newsletter
I started a few weeks ago that summarizes
all that's happening in this fast-moving space.
Be on the lookout for AR landing in Chrome
Canary very soon.
And hope you all explore the future.
Thanks.
