VIJAY BANGARU: Good
morning everyone.
Welcome to our session.
My name is Vijay Bangaru.
I'm a Product Manager
on Google Docs.
And with me today I have one of
my colleagues, Eric Bidelman,
who will also speak.
And we also have a friend of
ours, Matt Tonkin from Memeo,
who will also help you
understand new ways that you
can connect your enterprise
applications with
Google Docs and sites.
So before we jump into it,
our usual live notes and
questions are going to be on
the Wave, so pull that up.
Get your questions in there
and get them voted on.
We'll address all of those
at the end of the talk.
So today we are talking about
the Documents List API, which
is the programmatic way
for you to access your
users' Google documents.
And how many people here
have used the API or are
somewhat familiar with it?
All right, good.
A pretty good bunch.
We're also going to talk about
the sites data API, which again
is the programmatic way to
access Google Sites and this is
fairly new, it came
out in September.
And how many folks
have used that one?
OK, good so we got a
good mix of people.
And as you probably know these
are both part of the family of
Google APIs, the GData APIs.
So we're going to start
with a little bit of
an overview on those.
So this is a common
standard for Google.
A ton of our applications
offer this API and it ranges
in terms of offering.
In addition to the Docs and
Sites ones you have your
YouTube one, you have your
Analytics one and so on.
So there's a lot of
great things here.
The protocol itself is all
based off of common standards
and open protocols that we
all know and love already.
So you have your RESTful API
with your GET/POST/PUT/DELETE
for your full CRUD operations.
You have the Atom publishing
protocol and we do
extend it in a few ways.
Most importantly where you're
extending the opposition and
versioning into that protocol
so that that tricky stuff is
just built into the standard
already and you guys don't have
to waste your time building it.
The protocol itself of course
is XML based, but we provide
client libraries in just about
every flavor you could want.
And we find most of the
developers are just using these
libraries instead of speaking
the protocol directly.
And Memeo is a good example of
this and they'll talk about
that in a little bit as well.
And in addition to the
libraries that you can find at
code.google.com we got the
documentation and samples.
So we hope we have it set up
so you can hit the ground
running on these APIs.
So once you do have the APIs
you ask yourself, what
can I build with them?
And really the answer is
pretty much anything.
Especially with Docs and Sites
we try really hard to make it
so that if you wanted for
whatever reason, you could
rebuild our applications
completely on the API.
The same functionality you get
in the app you're going to
be able to get in the API.
So that means you can
start building various
mashups and stuff.
And one I thought of this
morning that I wish I had is
you know, when I look at my
week's calendar and there's
several meetings in there I
wish there was a button where
it just auto-generated meeting
notes and agendas
and stuff for me.
So it looks, it'll find my
appointments, it'll find the
contacts in there, it'll
create a doc, set the title
appropriately with the same name
as the calendar invite, share
it to everyone and we're done.
And if that does it for my
entire week of meetings I
just saved 30, 40 minutes
of work probably.
And so that's a very simple
case of things you can do
combining these apps to provide
a nice experience for your users.
So why Docs and Sites?
You guys probably know this
already, but when you get down
to it in terms of a
productivity suite that gives
you collaboration and
accessibility this
is one of the best.
In terms of collaboration, we
keep a single copy that is the
master copy in the cloud so
you don't have to e-mail
versions back and forth.
You don't have to worry about
if you have the right version.
All that's going to
be managed for you.
And in terms of accessibility,
it's in the cloud.
If you can access the
cloud it doesn't matter
which device you have.
Doesn't matter where you are.
It doesn't matter what time it
is, you have your content.
And that simplifies even
mundane decisions like when I'm
about to get on the plane,
which of these seven laptops,
tablets, iPads and
phones do I pick?
It doesn't matter.
They can all access it,
so you just pick one
and you're good to go.
And you combine into that that
these are web applications.
So the great thing about web
applications is you have a
really nice feature
development cycle.
We push features all the time.
We push bug fixes
all the time.
The pace is much, much faster
than a traditional client app.
And then you add in the cost
and you put all those things
together including the low
cost, we're finding that a ton
of users are gravitating here.
More are coming each day.
So as developers, this is a
great platform for you guys
to develop on to capture
some of these users.
So we'll jump into a bit on the
Documents List API itself.
Last year when we were here we
announced a bunch of features.
We had folder management,
sharing, some more metadata,
some export capabilities,
advanced queries and in that
year we've been really busy and
a big milestone we had was
Version 3.0, which came
out in September.
So in addition to features and
bug fixes we added speed for
both you, the developer and we
also did refactoring for
ourselves to make it easier to
launch more features and this
should be apparent hopefully as
I walk through some
of these slides.
So 3.0 this was in September.
One of the headline features
here is additional support for
people writing sync clients and
when Matt's up here he'll
tell you all about that.
The ability to upload
and download PDFs.
And so those are some
of the features.
We included shared folders and
in terms of sort of bug fixes
and rounding out the corners
you used to just be able to
move a document to the trash in
the API, you couldn't actually
get it out of the trash can and
away for good, so we added
permanently deleting documents
and then the nicest feature
there, speed. 40% faster.
So we're very proud of that.
Well we didn't really
rest on that.
So since then we've added
a bunch more stuff and
we're going to go into
a lot of these details.
We added OCR, we added
translations, we added
resumable uploads, so you can
upload large files to the
cloud and then the ability to
store those files there.
So upload any file, which is
one of our headline features.
So before we go into the
details on those features
we should talk about
where the magic happens.
And the magic for this API all
happens at the feeds level.
So docs.google.com/feeds.
And basically we have a set of
feeds that'll give you
different bits of functionality
and it's all pretty
straightforward.
So the main feed is going
to let you create, batch,
update all your documents.
Your basic CRUD
operations are there.
The ACL feed is just
like it sounds like.
This is going to let you do
all the sharing operations
that you'd like to do.
The media feed lets you update
the document's content.
The revisions feed,
surprisingly enough.
It lets you access
the revisions.
So pretty straightforward
stuff here.
Foldering stuff's going
to let you folder.
The export feed is going to let
you point at a particular item
and export it in the format
that the user wants, so we have
a variety of formats for
each of those types.
And the metadata feed is going
to give you information on what
features are enabled
for that user.
And Eric's going to go
into detail on that
one in a little bit.
So jumping into the
advanced features.
The first one I wanted to talk
about is document translations.
So when you're uploading a
document you can specify source
language and target language
and we're going to translate it
so the copy you get on the
server is going to be
the translated version.
If you choose you can just omit
the source language and in
most cases we can detect it.
In the rare case we can't, we
throw an error and of course,
this is all documented
up on code.google.com.
And not every source target
pair is supported today, so I
think if you're trying Estonian
to simplified Chinese you
might be out of luck for now.
But you know, we're
working on that.
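As a rough illustration of the translate-on-upload option, here is a minimal Python sketch that builds the upload URI. The feed path and the `sourceLanguage` / `targetLanguage` parameter names are assumptions made for this example and are not spelled out in the talk; check them against the documentation on code.google.com.

```python
from urllib.parse import urlencode

# Assumed upload feed; the talk only says the feeds live under docs.google.com/feeds.
UPLOAD_FEED = "https://docs.google.com/feeds/default/private/full"


def translation_upload_uri(target_language, source_language=None):
    """Build a POST URI that asks the server to translate the upload.

    Leaving source_language as None omits it, so the server has to detect it.
    """
    params = {}
    if source_language is not None:
        params["sourceLanguage"] = source_language  # assumed parameter name
    params["targetLanguage"] = target_language      # assumed parameter name
    return UPLOAD_FEED + "?" + urlencode(params)


print(translation_upload_uri("de", source_language="en"))
print(translation_upload_uri("fr"))
```

Omitting the source language mirrors the detection case described above; a real client would also need to handle the error returned when detection fails or the language pair is unsupported.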
Next one, upload any file, so
this is one of my favorite
features, which we
launched back in January.
And in a way this is a natural
progression for Google Docs.
We started with having our own
native formats for documents,
spreadsheets, presentations and
we added the ability to upload
and download PDFs and we
added shared folders.
And we found in our internal
use and talking to customers
and developers that a lot of
people have built their entire
team collaboration around
the shared folder.
All the stuff for the project
is there and then they're
running into hurdles.
So we talked to an architect
who said, look, all my plans
and stuff are there, but I have
pictures of the build site.
It'd be really nice if I could
put those pictures there.
And we talked to people who
were in mixed deployment setups
where half the company uses
Google Docs and some are still
using Microsoft Office and
they're like, look, I've got
this sales deck, it's our
master sales deck,
it's 90 megs.
No one wants to
touch it anymore.
We'd really like to just put
it up there and be done.
And so to support those
scenarios we said, yeah,
that makes sense.
Let's make it so you can
host files up in the cloud.
And like I said, we did
that in January and we've
gotten great usage.
And the usage falls into
three models and this gets
interesting when you start
thinking about what kind of
apps you want to build.
First is a simple collaborative
model where previously with
your team you'd e-mail the
document back and forth and
edit it and have to keep it in
sync and someone's working
on the old version and
everything gets complicated.
That's kind of gone
out the window now.
People are just uploading
it into the cloud.
There's one place.
It's always up to date.
The other thing is
non-collaborative.
So it's just me and I'm
thinking, man I'd really
like to work on this
file when I get home.
So I e-mail it to myself
and then access my e-mail.
Well, that's just
kind of stupid.
The other case is I have files
I'm worried about losing so I
stick them on a thumb drive or
I e-mail it to myself as the
back-up, just in case,
which again is silly.
Upload it to the
cloud, you're done.
That's what we kind
of built this for.
The third case we're seeing is
a lot of developers like you
guys building synchronization
clients, which are synching my
cloud content to my PC, to
multiple PCs, to different
devices and Matt's going to go
into that use case quite a bit.
We found that that's a
very compelling use case.
Some details on this, so while
you are uploading, normally
things are uploaded, they get
converted to Google
Docs format.
If you specify the convert
equals false flag we're just
going to upload it as is.
So the exact ones and zeros
you give us, we'll give back.
These files do take up quota,
so it's not free storage.
And there's a few limitations.
And one of those is the API
access for this is for the
premier customers only.
The max file size
is one gigabyte.
And everyone gets one gig free
quota and beyond that you can
pay more at fairly
reasonable rates.
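Because unconverted files count against quota and have a size cap, a client may want to check both before it starts an upload. The sketch below is one possible pre-flight check in Python. The function name is hypothetical, and treating "one gigabyte" as 2**30 bytes is an assumption.

```python
ONE_GIB = 1024 ** 3  # "one gigabyte" from the talk, assumed here to mean 2**30 bytes


def check_upload(file_size, quota_used, quota_total, max_file_size=ONE_GIB):
    """Return None if the upload should fit, otherwise a short reason string."""
    if file_size > max_file_size:
        return "file exceeds the maximum file size"
    if quota_used + file_size > quota_total:
        return "not enough quota left"
    return None


# A 90 MB sales deck against a mostly empty 1 GiB quota.
print(check_upload(90 * 1024 ** 2, quota_used=10 * 1024 ** 2, quota_total=ONE_GIB))
```

Checking locally does not replace handling the server's own rejection, since quota can change between the check and the upload.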
Next feature I want to talk
about is OCR, so the same idea.
It's a feature flag on upload.
If you're uploading a photo and
you want to extract the text
you can call this option and
it's simple: in the POST
URI, you say ocr=true.
And instead of giving you a
photo on the server side we're
going to give you a Google
Document with the
extracted text.
There's some limitations here.
First OCR is fairly
expensive to do.
So it does take awhile
and you want to program
around that constraint.
In general, OCR is garbage in,
garbage out so the better
quality image with the more
readable characters, the
better your results will be.
For now this works only with
the Latin character set and it
is an experimental feature, so
we do have it throttled a bit.
But your paying apps users
have a much higher throttle
point, so we don't expect
to see any problems there.
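Since OCR is slow and throttled, one common way to program around that constraint is to retry with exponential backoff. The following is a minimal Python sketch of that general technique, not something shown in the talk. The `Throttled` exception is a hypothetical stand-in for however your HTTP layer reports a throttled request.

```python
import time


class Throttled(Exception):
    """Hypothetical error raised by the caller's HTTP layer when throttled."""


def with_backoff(operation, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on Throttled, wait base_delay * 2**n and try again.

    The last failure is re-raised once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except Throttled:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

Passing `sleep` as a parameter keeps the helper easy to test and lets a UI thread substitute its own timer.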
So for this one I'll do a quick
demo on this and this is up on
the code samples page as well.
So I pulled up the demo page
and I'll go through my usual
sign in process and yep,
I'll grant access.
OK, and so now I will choose a
file and if you want to play
with this we actually have the
file right here for you so you
can download it and try it out.
Pull that up.
OK.
And we'll say start.
And so like I said, it's going
to take a little bit of
time because it is an
expensive operation.
But I think you can see some
of the neat things you
can build around this.
If you were for example,
building an expense report
system you can set it up so
that people can just
scan their receipts.
You can extract the text
and toss it into the
program for you.
So here you see what
we pulled out.
It's pretty good if you compare
the photo and what came out and
it even calls out places
specifically where it was
unable to find or figure
out what it was.
We think that'll be really
useful as you build
some of these apps.
And with that I'm actually
going to hand it over to Eric.
He's going to walk you through
more of these features and then
we'll call up Matt and he's
going to go into some
details on Memeo.
So, Eric.
ERIC BIDELMAN: Thank you Vijay.
Great, so some more advanced
features for V3 of the API.
One thing that we launched in
regards to the arbitrary file
upload that happened in
January was a metadata feed.
And this really came out
of third-party requests
from our developers.
They wanted to be able to
determine what type of
account a user was using.
As Vijay mentioned the upload
any file is restricted
in regards to the API to
apps premier accounts.
And we didn't really have a
great way to determine that.
But what we launched
recently is a metadata
feed and it's read-only.
And what you get back you
can see it's fairly simple.
You send a GET request to
the metadata feed URI.
It's a private feed so
you're going to include an
authorization header and
then a version number.
So we're talking to version
3 of the API with that
GData version header.
And what you get back is just
an entry with a bunch of
metadata and properties and
features of the account.
So there's things like
quota bytes used.
You know, how much
quota is it taking up.
Is OCR rate limited.
This particular account that
made this request, you can tell,
is a premier account because it
has the upload any
feature name set.
Then you get other properties
such as the upload size
for each type of document.
There's also, what's not
pictured here is the export
formats for each document.
So as you know, Google Docs
allows you to export documents
and spreadsheets into
different formats.
PDFs, Microsoft Word formats.
So, very useful for customizing
a client UI based on the
feature set that's
making that call.
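To make the idea of customizing a client from the metadata entry concrete, here is a small Python sketch that reads quota and feature flags out of such an entry. The sample XML, its element names, and the namespace URIs are assumptions modelled on the description above, not copied from the slide.

```python
import xml.etree.ElementTree as ET

# Namespace URIs and element names are assumptions for this sketch.
NS = {
    "gd": "http://schemas.google.com/g/2005",
    "docs": "http://schemas.google.com/docs/2007",
}

SAMPLE = """\
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:gd="http://schemas.google.com/g/2005"
       xmlns:docs="http://schemas.google.com/docs/2007">
  <gd:quotaBytesUsed>1201</gd:quotaBytesUsed>
  <docs:feature><docs:featureName>ocr</docs:featureName></docs:feature>
  <docs:feature><docs:featureName>upload_any</docs:featureName></docs:feature>
</entry>
"""


def summarize_metadata(xml_text):
    """Pull quota usage and the list of enabled feature names out of an entry."""
    entry = ET.fromstring(xml_text)
    used = entry.find("gd:quotaBytesUsed", NS)
    features = [el.text for el in entry.findall("docs:feature/docs:featureName", NS)]
    return {
        "quota_bytes_used": int(used.text) if used is not None else None,
        "features": features,
        "can_upload_any": "upload_any" in features,
    }


print(summarize_metadata(SAMPLE))
```

A client could use `can_upload_any` to decide whether to show an "upload any file" button at all.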
Another great feature, and a very
useful one if you're going
to upload gigabyte files to
Google Docs you're going to
need a reliable upload method.
So there's the 250 megabyte
limit on the normal upload
path if you've ever
used the API before.
But what we actually launched
recently was a resumable upload
protocol for the Google Data
APIs, so right now the YouTube
API and the Google Docs
List API implement this.
And this is just a more
reliable, more robust feature
for uploading large files.
And what your client does is
actually make a post request
saying I want to start this
process and then you get
back a URI to upload
chunks of a file to.
So you can upload a
megabyte at a time.
You can upload 10
megabytes at a time.
It's really up to you
and your client.
And I'll lastly just say that
the client libraries support
for this is really great.
So we actually implemented this
in all the client libraries so
you don't have to go into
the raw protocol itself.
But let's just take a look and
see what that looks like.
So bare bones, the raw protocol,
the GData protocol for
this, is you make
a POST request.
You say I want to start
this process to the
create session URI.
We'll include the
version header and the
authorization header.
And then the two or three
important parts of this
particular request here are the
X-Upload-Content-Type and the
X-Upload-Content-Length.
So this is the file.
We're going to create a Zip
file in Google Docs and we'll
tell it the file size and
the application type.
And then the title of the
file will be my title
when it's done uploading.
So like I say, what you get
back is this unique location.
It's a token based session URI
that you're going to send the
chunks of the file to so
you'll make multiple put
requests to this URI.
You'll send a content type
just so the server knows you're
still talking about a Zip
file and the important bit of
this is the content range.
So this is the first 10,000
bytes of this Zip file.
And then of course the body is
the 10,000 bytes that you're
talking about of the file.
You'll keep doing that.
The server will
respond with a 308.
It'll tell you it's
not complete.
It'll tell you the content
length and the range that it
knows about so maybe just
because you sent the first
10,000 bytes doesn't mean the
server actually processed that.
So you'll want to continue the
file with the range header
that you got back.
And so what you get back
eventually when the
file's all done, you've
uploaded the total file.
You'll get back a 201 Created
with the Atom entry, the title
and the quota bytes used.
There is quota bytes used in
this case because it's a Zip
file, it's an arbitrary file.
If it was just a regular
doc file that we converted
that would be set to zero.
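The byte bookkeeping in that exchange can be sketched with two small Python helpers: one builds the Content-Range value for a chunk, the other works out where to resume from the range the server acknowledged. The function names are made up for this example, and the `bytes=0-9999` format of the acknowledged range is an assumption.

```python
def content_range(start, chunk_len, total):
    """Content-Range value for chunk_len bytes starting at byte offset start."""
    end = start + chunk_len - 1  # byte ranges are inclusive
    return "bytes {}-{}/{}".format(start, end, total)


def next_offset(range_header):
    """Next byte to send, given the range the server says it has stored.

    Accepts values such as 'bytes=0-9999'; a missing header means nothing
    has been stored yet, so the upload starts from byte 0.
    """
    if not range_header:
        return 0
    last_byte = int(range_header.split("-")[-1])
    return last_byte + 1


print(content_range(0, 10000, 1234567))
print(next_offset("bytes=0-9999"))
```

Resuming from `next_offset` rather than from what the client believes it sent is what makes the upload survive a dropped connection.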
So that's the protocol, but
everything's easier in Python
if you're a Python fan.
So let's dive into
this a little bit.
This is what this looks like.
I mentioned client library
support, so the first thing
you're going to want to do is
of course import the Google
Data Docs client library.
Both the client and the
data portions of that.
And the process is really
the same for all the
client libraries.
You'll create a Docs client
and you'll pass in your
applications name.
You'll need to authenticate and
authorize this app to share
data with Docs, so you'll
either use OAuth or
AuthSub for that.
And I left that as a TODO.
Then we'll do some file I/O.
So we'll read an mpeg file.
We'll get the file size
information from the operating
system and then the client
library provides this really
nice robust resumable
uploader object.
So you're going to create that.
Pass in the client and file
handle, we'll tell it the
MIME type, we'll tell
it the file size.
This has a nice property, so
you can set the chunk size.
So this is the content
range bytes that I was
explaining before.
And then the desired class
is just the type of ATOM
entry, the GData ATOM entry
that we're expecting.
So we're talking to
the Doc List API.
You'll expect a Google Docs
entry. If you're talking to
YouTube you would expect
a YouTube entry.
The next thing we do is we can
set the title via the ATOM
title and then call the
uploader.UploadFile method.
And really in the background
all this is doing is it's
making that initial post
request that you saw in the
previous couple slides.
You're passing in the entry and
it'll literally just, in a for
loop, make those PUT requests and
chunk the file up into
different pieces based on
your chunk size there.
So about 15 lines of Python
to get the resumable
upload protocol within
your application.
Really simple, but really
robust and definitely the way
to go if you're uploading
large files to Docs.
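The slide code itself is not reproduced in this transcript. As a library-free illustration of what the uploader is described as doing in the background, here is a Python sketch of the chunking loop. `upload_in_chunks`, `put_chunk`, and `FakeSession` are hypothetical names, and the fake session stands in for the real session URI, so no network calls are made.

```python
def upload_in_chunks(data, chunk_size, put_chunk):
    """Send data chunk by chunk and return the number of PUT calls made.

    put_chunk(start, chunk, total) must return (status, stored), where
    stored is how many bytes the server reports it has kept so far.
    """
    total = len(data)
    offset = 0
    calls = 0
    while offset < total:
        chunk = data[offset:offset + chunk_size]
        status, stored = put_chunk(offset, chunk, total)
        calls += 1
        if status == 201:       # upload finished
            return calls
        if status != 308:       # anything other than "incomplete" is an error
            raise RuntimeError("unexpected status %d" % status)
        if stored <= offset:
            raise RuntimeError("server made no progress")
        offset = stored         # resume from what the server actually has
    return calls


class FakeSession:
    """In-memory stand-in for the session URI, for illustration only."""

    def __init__(self):
        self.stored = b""
        self.first_call = True

    def put(self, start, chunk, total):
        if start != len(self.stored):
            raise ValueError("chunk does not continue the stored bytes")
        if self.first_call:
            self.first_call = False
            chunk = chunk[: len(chunk) // 2]  # pretend only half was processed
        self.stored += chunk
        status = 201 if len(self.stored) == total else 308
        return status, len(self.stored)
```

This sketch uses a `while` loop driven by the server's acknowledged byte count rather than a fixed `for` loop over chunks, so that a partially processed chunk is resent from the right place.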
Lastly I just printed out
the title and the quota
used for that file.
So next I'd like to invite up
Matt Tonkin from Memeo who has
some really great real world
experience implementing some of
these protocols and he can tell
you some challenges and
some tips and tricks.
MATTHEW TONKIN: Thanks Eric.
So what we've heard about is
the ability to be able to
upload arbitrary files
to Google Docs.
Now that's a great feature,
but you can't edit all
those files online.
There simply aren't the
online editors for it.
So Memeo Connect we designed as
being the bridge between that
storage that's available in
Google Docs and your
desktop software.
You know, so if you drag and
drop a Photoshop file over into
Memeo Connect you can edit
in Photoshop locally.
And when you edit it
those edits are then
pushed up online.
And if something's been edited
by somebody else that can
actually edit with you that
gets taken down from online and
sent back to your client.
It's available for
Windows and for Mac.
And we have an iPad
reader as well.
But the big challenge
for us was sync.
It's not just about getting
documents up into the cloud
and bringing documents
back down again.
You have to know
when to do that.
As soon as there's an edit
locally in the document,
what we need to do is then
go and upload that and
replace the document.
And then download if somebody's
edited the document, like I
said, download it back
to the client again.
Now in Google Docs there's
no set way to be able
to handle conflicts.
And there is the
potential for conflicts.
And the last thing you really
want to do is lose users' data
because they tend to not like
that very much at all, but it
comes down to how it is that
you're going to deal with it.
That's a challenge for your
application, to be able to
figure out how you're going
to handle conflicts in
the context of your app.
A lot of the solutions that we
used were really pretty
straight out of the box as far
as the API is concerned.
ETags.
So ETags are like a little
string that's attached to a
document and that ETag will
change, it'll cycle, do its
thing, change every time
there's a change
on the document.
So if there's any metadata
changes, if there's a title
change, if it gets starred, if
there's an ACL change your
ETag will always change.
The thing that doesn't
change is the updated date.
The updated date will only ever
change when the actual content
of the document changes.
So between those things you can
pretty much figure out, OK,
when do I actually need to
download the file again and
when do I just need to go and
update the metadata
for the file?
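That decision can be written down as a small function. This is a minimal Python sketch under the rules just described (the ETag changes on any change, the updated date only on content changes). The dictionary shape and the action names are made up for the example.

```python
def sync_action(cached, remote):
    """Decide what a sync client should do for one document.

    cached and remote are dicts with 'etag' and 'updated' keys;
    cached is None when the document has never been seen locally.
    """
    if cached is None:
        return "download"
    if remote["etag"] == cached["etag"]:
        return "nothing"            # nothing at all changed
    if remote["updated"] != cached["updated"]:
        return "download"           # content changed
    return "refresh_metadata"       # title, star, ACL, and so on


old = {"etag": "a1", "updated": "2010-05-01T00:00:00Z"}
print(sync_action(old, {"etag": "b2", "updated": "2010-05-01T00:00:00Z"}))
```

Only the "download" branch moves file content, which is the expensive part for large arbitrary files.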
The document feed, when we
talk to the API the document
feed pretty much just gives a
list of documents that we have
available in Google Docs.
But that list is growing
and it's growing at a
very, very rapid rate.
Thanks to things like
arbitrary file upload.
Thankfully the Document
Feed API has a rich
querying language.
So we can specify parameters
and get back just a list
of matching documents.
And that's really quite
powerful because what
we can do is say, OK.
We're going to just look at
the documents that have
been changed since
a particular date.
So since we last went and
refreshed that document feed
let's just grab those
changes and bring that
down to our client.
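A "changed since" request might be built like the Python sketch below. The feed path and the use of the `updated-min` query parameter are assumptions for illustration; the talk does not name the parameter, so check the query reference before relying on it.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumed feed path; the talk only says the feeds live under docs.google.com/feeds.
FEED = "https://docs.google.com/feeds/default/private/full"


def changed_since_uri(last_refresh):
    """Build a feed URI limited to documents updated since last_refresh.

    last_refresh must be a timezone-aware datetime; it is sent in UTC.
    """
    stamp = last_refresh.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return FEED + "?" + urlencode({"updated-min": stamp})  # assumed parameter name


print(changed_since_uri(datetime(2010, 5, 19, 9, 30, tzinfo=timezone.utc)))
```

Storing the timestamp of the last successful refresh alongside the cached listing is what makes the next launch cheap.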
So a few tips and tricks, a few
things that we learned while we
were developing Memeo Connect.
One is that Google Docs
is not intended to
be a file hierarchy.
It's really a document library
much like iTunes is a
library for your music.
Folders are like playlists.
They're virtual entities.
So an individual file or
a document can exist
in multiple folders.
The problem there is that
basically it doesn't map
to a local file system.
You can't just sort of go
OK, we're going to take
this and we're going
to mirror this online.
We're going to take this
old file system and
mirror it online.
It just doesn't work that way.
And basically you shouldn't
try and force it to do that.
I mean, you can look at your
cat and expect it to bark, but
it's just not going to bark and
it's just not there to be a
hierarchical file system.
It's just not.
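One way to model folders as playlists on the client is to key everything by document ID and treat folder membership as a list of labels. The Python sketch below uses hypothetical data to show a document appearing in two folders without being duplicated.

```python
from collections import defaultdict


def folders_index(documents):
    """Invert a doc_id -> folder names mapping into folder -> sorted doc ids."""
    index = defaultdict(list)
    for doc_id, folders in documents.items():
        for folder in folders:
            index[folder].append(doc_id)
    return {folder: sorted(ids) for folder, ids in index.items()}


docs = {
    "doc1": ["Plans", "Site photos"],  # one document, two folders
    "doc2": ["Plans"],
    "doc3": [],                        # not in any folder at all
}
print(folders_index(docs))
```

A strict directory tree has no natural place for `doc1` or `doc3`, which is why mirroring the library onto a local file system does not map cleanly.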
The next thing is list size.
We had a customer who
originally like back when we
first launched the client,
said OK, we've got this
arbitrary file storage.
I'm going to grab my whole life
and upload it, so you know,
like six gigs, and four and a
half thousand files and it got
85% of the way through and had
a bit of a hiccup and we all
thought gee that's really good.
We did really well on that one.
But you know, ultimately it
just means that people are
uploading more and more files.
Now that they can upload
arbitrary files the
documents list is getting
much, much larger.
And it can take some time
to go and get that initial
feed of the documents.
So optimizing the list size,
optimizing your feed request
is really important.
So just bring down, you know,
figure out what queries you
need to do to bring down the
smallest feed possible.
Persistence was a big
one for us as well.
If you do have a large number
of documents you really want
your client to launch quickly.
So again, that comes down to
optimizing your feed requests,
but if you can persist all
that information, persist it.
So the next time you go and
launch your application the
thing's going to launch quickly
because there is a bit of a
delay as it's updating
the documents list.
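Persisting the listing can be as simple as a JSON file written atomically. This Python sketch is one possible approach, not how Memeo Connect does it; the file layout and function names are assumptions.

```python
import json
import os
import tempfile


def save_listing(path, listing):
    """Write the listing to a temp file, then swap it in with one rename."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(listing, f)
    os.replace(tmp_path, path)  # avoids leaving a half-written cache behind


def load_listing(path):
    """Return the cached listing, or an empty dict if there is no usable cache."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


demo_path = os.path.join(tempfile.mkdtemp(), "listing.json")
save_listing(demo_path, {"doc1": {"etag": "abc", "title": "Plans"}})
print(load_listing(demo_path))
```

At startup the client can show the cached listing immediately and refresh it in the background.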
I suppose get lazy.
Your users generally don't
need to know everything
that exists in the API.
Get lazy is really just
lazy load the things
that you don't need.
You know, these servers they
respond pretty quickly, but if
you don't need it at startup
then just grab it
when you need it.
So a user clicks on a file then
just go and update the stuff
that you need in your
GUI at that time.
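In Python, lazy loading of per-document details can be sketched with `functools.cached_property` (available from Python 3.8). `DocumentItem` and `fetch_details` are hypothetical names; `fetch_details` stands for whatever call your client makes to the server.

```python
from functools import cached_property


class DocumentItem:
    """A list row that fetches expensive details only when first asked."""

    def __init__(self, doc_id, title, fetch_details):
        self.doc_id = doc_id
        self.title = title
        self._fetch_details = fetch_details  # hypothetical callable hitting the server

    @cached_property
    def details(self):
        # Runs on first access only; the result is then stored on the instance.
        return self._fetch_details(self.doc_id)
```

The list view only needs `title`, so nothing is fetched until the user clicks on an item and the UI reads `details`.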
Flexibility was a big
one for us as well.
We found that there just
simply isn't a standard
Google Docs user.
No two Google Docs users are
the same from what we can tell.
Some people have 10 documents.
Some people have
10,000 documents.
Some people want to
see everything.
Some people just
want to search.
Some people want to
share everything.
Some people don't share
anything at all.
So the trick is just to never
assume that you know how
a user's going to use it.
We even found this
internally with Google.
In the feedback we got,
some groups said, we really love
this feature, and some other
groups said, oh, we really
love that feature.
We even found that within
Google they don't use the
Google Docs in the same way.
And now with this arbitrary
file upload we can expect a
lot more aggressive usage of
arbitrary files in the future.
You might be using the Google
Docs List with some purpose,
but the user might have
thousands and thousands and
thousands of files, so it's
something to be aware of.
Informative UI.
I mean, there's some lag in
dealing with the servers.
These things aren't local.
So the server interactions can
take some time and it's good
to keep your users informed.
You know, it's really important
for a great user experience
to keep your users informed.
So you know, if you're updating
a feed, if you're downloading a
document, if you're uploading a
document keep the users
informed all of the time.
And the last point is
client libraries rock.
I mean, for us what we used
mainly was the .NET and
Objective-C client libraries.
And they're essentially object
orientated wrappers for all the
things that exist
in Google online.
And they're wrappers, meaning
that you then don't have
to write all that code.
You know, the client libraries
gave us a really, really
advanced starting point.
And one of the great things
that we found in doing both our
Mac client and our iPad client
is that there's this underlying
library that goes between
those two platforms.
So we were able to write an
engine, a Memeo Connect engine
that sits on top of that and
we were able to bring that forward
when we launched Memeo
Connect for Mac.
Then the iPad came out.
We thought, oh great.
Because it's written on the
same library we just picked up
the same code base, brought it
across to the iPad and we had
an application up and running
in a very, very short
period of time.
So the client libraries we have
found have been really great
and really, really good
support for the Doc List.
A couple of other points is
that this stuff was really
built for the web and not
so much for the desktop.
And throughout the course
of development we did
have a few challenges.
We went back to Google a few
times and we really found
that the API team was
very, very responsive.
Particularly when you're
talking about you know,
sometimes you think you open a
bug online, what happens to it?
It goes into nowhere.
Well, it certainly
doesn't go into nowhere.
It exists.
And we found that the Google
Docs List team was very,
very responsive.
And the rules are there,
if you don't ask, you
won't get, so ask.
A lot of the things that
we've seen come out have been
from us really pushing and asking
for a lot of new features, and we
found them to be
very responsive.
So where we're at today, really, and
this is the call to action, is that
you can now store arbitrary
files in Google Docs.
You can store any file you want
in Google Docs and that's
a pretty powerful feature.
So Memeo Connect is just the
way that we have chosen
to use that API.
When you go to the developers
and essentially now this is
arbitrary file storage in the
cloud, so the world is really
your oyster in that sense.
You've got to look at what your
applications are and how you
can make use of this API.
It's very flexible and there's
a lot you can do with it.
And to close off we will be
launching Memeo Connect 2 in
beta form quite shortly.
And that has our number one
most requested feature,
which is some integration
into the [? ice ?].
So you know, we see essentially
a drive that's for Google
Docs and we can drag and drop
things into it and out of it.
It's going to be a closed
beta to start with.
So if you want to get a hold of
this software and help us to
squash some of the
final bugs, you can go to
memoconnect.com/beta.
All right.
Back to Eric.
ERIC BIDELMAN: Great,
so thank you Matt.
So before we jump into and sort
of switch gears and talk about
the Sites API I just want to
show you-- Memeo is a great
example of a desktop client
that talks to the API.
I'd like to show you what a
third-party developer did and
he came to me in the forums.
In his spare time he's used
the Java client library.
He's used Google Web Toolkit
and App Engine to create a
really interesting use case
for the Doc List API.
So Vijay mentioned this API and
gave you some of the features.
That it's very full-fledged.
Let me just sign
into App Engine.
He mentioned it's very
full-fledged and you can
literally rebuild Google Docs
on top of the API itself
and this is what this
gentleman has done.
So you can tell, if you've ever
used Google Docs before, I'm
sure a majority of you have,
but this looks and feels
exactly like Google Docs.
He's customized the UI.
He's added widgets, Google
Web Toolkit widgets.
You can have operations like
save, you can create new
documents, you can delete
documents, the revision
history, everything available
in the API as you'd expect.
But the really cool integration
here is that he's created a
latex or LaTeX, depending on
where you're from, compiler.
So he's using Google Docs
as a storage platform
for his application.
So I can make edits
in his application.
Just to show you this
works it does compile
the latex file here.
But if I save this and open my
Google Docs it's using Docs as
a repository for his data.
So there's his file,
it's in Google Docs.
I can come in here as the
different user, maybe
share this further
with somebody else.
They can make changes and
then his application
will pick that up.
Just a really
interesting use case.
If you take anything away from
this it's that the API's
extremely full-featured and you
can literally rewrite Docs or a
different use case for
your application.
So switching gears a little bit
I'd like to dive into the sites
API, but before we do that
here's the bit.ly URL for the
Google Wave if you do have any
questions on either API or
anything that we're talking
about or questions for Memeo as
well, please do fill those out.
So the Sites API, how many
people have ever used
Google Sites before?
Great.
Awesome.
So the API itself has full
access to a Google Site.
This is something we'd launched
back in September with
all these features.
It's very robust.
So you can create, modify,
move, delete all content in a
site and we'll take a look at
what that means in a second.
Similar to the Doc List API
it's got very similar features.
You can have sharing
permissions.
You can share further with the
site or different attachments,
arbitrary file uploads for
sites, file cabinets or
attachments to pages as well.
There's revision history and
activity streams so you can
actually audit who's doing what
and what they're changing
on a Google Site.
You can create and provision
new sites if you're
a Google Apps user.
You can copy them from an
existing site, so you don't
have to start from scratch.
There's other features,
more advanced features.
Page-level templates, there's
site-level templates,
web address mapping.
This is really cool if you want
to map a sub-domain, an actual
sub-domain you have on your
account to a Google Site and it
looks very seamless, right?
You couldn't even tell that
it was a Google Site.
Couple more things.
Gadgets.
This is the data API that
we'll talk about today.
But there's also gadgets, so
you can have an iGoogle gadget,
Sites itself is a gadget
container, so you can run a
gadget that talks to the Sites
API or does something
interesting for users.
And there's also Google Apps
Scripts, so it's JavaScript
that's run on the server side
within Google Spreadsheets
that you can use to
interact with Google Sites.
So a use case there might be
you have a Google Spreadsheet
with a bunch of user names and
a list of data and you could
with one click import that into
a Google Site and into
a Google List page.
So just as Vijay pointed out
for the data API for the Doc
List all the magic happens
at sites.google.com/feeds.
So a couple feeds worth
mentioning are the
site feed itself.
This is how you're going to
create sites, this is how
you're going to copy sites
and modify the titles
and URLs for a site.
There's the content feed.
So you're going to be creating
content, different types
of pages and sites.
It all happens at the
content feed level.
There's full CRUD for that.
The two parameters here
are domain and site name.
The domain is your Google Apps
domain or literally the string
site, if you're on just a
regular Google account.
Site name is the URL web space
name of the site as you would
see in the browser if
you opened up a site.
ACL feed, very similar
to Google Docs, right?
You're going to be sharing
at the site level for now.
Hopefully the page level
will come eventually.
Activity feed, I mentioned,
this is read-only.
The revision feed, same
as the Google Docs API.
So just a ton of stuff there.
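As a rough sketch of how those feed addresses fit together, the
helper below assembles them from the domain and site name. The
class and method names are hypothetical, and the https scheme and
the exact path segments under /feeds are assumptions; only the
sites.google.com/feeds root and the two parameters come from the
talk.

```java
import java.net.URI;

/** Hypothetical helper (not part of any Google library) that builds feed URIs. */
final class SitesFeedUris {
    // Host and /feeds root come from the talk; the https scheme and the
    // per-feed path segments below are assumptions for illustration.
    private static final String BASE = "https://sites.google.com/feeds";

    private SitesFeedUris() {}

    /** Site feed: create, copy, and list sites for a domain. */
    static URI siteFeed(String domain) {
        return URI.create(BASE + "/site/" + domain);
    }

    /** Content feed: pages, attachments, and comments within one site. */
    static URI contentFeed(String domain, String siteName) {
        return URI.create(BASE + "/content/" + domain + "/" + siteName);
    }

    /** ACL feed: sharing permissions at the site level. */
    static URI aclFeed(String domain, String siteName) {
        return URI.create(BASE + "/acl/site/" + domain + "/" + siteName);
    }

    /** Activity feed: read-only record of who changed what. */
    static URI activityFeed(String domain, String siteName) {
        return URI.create(BASE + "/activity/" + domain + "/" + siteName);
    }

    public static void main(String[] args) {
        // A regular Google account uses the literal string "site" as the domain.
        System.out.println(contentFeed("site", "my-site"));
        // A Google Apps domain uses the domain name itself.
        System.out.println(siteFeed("example.com"));
    }
}
```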
So content, what kind of
content just so we're all
on the same page, what
am I talking about here?
Well, this is a public
site that I set up prior
to the presentation.
You can feel free to visit it,
but this is an example of
a webpage in Google Sites.
So you have access to
the HTML of the page.
There's the attachments
at the page level.
You can add comments,
change comments.
The activity stream, this
is how it manifests itself
within the Google Sites UI.
And that maps to the
activity feed in the API.
But there's not just
webpages, right?
There are list pages, so
tabular and arbitrary
structured data storage.
There's announcement pages.
There's file cabinet pages
where you can upload
arbitrary files in a
section of a certain site.
In order to do that, we'll take
a look at a Java example.
Creating content
is very simple.
So the first thing, just as with
the resumable upload example,
you're going to be creating a
site service object to talk to
the sites API and we'll need
to use OAuth or AuthSub to
authorize that access because
we're actually writing data.
And the first thing you can do
is create a skeleton function,
so we'll make a create webpage
function that takes a string
for the title of the page and
we'll take an HTML string
for the body, the HTML
content of that page.
And what we're expecting
back is a webpage entry.
This is something if you've
ever used the Java library that
you're very familiar with.
So the process for all these
different types of pages
and content is the same.
We'll create a blank object.
This case, a webpage entry
object to store our data
and create in Google Site.
We'll set an XML blob, we'll
set the content and the title
on the entry, the ATOM entry.
Lastly, it's just a matter
of calling the client
insert method.
This will create that post
request and actually
upload the entry to the
content feed URI there.
And for this public site,
in my example, this is
the content feed URI.
It's fairly straightforward.
Great.
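The talk does this step with the Java client library's entry
objects and its insert method. As a library-free sketch of what
that POST body might look like on the wire, the class below
assembles an Atom entry for a webpage. The class name is
hypothetical, and the kind category term and the XHTML wrapper
are assumptions about the wire format rather than something
shown on the slides.

```java
/** Hypothetical builder for the Atom entry that an insert call would POST. */
final class WebPageEntryXml {
    // Assumed category values that mark the entry as a webpage.
    private static final String KIND_SCHEME = "http://schemas.google.com/g/2005#kind";
    private static final String WEBPAGE_TERM = "http://schemas.google.com/sites/2008#webpage";

    private WebPageEntryXml() {}

    /** Escapes the characters that are unsafe in XML text content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /**
     * Builds the entry. The title is escaped as plain text; xhtmlBody must
     * already be well-formed XHTML because it is embedded unchanged.
     */
    static String build(String title, String xhtmlBody) {
        return "<entry xmlns=\"http://www.w3.org/2005/Atom\">\n"
            + "  <category scheme=\"" + KIND_SCHEME + "\" term=\"" + WEBPAGE_TERM + "\"/>\n"
            + "  <title>" + escape(title) + "</title>\n"
            + "  <content type=\"xhtml\">\n"
            + "    <div xmlns=\"http://www.w3.org/1999/xhtml\">" + xhtmlBody + "</div>\n"
            + "  </content>\n"
            + "</entry>\n";
    }

    public static void main(String[] args) {
        System.out.print(build("Q&A page", "<p>Hello from the API</p>"));
    }
}
```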
So once you create a site, create
content, you're going to
want to be able to fetch it.
And the sites API itself
has a really robust structured
query language.
So you can imagine using
the kind parameter,
getting just webpages.
If I want to get at webpages
and list page entries,
absolutely possible.
If you know the URL of a site
you can use the path parameter
to drill down into that
actual entry itself.
The parent parameter can be
used for fetching children
of a certain page.
So you have a list of files
in a file cabinet, you can
get at all those entries.
And then just lastly,
there's metadata properties
for all of these.
So if you mark an entry or an
announcement or comment as
draft or if it's a deleted
entry you can include those
as part of your query.
So once you have content in,
it's absolutely your data.
You can get it out in
any way, shape or form.
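A minimal sketch of composing those query parameters onto a
content feed URI follows. The ContentQuery class is hypothetical.
The kind, path, and parent names come from the talk; the
include-drafts and include-deleted spellings and the
comma-separated kind list are assumptions.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical builder for content feed query strings. */
final class ContentQuery {
    // LinkedHashMap keeps parameters in the order they were added.
    private final Map<String, String> params = new LinkedHashMap<>();

    /** Restricts results to one or more kinds, e.g. "webpage", "listpage". */
    ContentQuery kinds(String... kinds) {
        params.put("kind", String.join(",", kinds));
        return this;
    }

    /** Drills down to a single entry by its path within the site. */
    ContentQuery path(String path) {
        params.put("path", path);
        return this;
    }

    /** Fetches the children of a given parent entry. */
    ContentQuery parent(String parentEntryId) {
        params.put("parent", parentEntryId);
        return this;
    }

    /** Assumed parameter name for including draft entries. */
    ContentQuery includeDrafts(boolean include) {
        params.put("include-drafts", String.valueOf(include));
        return this;
    }

    /** Assumed parameter name for including deleted entries. */
    ContentQuery includeDeleted(boolean include) {
        params.put("include-deleted", String.valueOf(include));
        return this;
    }

    /** Appends the URL-encoded query string to a feed URI without one. */
    String appendTo(String feedUri) {
        if (params.isEmpty()) {
            return feedUri;
        }
        StringJoiner query = new StringJoiner("&", feedUri + "?", "");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return query.toString();
    }

    public static void main(String[] args) {
        String feed = "https://sites.google.com/feeds/content/site/my-site";
        System.out.println(new ContentQuery().kinds("webpage", "listpage").appendTo(feed));
        System.out.println(new ContentQuery().path("/projects/plan").appendTo(feed));
    }
}
```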
So in Java this is pretty
straightforward as well.
We're still talking to
the content feed URI.
And I just changed the URI to
include the kind parameter so
we'll get at the file cabinet
pages and the list
pages on a site.
We'll call the get feed
with that URI and expect
a content feed back.
And then we'll just loop
through and print out the file
cabinet entries and we'll print
out the list page entries that
are returned to us
from that query.
Certain content types are
actually going to have
different metadata properties.
So for example, the list page
entries have the column
name and the index,
so we can just print those out
as well looping through
the column items.
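The fetch-and-loop step can be sketched without the client
library by parsing a returned Atom feed with the JDK's XML parser.
The sample feed in main is hypothetical and far smaller than a
real response, the class name is made up for illustration, and
the sketch assumes each entry carries its kind as a category
term ending in #kindname. List page column data is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical reader that lists "kind: title" for each entry in a feed. */
final class ContentFeedReader {
    private static final String ATOM_NS = "http://www.w3.org/2005/Atom";
    // Assumed scheme used on the category element that carries the kind.
    private static final String KIND_SCHEME = "http://schemas.google.com/g/2005#kind";

    private ContentFeedReader() {}

    static List<String> summarize(String feedXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(feedXml.getBytes(StandardCharsets.UTF_8)));
            List<String> lines = new ArrayList<>();
            NodeList entries = doc.getElementsByTagNameNS(ATOM_NS, "entry");
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                lines.add(kindOf(entry) + ": " + titleOf(entry));
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }
    }

    /** Returns the part of the kind term after '#', or "unknown". */
    private static String kindOf(Element entry) {
        NodeList categories = entry.getElementsByTagNameNS(ATOM_NS, "category");
        for (int i = 0; i < categories.getLength(); i++) {
            Element category = (Element) categories.item(i);
            if (KIND_SCHEME.equals(category.getAttribute("scheme"))) {
                String term = category.getAttribute("term");
                return term.substring(term.indexOf('#') + 1);
            }
        }
        return "unknown";
    }

    private static String titleOf(Element entry) {
        NodeList titles = entry.getElementsByTagNameNS(ATOM_NS, "title");
        return titles.getLength() == 0 ? "" : titles.item(0).getTextContent();
    }

    public static void main(String[] args) {
        String kind = "http://schemas.google.com/g/2005#kind";
        String sample = "<feed xmlns='http://www.w3.org/2005/Atom'>"
            + "<entry><category scheme='" + kind
            + "' term='http://schemas.google.com/sites/2008#filecabinet'/>"
            + "<title>Files</title></entry>"
            + "<entry><category scheme='" + kind
            + "' term='http://schemas.google.com/sites/2008#listpage'/>"
            + "<title>Tasks</title></entry>"
            + "</feed>";
        summarize(sample).forEach(System.out::println);
    }
}
```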
So instead of throwing more
code at you I kind of want to
just demonstrate on how easy
and quick it is to create a
Google Site from scratch.
So what we can do is actually
I'll fire up Eclipse.
And even before I do that I'll
just show you-- this is a test
domain, appstestdomain.com,
that I'm using.
Like I said, creating Sites is
only something you can do with
an apps domain at this point.
So I have three
test sites here.
Let's go ahead and create
a new one from scratch.
It'll be kind of
boring at first.
We'll change the theme.
We'll add some users.
We'll share and
collaborate the site.
We'll make it public.
We'll share it to the domain
and make it public and then
add a bunch of content
and upload some files.
So I have some helper logic
here-- and I'll scroll in
for you guys-- that
sets all this up.
You know, I renamed a bunch
of command line options.
Basically I'm setting
up an AuthSub token.
I'm setting the domain and site
name that I'm going to be
talking to via the
command line.
I have a couple methods here to
build the content feed URI and
the site feed URI to
pull the site data.
But if I scroll down here to
this section this is actually
where we can create a site.
So just as in creating a
webpage entry you're going
to want to start off
with a blank site entry.
We can set a title for this
site, a summary, a description.
The site name itself is
the URL, what we want
it to appear under.
And then we'll make that post
request using the insert method
just as we did with creating
the webpage content.
It's going to be boring at
first, so the next thing I can
do is after the server returns
this new site entry here to me
I can call the update method.
I can set a theme and call the
update method, which would make
a put request and change
that theme of the site.
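To make the insert-then-update flow concrete, here is a sketch
that builds (but does not send) the two HTTP requests with the
JDK's java.net.http API: a POST to the site feed to create the
site and a PUT to change its theme. The class, the sites:theme
element, the edit link URI, and the token value are assumptions
for illustration; the talk itself uses the client library's
insert and update methods.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical request builders mirroring insert (POST) and update (PUT). */
final class SiteRequests {
    private SiteRequests() {}

    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Assumed shape of a site entry: title, summary, and a theme name. */
    static String siteEntryXml(String title, String summary, String theme) {
        return "<entry xmlns=\"http://www.w3.org/2005/Atom\""
            + " xmlns:sites=\"http://schemas.google.com/sites/2008\">\n"
            + "  <title>" + escape(title) + "</title>\n"
            + "  <summary>" + escape(summary) + "</summary>\n"
            + "  <sites:theme>" + escape(theme) + "</sites:theme>\n"
            + "</entry>\n";
    }

    /** Common headers: Atom content type plus an AuthSub authorization header. */
    private static HttpRequest.Builder atomRequest(URI uri, String authSubToken) {
        return HttpRequest.newBuilder(uri)
            .header("Content-Type", "application/atom+xml")
            .header("Authorization", "AuthSub token=\"" + authSubToken + "\"");
    }

    /** Creating a site: POST the entry to the site feed. */
    static HttpRequest createSite(URI siteFeed, String token, String entryXml) {
        return atomRequest(siteFeed, token)
            .POST(HttpRequest.BodyPublishers.ofString(entryXml))
            .build();
    }

    /** Updating a site: PUT the modified entry to the entry's edit link. */
    static HttpRequest updateSite(URI editLink, String token, String entryXml) {
        return atomRequest(editLink, token)
            .PUT(HttpRequest.BodyPublishers.ofString(entryXml))
            .build();
    }

    public static void main(String[] args) {
        URI siteFeed = URI.create("https://sites.google.com/feeds/site/example.com");
        HttpRequest create = createSite(siteFeed, "TOKEN",
            siteEntryXml("Demo site", "Created from code", "default"));
        System.out.println(create.method() + " " + create.uri());
    }
}
```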
So let's just see what
this looks like.
So we go here and authenticate.
I'm just setting an AuthSub
token that I set up
prior to this show.
So we create a site.
If I hit return we can
change the theme, make it
a little more interesting.
So if I refresh my list of
sites this is the title
of the presentation.
So there we go.
We created a new site from
scratch, a few lines of
code and I actually
changed the theme.
So before it was just a
cookie-cutter blank theme
and we made it a little
more interesting.
Let's just go to show you what
it looks like to add certain
people and share this outside
of our domain and make
this site public.
So by default, Sites
has this set up.
The permissions associated
with this site.
So we can add everyone at
Apps Test domain to be a
collaborator in the site and we
can also make the site public.
So you in the audience can
actually view the site after
I finish that process.
So that code is very
straightforward.
Here we're creating an ACL
entry and to share it with the
domain we'll use the ACL type
scope for the domain and pass
in the Apps Test domain as our
value for that right here.
We'll set the role to writer,
which means collaborator.
We'll set a link, we'll call
the ACL link and call insert on
that, which is the site ACL
link for that particular entry.
To make the site public
is very similar.
So instead of using the domain
type we'll use the default
type meaning everyone.
And we'll add them as
readers and call insert.
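The two sharing entries described here can be sketched as plain
Atom XML. The class and method names are hypothetical, and the
gAcl namespace and element layout are assumptions about the wire
format; the role and scope values (writer with a domain scope,
reader with the default scope) follow the talk.

```java
/** Hypothetical builder for the ACL entries posted to a site's ACL feed. */
final class SiteAclEntries {
    private SiteAclEntries() {}

    /** Builds one access rule; pass null as scopeValue for the default scope. */
    static String entry(String role, String scopeType, String scopeValue) {
        String scope = scopeValue == null
            ? "<gAcl:scope type=\"" + scopeType + "\"/>"
            : "<gAcl:scope type=\"" + scopeType + "\" value=\"" + scopeValue + "\"/>";
        return "<entry xmlns=\"http://www.w3.org/2005/Atom\""
            + " xmlns:gAcl=\"http://schemas.google.com/acl/2007\">\n"
            + "  <category scheme=\"http://schemas.google.com/g/2005#kind\""
            + " term=\"http://schemas.google.com/acl/2007#accessRule\"/>\n"
            + "  <gAcl:role value=\"" + role + "\"/>\n"
            + "  " + scope + "\n"
            + "</entry>\n";
    }

    /** Everyone in the domain becomes a collaborator (role "writer"). */
    static String shareWithDomain(String domain) {
        return entry("writer", "domain", domain);
    }

    /** Everyone, signed in or not, can view the site (role "reader"). */
    static String makePublic() {
        return entry("reader", "default", null);
    }

    public static void main(String[] args) {
        System.out.print(shareWithDomain("example.com"));
        System.out.print(makePublic());
    }
}
```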
So we'll press enter a
couple times and the site
should now be public.
So if I refresh this indeed
those things have changed.
So if you actually want to
visit this site it's
sites.google.com
/a/appstestdomain.com Google
I/O 2010 in the URL here.
So that's great.
Well, I don't want to
work on this myself.
I've added collaborators.
Let's actually add
some content.
Right now I had the
home page by default.
So to do that I can use
the same processes you
saw on the slides.
I'll create webpage entry.
I'll create a comment
on the page.
And an important note here for
the comment is that the webpage
is going to be the parent
element for that comment.
So we need you to
include the link.
You see the Sites
link rel parent.
Let me call that out.
Here, so this is setting an
ATOM link telling yes, I want
this webpage to be the parent
page of this comment
and that'll make it a
child of that page.
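As an illustration of that parent link, the sketch below builds
a comment entry whose Atom link points at the parent page. The
class name is hypothetical, and the rel value, the comment kind
term, and the use of the parent's self link as the href are
assumptions about the wire format, not details shown on screen.

```java
/** Hypothetical builder for a comment entry attached to a parent page. */
final class CommentEntryXml {
    // Assumed rel value that marks a link as pointing to the parent entry.
    private static final String PARENT_REL = "http://schemas.google.com/sites/2008#parent";
    private static final String COMMENT_TERM = "http://schemas.google.com/sites/2008#comment";

    private CommentEntryXml() {}

    /** Escapes characters that are unsafe inside a double-quoted attribute. */
    private static String escapeAttr(String value) {
        return value.replace("&", "&amp;").replace("\"", "&quot;").replace("<", "&lt;");
    }

    /** The Atom link that makes the new entry a child of the given page. */
    static String parentLink(String parentSelfLink) {
        return "<link rel=\"" + PARENT_REL + "\" type=\"application/atom+xml\" href=\""
            + escapeAttr(parentSelfLink) + "\"/>";
    }

    /** xhtmlBody must already be well-formed XHTML; it is embedded unchanged. */
    static String build(String parentSelfLink, String xhtmlBody) {
        return "<entry xmlns=\"http://www.w3.org/2005/Atom\">\n"
            + "  <category scheme=\"http://schemas.google.com/g/2005#kind\" term=\""
            + COMMENT_TERM + "\"/>\n"
            + "  " + parentLink(parentSelfLink) + "\n"
            + "  <content type=\"xhtml\">\n"
            + "    <div xmlns=\"http://www.w3.org/1999/xhtml\">" + xhtmlBody + "</div>\n"
            + "  </content>\n"
            + "</entry>\n";
    }

    public static void main(String[] args) {
        // Hypothetical self link of the webpage entry created earlier.
        String parent = "https://sites.google.com/feeds/content/site/my-site/12345";
        System.out.print(build(parent, "Nice page!"));
    }
}
```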
Again, we'll call insert
and then we'll create a
file cabinet as well.
The last thing I can
do is-- jump ahead.
We'll create a
couple attachments.
I'll actually upload this
presentation, PDF version of
this presentation and all the
code snippets you saw in the
presentation to this and I did
share it publicly so you should
be able to access that.
I'll just read a
directory of files and create
a couple attachment entries.
Setting the media source to
that file and media type.
Set a summary and
description and the parent.
So we'll do that really quick.
Comments posted.
File cabinets created.
Moving the page to be a subpage
of the webpage I'll upload
a bunch of attachments.
So you can see I have a new
webpage here, I set the
title and the content.
The blink tag doesn't
work in sites.
Oh, well.
Nobody likes the
blink tag anyway.
I posted a comment, created
a subpage with the
file cabinet page.
Unfortunately it doesn't look
like my attachments are being
uploaded, but I'll definitely
get those up there because they
do have the code snippets
for all the presentation.
There they go.
One by one.
So there's the example.
I'm just uploading a single
file to a file cabinet.
So if we take anything away
from that it's that it's very,
very easy to work with the
Sites Data API and you really
have access to all the content
within a Google Site.
So I think now I'm going to
throw it back to Vijay.
He's going to wrap things
up for us and we'll take
questions after that.
VIJAY BANGARU:
Great, thanks Eric.
So I asked this sort of at the
beginning of the talk, what
can you build with them?
And so we're going to revisit
that and hopefully you've seen
with the APIs that you can
build really rich, compelling
experiences for your users that
will hopefully delight them
and make you guys some money.
There's some more things
we want to do in the near
future to make this better.
We want to make it a richer
platform for both the
developer and the user.
So the first thing we're
looking into is better
integration with the
Apps Marketplace.
Of course, today you guys
can put your apps in the
Marketplace and they're there.
Domain admins can find them and
install them and all of that,
but we want to go
a bit further.
We want it so the app you
write, its file content,
whatever data it stores, that
state is visible in the
user's Doc List, right?
And then from there they
can access it and give
you more hooks like that.
And so a screenshot example of
one thing we're thinking of is
when a user goes into his Doc
List and sees your type, in the
context menu there's a new open
with feature and he's going to
see all the applications
installed to his domain that
are eligible to edit
that MIME type.
Whether it's a fax service,
your own editor or a
converter service.
And then if he goes to more, he
can go to the Apps Marketplace
and see what else is available.
So we're hoping with Apps
Marketplace and some UI
integrations like this there's
going to be some really nice
things for you guys
moving forward.
And on the topic of moving
forward, you know,
where is this going?
We're hoping that with your
help we can build a platform
where users' documents and
files, they just never
leave the cloud.
End to end that's where they
are because you can create them
there, you can edit them, you
can collaborate on them, you
can store them and share them.
It's all in the cloud and then
you add in the access from
anywhere and you're kind of set.
And if we build this platform
correctly anyone can
play in the cloud.
So all the new web apps you
guys write, all the new client
apps that you guys write,
those are all available.
And it's not just new stuff.
It's the legacy stuff that we
can together bring all the
legacy stuff into the present.
So if you look at something
like Memeo Connect that's a
great way to turn Windows XP
into knowing about the cloud.
It brings stuff that doesn't
even know about it, doesn't
even get the cloud into
the future where we are.
And same at the
application level.
If you look at an application
like DocVerse, which is a
company we recently acquired,
it's an application that turns
Office 2003 into a
web compliant app.
So that those files as you're
editing them in Office 2003
they're stored up in Google
Docs and they're shared
in Google Docs and is
available to everyone.
So that's the platform we want
to build and a key component
of this platform is sharing.
We know sharing is key and
people want good collaboration.
That's something
we're committed to.
And then in general, Google
Docs and Sites, these are web
apps, we're innovating as fast
as we can and hopefully you saw
that with just the API
progression in one year.
We're committed to this.
We want this to work and
with your help we can make
some great things here.
And specifically with Google
Sites we do have a lot
more stuff coming here also.
In terms of work spaces.
Two things people have asked
for and it makes lots of sense
and we're going after is
a publishing workflow.
So if you want workflow in your
work space that's something
we're going to support.
We also want deeper integration
with the core Doc product
because there's always
been this weird division.
We want to get away from
that so it just makes
sense as one core suite.
Additionally, we're going
to do more themes,
more extensibility.
There's going to be some great
stuff coming in this space.
So with that I want to move
to your guys' questions.
So we have two
microphones here.
They're like within 10
feet of each other.
I don't know why we'd
need two, but we got them.
We also have the questions
up on the Google Wave.
So go up there and take a
look at the questions.
Vote up the ones that
you think are good and
we'll address them.
And I want Eric and Matt to
come up and join me and
we're also going to
have Scott Johnston.
Our special guest.
The PM of Google Sites.
So he'll be able to
answer stuff as well.
Before we go to questions
though a quick
mention of resources.
Everything you'd want,
code.google.com.
I think you guys know the links
and of course the samples
are at the samples page.
Eric's going to pull up the
wave so we can take questions
off of there and we'll start
with the first question
from the mic.
AUDIENCE: Hi, my company in the
Netherlands uses a lot of XML
and I was wondering whether
we could use Google
Docs to upload XML?
And that Google Docs
validates it against an XSD.
And we would really like to
have the option that when a
user clicks their XML document
that we uploaded into Google
Docs, to be able to
automatically open
it with our app.
So what about the XML?
Would it be available in Google
Docs or is it not something
that Google wants?
VIJAY BANGARU: Yes so the XML
file today of course you can
upload and store with those.
Some of the integration points
that we're trying to work on
with Apps marketplace and
third-party apps would
support that exactly.
Where a user could register
your app as the app that opens
XML types in Google Docs.
And every time they click on
that your app is launched.
AUDIENCE: But is it already
possible to upload XML and that
Google Docs recognizes it as
XML or is it just like
the LaTeX example?
VIJAY BANGARU: So you can go
ahead and upload the file
and it's going to show up
there as filename.xml.
For the files today we don't do
any type specific activities,
so the user will see
.xml in their UI.
We don't do any special
things yet for that type.
AUDIENCE: Are you planning to?
VIJAY BANGARU: We want to and
we're hoping this third-party
integration is going to help
us do a lot of that stuff.
AUDIENCE: OK, thanks.
VIJAY BANGARU: Let's take
the next one from the mic.
AUDIENCE: I work with the
spreadsheet feeds and I
have everything working.
I can pull in the content of
cells from a spreadsheet.
However, it was mentioned that
different Docs users use the
different tools differently.
The formatting within
Spreadsheets is something that
users use a lot to communicate
information and right now I'm
not able to read the formatting
of each cell off of the API, is
there any plan to introduce
that into the Spreadsheets API?
VIJAY BANGARU: I'm sorry.
I couldn't hear
the second half.
A plan to do what specifically?
AUDIENCE: I'm unable to
read the formatting.
For example, the cell shading
or underlining from--
VIJAY BANGARU: I see.
So that's a very reasonable
feature request.
And I personally don't
know what's in the pipe.
But it seems like
something that should be.
ERIC BIDELMAN: I think at this
point the Spreadsheets API
itself is more about getting at
the data and querying
for your data.
It's the same thing
with the Doc List API.
So it's more about file
management, not so much actually
getting in and editing a Google
Doc, but absolutely a great
feature request that
we should consider.
VIJAY BANGARU: OK, so the first
question off the [? dory, ?]
are there any plans to expose
Google Drawing through
the APIs right now?
There isn't.
I think basically
the same answer.
It's something we definitely
need to do and we
hope to address--
ERIC BIDELMAN: Yeah, I would
just say that as features
become available in Google Docs
we'd like to have them in the
API at one point or another.
A good example is the OCR
feature was something available
in the API, but not in Docs.
You know, folder sharing
came to the API before
it was in Docs.
So there's this
back and forth
that happens.
But yeah, we do like to keep
those up to speed and as users
prioritize that in our issue
tracker we'll take a look.
AUDIENCE: Question related to
the OCR piece of the Docs, any
plans to include something like
forms processing that only
certain parts of the
docs can be scanned?
Or the images that are being
uploaded rather than the
whole image being OCR?
VIJAY BANGARU:
That's a good idea.
We will definitely get
that feedback back
to the forums team.
Let's take the next one
from the [? dory arc. ?]
Eric, this looks like
a question for you.
ERIC BIDELMAN: So are there
any plans to provide ODBC
connectivity for Docs and Site
scripts to have direct access--
SCOTT JOHNSTON: There's an
app scripts session this
afternoon that I definitely
recommend coming to.
That's possible now and we'll
be spending a lot of time this
year in beefing up app scripts
and Docs and Sites integration.
So the answer's yes.
More detail later
this afternoon.
ERIC BIDELMAN: He did link to a
post that we just made that
announces JDBC support
for app scripts.
VIJAY BANGARU: So this
afternoon more detail.
ERIC BIDELMAN: 3 o'clock.
VIJAY BANGARU: So thank
you guys for coming and
especially thank you
for waiting into lunch.
ERIC BIDELMAN: Yeah,
[UNINTELLIGIBLE PHRASE]
