[MUSIC PLAYING]
PATRIK WESTIN: Hi, and welcome
to this breakout session
on the Cloud Identity
and Access Management.
And today I'm going
to focus on how
to manage your cloud policies
efficiently and at scale.
So in this session, I'm breaking
it out into four pieces.
We're going to start with the
core concept of Cloud IAM, what
it means and what we do.
Secondly, we're going
to have a deep dive
into a bunch of best practices.
How do you actually
manage at scale?
Third, we have a customer story
from Credit Karma, with Naveen coming up
on stage to explain how
they manage their resources.
And last, we have
a super secret,
not yet launched feature that
you're getting a sneak peek of.
So take a step back.
And what is Cloud IAM?
What do we do?
We answer the
question who can take
what action on which resource.
There are other policies
in cloud, which--
people sometimes get
confused and think, oh,
is this identity?
No, it's not.
So org policies,
specifically, are which
configurations you can have.
Another example is quotas,
which is how much money can I
spend, how much have
I spent, where do we--
who pays the bill
for this resource.
So the first, we're
doing a deep dive
into the core
concept of Cloud IAM.
Cloud IAM provides a
fine-grained access control,
and we do that in
multiple dimensions.
And some of the things I'm going
to talk to you about today
are appropriate if you're in a
garage, you're a startup,
or you're just a happy student.
Other things are focused
on large-scale deployments.
So we start with the
core, core concept.
Permissions-- everything in IAM
is based on permissions.
They have a special structure.
They start with a service name
followed by the resource type,
and then the verb or the action
you can take on this resource.
Let's take an example,
storage.buckets.create.
It's obvious from the name
that this controls who
can create buckets.
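As a minimal sketch of that naming structure, assuming the three-part service, resource type, verb form described here (the `Permission` type and `parse_permission` helper are hypothetical, not part of any Cloud IAM library):

```python
from typing import NamedTuple


class Permission(NamedTuple):
    service: str   # e.g. "storage"
    resource: str  # e.g. "buckets"
    verb: str      # e.g. "create"


def parse_permission(name: str) -> Permission:
    """Split a permission string of the form service.resource.verb."""
    parts = name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected service.resource.verb, got {name!r}")
    return Permission(*parts)


print(parse_permission("storage.buckets.create"))
```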
There are two other actions
that are very common.
Those are get IAM
policy and set IAM policy.
They exist for
every resource type.
And I know I'm biased,
but in my mind,
these are the most
important permissions
because these are
the permissions that
allow you to grant more
permissions to other people
or to yourself.
So we'll get back to
that in later slides.
But remember,
these are powerful.
Of course, creating
or updating a bucket
could also be critical if
you have something secret
in that bucket.
However, the set IAM
policy, because it
allows you to
grant new policies,
is an order of magnitude
more powerful.
So let's take it
from the beginning.
An API is called.
Say we want to call compute
engine to get an instance.
You call it.
The back end translates it
into a check against IAM to see,
do you have this permission?
In this specific
case, we're going
to check if you have the
compute.instances.get permission
on instance number 1.
If you do, you will get
the data you ask for.
If you don't, you will
get an access denied.
So that is a simple API.
Now let's make it more fun.
Other APIs take a lot
of different variables.
In this case, one API call comes in
to Compute Engine, and
it's going to translate
into four different permission
checks on different resource
types.
And for this API call
to be successful,
you need to have all four
permissions on those resource
types, or this API
call will fail.
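The all-or-nothing rule can be sketched like this. The four checks listed are illustrative assumptions about what an instance-creating call might need, not the documented list for any particular Compute Engine method, and `is_allowed` is a hypothetical helper.

```python
def is_allowed(granted, required):
    """An API call succeeds only if every required (permission, resource) check passes."""
    return all(check in granted for check in required)


# Hypothetical set of checks triggered by a single API call.
required = [
    ("compute.instances.create", "projects/demo"),
    ("compute.disks.create", "projects/demo"),
    ("compute.subnetworks.use", "subnetworks/dev"),
    ("compute.images.useReadOnly", "images/base"),
]

all_four = set(required)
only_three = set(required[:3])

print(is_allowed(all_four, required))    # every check passes
print(is_allowed(only_three, required))  # one missing permission fails the call
```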
So now some of you in the
audience are thinking, like, hey,
permissions?
I don't deal with permissions.
I've used IAM before.
I only care about the roles.
And to some degree
you're correct.
Roles are the abstraction we
use to group permissions
together.
So you can think about a role
as performing certain tasks
or certain workflows: the
permissions belong together
because they allow a
user to accomplish
a set of tasks or workflows.
So we have actually
three different types
of roles in Cloud IAM.
We have primitive roles,
which we have there.
They're very easy to use.
We actually don't recommend
that you use these in production.
These are very, very powerful,
and they make it very easy
to get started.
Again, if you're a student
or you're just tinkering,
feel free to use these.
If you're working for a large
bank or something like that,
please don't use these.
Instead, you should
use predefined roles.
They are narrower.
They normally group things
per service.
And there are several
hundred
of these predefined
roles that Google
has defined where we think,
oh, these belong together.
These can accomplish
certain tasks.
There are a few patterns there:
basically, every service
has an admin and a viewer
predefined role.
There are many
specialized roles as well,
but it could also be that your
specific use case calls
for something more specialized.
We call that custom roles.
Custom roles allow you to
combine any permissions
together and create anything
you want, and say, for me--
for my use case--
I will have this workflow.
And I need my users to have
this set of permissions
because, to me,
they go together.
Maybe they don't go together
for all of you,
but for your specific use
case that makes sense.
Custom roles can be
used for anything, but
there are three very
common use cases.
The first is to combine roles. You
look at the predefined roles
that
different teams have
worked hard to create and say,
oh, I like these two roles
and I'm always granting
them at the same time.
Let's combine them.
Another one is, like,
hey, I like this role,
but it has this powerful
permission I don't like.
Let's remove it.
Or this role has everything I
need but this one permission,
and you can add it.
So these are the most
common use cases.
However, you can combine
anything you want.
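If a role is modeled as a set of permissions, the three common patterns are plain set operations. The permission sets below are made up for illustration and are not the actual contents of any predefined role.

```python
# Hypothetical stand-ins for two predefined roles.
role_a = frozenset({"storage.buckets.get", "storage.buckets.list"})
role_b = frozenset({
    "storage.objects.get",
    "storage.objects.list",
    "storage.buckets.setIamPolicy",
})

# 1. Combine two roles that are always granted together.
combined = role_a | role_b

# 2. Take a role but drop the powerful permission you don't like.
trimmed = role_b - {"storage.buckets.setIamPolicy"}

# 3. Take a role and add the one permission it is missing.
extended = role_a | {"storage.buckets.create"}

print(sorted(trimmed))
```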
So now we talked about the
permissions and the roles.
And now we need to
bind these to users.
We call these bindings.
Obvious name, right?
And you can bind it
to an individual user
or to a group of users.
That group can also be a group
of a group of a group, nested,
and everyone in the whole
chain of those groups
are going to be granted.
The third type is
a service account,
and we're going to come
back to that in later slides.
A service account is
basically the identity
of a service or a robot.
And all of these bindings can be
combined so you, in one policy,
have multiple bindings with
multiple different roles
for multiple different people.
And then, when you combine all
of those, you have a policy.
So for those of you who like JSON--
this is my favorite example.
So you can actually write this
and use the command line or the APIs.
And in this specific example,
we grant the compute instance
admin role to two
members: a user
that is Mike and a group
that is the admins.
We say, this is my
powerful role and I
am very selective on
who I grant this to.
I have another role that is
less powerful, compute viewer.
I can grant it to
everyone in my domain
or I can grant it to a
specific service account.
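Written out, a policy like the one described here might look roughly as follows. This is a sketch: the role IDs are assumed to be `roles/compute.instanceAdmin` and `roles/compute.viewer`, and the email addresses, domain, and project name are placeholders rather than values from the slide.

```python
import json

policy = {
    "bindings": [
        {
            # The powerful role goes to a short, deliberate list of members.
            "role": "roles/compute.instanceAdmin",
            "members": [
                "user:mike@example.com",
                "group:admins@example.com",
            ],
        },
        {
            # The less powerful role can be granted much more broadly.
            "role": "roles/compute.viewer",
            "members": [
                "domain:example.com",
                "serviceAccount:my-service@my-project.iam.gserviceaccount.com",
            ],
        },
    ]
}

print(json.dumps(policy, indent=2))
```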
So we have now talked
about our permissions
and how we group them
into roles, how
we bind them to
users, our members,
and now we talk about hierarchy.
So all of these can be
granted in hierarchy.
And this is core to how
Google does identity and access
management.
You can grant these
policies, literally,
anywhere in your
resource hierarchy.
If you start granting something
at the org level, super high
up in your hierarchy, it
will be very powerful.
The higher up, the more
powerful, because the permissions
are inherited down.
So everyone in your org, every
resource that is ever created
would be affected
by this policy.
Versus if you go to
the other extreme,
you put a policy on the
individual resource.
Now that specific role is
granted only on that resource.
That is also not the best
practice because odds
are you might be
successful one day
and end up having
millions and millions
and millions of these
resources, and now you
have millions and millions
of policies to manage.
So the best practice
is to think about how
you in your organization work.
What makes sense?
And I'm going to come
back in the best practices
and do a couple of deep
dives into exactly why we
use the hierarchy and how we
recommend reasoning about it.
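Inheritance down the hierarchy can be pictured as a walk from a resource up to the organization, collecting every role granted along the way. The following is a toy model under that assumption; the node names, members, and the `effective_roles` helper are hypothetical.

```python
def effective_roles(member, resource, parents, policies):
    """Union of the roles granted to member on resource and on every ancestor."""
    roles = set()
    node = resource
    while node is not None:
        roles |= policies.get(node, {}).get(member, set())
        node = parents.get(node)  # None once we step above the org node
    return roles


# Child -> parent links for a small hierarchy.
parents = {"org": None, "folder": "org", "project": "folder", "bucket": "project"}

policies = {
    "org": {"group:admins@example.com": {"roles/viewer"}},
    "bucket": {"group:admins@example.com": {"roles/storage.objectViewer"}},
}

print(effective_roles("group:admins@example.com", "bucket", parents, policies))
print(effective_roles("group:admins@example.com", "project", parents, policies))
```

A grant at the org node shows up on every resource below it, while the grant on the bucket stays on the bucket.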
I told you I was going to
come back to service accounts.
So think about a service account
as the identity of one service.
And it is best practice
that each service account
takes the identity of
a specific service you have.
This service account can
manage and be the identity
of multiple virtual machines.
You can have multiple
service accounts
inside the same
project and these
can be granted any other role so
they can act on other projects
and on other resources.
So in this example here, you
see that this service account
acts on buckets
in other projects
and that's totally fine.
And you can say one service
account accesses one bucket
and my other service account
accesses my other bucket.
And this is where people
sometimes get slightly
confused, I have to admit.
And that is that these
service accounts
are both an
identity by themselves
and a resource, which
means that you
can grant users access to the
service account itself.
And the service
account is also an identity.
So there's one powerful
permission, especially,
that I'm going to
come back to, as it
relates to service account,
that is very, very powerful.
So just save that for now.
Now we're going into our second
part, the best practices.
So say that you're
starting up and you just
had your first data
scientist join the company.
You don't have
that many resources
and you want to grant her
two different roles for two
different resources.
And you might be
inclined to say,
oh, yes, let's put Alice in the
binding for these two policies.
That's not best practice.
If you instead,
from day one, create
a group called Data
Scientist in this example
and you add Alice to that group,
once your usage in Cloud grows,
it might be that you add many,
many more resources and many
more policies.
And then, as you
are very successful,
you hire another data scientist.
Now instead of finding all of
the places where I've granted
Alice permissions, you
just add your new hire
into that Data Scientist group.
And you know that now
Alice and the new hire
will have the same permission
to the same resources.
This allows you to
manage at scale.
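A small in-memory model shows why this scales: policies point at the group, so onboarding touches only the group. All names here are hypothetical, and real group membership would live in a directory service rather than a dict.

```python
# Policies bind roles to a group, never to a person.
groups = {"data-scientists": {"alice"}}
policies = {
    "dataset-1": {"roles/bigquery.dataViewer": {"data-scientists"}},
    "bucket-1": {"roles/storage.objectViewer": {"data-scientists"}},
}


def has_role(user, resource, role):
    """True if any group bound to the role on this resource contains the user."""
    bound_groups = policies.get(resource, {}).get(role, set())
    return any(user in groups.get(g, set()) for g in bound_groups)


# Onboarding a new hire is one group change; no policy is edited.
groups["data-scientists"].add("bob")

print(has_role("bob", "dataset-1", "roles/bigquery.dataViewer"))
print(has_role("bob", "bucket-1", "roles/storage.objectViewer"))
```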
In the beginning, I talked
about the super powerful set IAM
policy permission.
It is so
powerful because you
can grant additional
policies, additional
permissions to other
people and to yourself--
remember that.
This is very powerful.
It doesn't just allow you
to grant other people access.
You can escalate your
own privileges, which
makes this very, very powerful.
So think about this hierarchy:
who do you grant this permission to,
and how high in the hierarchy?
Remember, the higher in the
hierarchy, the more resources
they can grant access to.
The other one, as I
mentioned, is the act
as service account
permission, which
means that the user can
act as the service account and
do everything it can do.
So the best practice here is
to grant this on the service
account itself, and
not higher up in the hierarchy,
even though you will be
very tempted because it's
convenient at this
point in time.
Because later on
you're
going to create more
service accounts.
And now that grant
automatically allows those
users to act as all of those
new service accounts that
didn't use to
exist but now might
have access to sensitive data.
So this is the one case
where you really, really want
to put your policy on
the resource itself.
In this case, the
resource is an identity,
which is the service account.
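The difference between the two placements can be sketched with a toy check. This only models the inheritance effect described here; the account names and the `can_act_as` helper are hypothetical.

```python
def can_act_as(user, service_account, project_grants, account_grants):
    """True if the grant sits on the project (inherited) or on the account itself."""
    on_account = user in account_grants.get(service_account, set())
    return user in project_grants or on_account


# Option 1: grant high in the hierarchy. A service account created
# later, with access to sensitive data, is covered automatically.
print(can_act_as("dev@example.com", "new-payroll-sa", {"dev@example.com"}, {}))

# Option 2: grant on one service account only. New accounts stay out of reach.
per_account = {"build-sa": {"dev@example.com"}}
print(can_act_as("dev@example.com", "build-sa", set(), per_account))
print(can_act_as("dev@example.com", "new-payroll-sa", set(), per_account))
```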
And we have seen some of these.
You know, it's very convenient.
There's an API.
It has a free field,
say, a description.
You can put anything in there.
Please don't.
If you have secrets,
treat them as secrets.
Put them in a storage bucket
and set the privileges
and the permissions that are
appropriate for that secret
bucket, and don't
try to encode them as metadata
somewhere else.
And this leads us
to leave a trace.
Say that you just created your
super secret bucket with all
the company secrets, and
someone has access to it
and you don't know who.
So by default all changes,
admin changes, to the cloud
are audit logged.
So if you do a set IAM policy,
it's going to be logged
and you can see
who did that, who
granted the
permission, who changed
the policy, that is logged.
But we don't know
by default that you
have a super secret bucket
and you care deeply about it.
So in this IAM policy,
you can also turn
on audit logging.
And you can log
when someone changes,
in this case, this bucket
or if someone accesses it,
like who reads this bucket.
Because, remember, it's
my super secret bucket.
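In policy form, turning on data access logging is an extra section next to the bindings. The sketch below assumes the `auditConfigs` shape used in IAM policies, with Cloud Storage as the service; which resource the policy is attached to is left open here.

```python
import json

audit_policy = {
    "auditConfigs": [
        {
            "service": "storage.googleapis.com",
            "auditLogConfigs": [
                {"logType": "DATA_READ"},   # who reads the data
                {"logType": "DATA_WRITE"},  # who changes the data
            ],
        }
    ]
}

print(json.dumps(audit_policy, indent=2))
```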
If you want to retain these logs
for a longer period of time,
you can export them to Cloud
Storage, or to BigQuery
if you want to query them.
There are also centralized logging
systems such as Stackdriver
that can trigger events
and have things happen
when certain logs come in.
And I know, I'm biased.
I hope this is big
enough for you to see.
Every week, IAM publishes
new permissions and roles.
And regardless of whether you use the
predefined roles or the custom
roles, a best practice
is to visit this page
and see what has
changed for two reasons.
If you use the
predefined roles, we
will tell you that
this coming week we're
going to add permission to a
role that you might be using.
And you're like, oh, that's good
to know because, effectively,
this new permission is
being added into my system.
The other case is that you
use custom roles.
And there's a new API or
a new argument to an API
that you really want to
use, but your custom roles
don't have that permission
because we just made it.
So you have to manually
read through it and say, ah,
I need to add this
permission to my custom roles
so that my developers or
my robots or my services
can call this new
API or use these new arguments
for this API.
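Checking what changed in a role from one week to the next is a set difference. The snapshots below are invented for illustration, and `role_diff` is a hypothetical helper; the published change log remains the source of truth.

```python
def role_diff(before, after):
    """Report which permissions were added to or removed from a role."""
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
    }


# Hypothetical snapshots of one role, taken a week apart.
last_week = {"storage.buckets.get", "storage.buckets.list"}
this_week = {"storage.buckets.get", "storage.buckets.list", "storage.buckets.update"}

print(role_diff(last_week, this_week))
```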
Bring your organization
structure to the cloud,
and this is just practical.
It's you setting
up your company.
You have, hopefully,
thought deeply about who
reports to whom in reporting
chains and, if something
breaks, who you go to.
If your cloud resource
hierarchy doesn't match that,
how do you now find the person
responsible to grant you
more privilege?
Something is broken, might
be broken in your production
system, and you need,
temporarily, more access.
Who do you ask?
If this matches how your
company is set up,
and that person is your
manager, that's a great thing.
I promised some deep dives into
how we leverage the hierarchy.
And this is the first
of the three examples.
So the setup is that
we have developers
and IT admins, and the
IT admins have a project
and, in this project, they
have the shared images.
Their role and their
work, actually,
is to manage these images,
make sure they're up to date, and that
they are patched whenever
there's a security
incident or someone publishes
a new image for whatever
operating system you're using.
So you want to grant the IT
admins the instance admin role
so they can change and update
all images under this project.
Notice this is on
the project itself.
So as images come
and go, you don't
have to change your policies.
It's an important detail there.
And on the project, you also
grant the developers the image user
role so that all your
developers can use these images
in their projects.
And then the developers
themselves, in their projects,
will have instance
admin, and they can then
use the images that the IT
admins have created and managed
for them.
Let's take another example.
We have the same setup--
developers, IT admins.
In this case, however,
it's different
because we have
different subnetworks.
And some subnetworks
are made for developers
to play and experiment
and develop.
We have others where you're running
your sensitive workloads
in production.
So instead of granting
the developers access
on the project level,
saying, use any subnetwork
you want, you will
grant them the policy
on the specific subnetwork
that you
allow them to develop on.
For the developers, no change.
They will be granted the
same instance admin
on their projects.
And whenever they
create new projects,
those can use this subnetwork
too in this example.
Let's make the hierarchy
a little bit deeper.
We have a finance team.
First of all, we have a
company that is larger.
We start to have divisions.
Someone in this division
is responsible for managing
your budgets.
How much can we spend?
You have subteams.
The different teams can
see how much they spend,
but not how much the
other teams spend,
and they cannot change
their own allowance.
They cannot set
their own budgets.
You want the finance
manager to do that.
So you will grant the finance
manager for this division
the billing account
admin role, as high up
in the hierarchy as you can, in
this example, in the division.
And again, as high
up as you can,
you want to grant the
billing account viewer role
to the teams that are
affected, so team A and team B.
You put those at what
we call the folder level.
This allows any project
that team B creates
in the folder to automatically
have the correct policies,
instead of having to
remember to go back,
for every project that team
B creates in the future,
and add the permission
or the policy to see this.
You just put it one
level up and it's there.
And now we are growing
the company even larger.
And it's like, oh, I'm
doing these workflows.
And you will find that you
have very common patterns
where, in this example,
you create the project.
You might create an App Engine app.
You want to enable billing,
you want to enable some APIs,
you want to create
service accounts,
and you want to
set some policies.
And it's like, oh,
that's a lot of work.
And you might even want
to create custom roles
that you add to these
policies, all programmatically.
Of course, you can go out
and write your own scripts
and do exactly the
things you want.
Or you can use the
Deployment Manager.
It's a declarative tool
that allows you to focus
on what's important to you.
What's the properties
of the virtual machine
you want to configure?
You can say, I want
my virtual machine
to have this much memory.
I want this type of CPU.
And you actually no longer have
to care about how this happens.
You go to Deployment Manager
and say, this is what I want.
And you're setting
up these flows.
And then you're done.
It's a fully hosted service
and it abstracts that away--
it makes sense when you're
having the same workflow,
creating a lot of
resources at scale.
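Deployment Manager templates can themselves be written in Python as a function that returns the desired resources. The sketch below assumes the `GenerateConfig(context)` template convention and the `compute.v1.instance` type. It is trimmed for readability (a real instance also needs disks and network interfaces), and the `SimpleNamespace` object is only a stand-in for the context that Deployment Manager would pass in.

```python
from types import SimpleNamespace


def GenerateConfig(context):
    """Declare one VM; how it gets created is left to Deployment Manager."""
    props = context.properties
    machine_type = "zones/{}/machineTypes/{}".format(
        props["zone"], props["machineType"]
    )
    return {
        "resources": [
            {
                "name": context.env["deployment"] + "-vm",
                "type": "compute.v1.instance",
                "properties": {
                    "zone": props["zone"],
                    "machineType": machine_type,
                },
            }
        ]
    }


# Local stand-in for the context object, with placeholder values.
fake_context = SimpleNamespace(
    env={"deployment": "demo"},
    properties={"zone": "us-central1-f", "machineType": "n1-standard-1"},
)
print(GenerateConfig(fake_context))
```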
And now we have storytime.
So Naveen from Credit
Karma is coming up
to tell us about how they
leverage IAM and the policies.
NAVEEN GUDDATI: Thanks, Patrik.
Hello, everyone.
My name is Naveen Guddati, and I
work for Credit Karma
as a senior security engineer.
So today I'm here to
share the Credit Karma
story of implementing identity
access management in the Google
Cloud Platform.
So before we jump
into the details,
let me just give you
a brief background
on what we do at Credit Karma.
So we actually offer a
lot of financial services
to our members like
credit cards, auto loans,
personal loans,
recommendations, et cetera.
So our core mission
at Credit Karma
is to help our members make
financial progress by utilizing
all these services.
We have about 80 million users
who subscribe to Credit Karma
and use these services.
So it is an absolute
requirement for us
to host these services
on a platform that
is highly scalable,
efficient, and secure.
So when we started thinking of
migrating to the Google Cloud
Platform, we started off
with some initial goals
that will help secure our
infrastructure in a better way.
We wanted to restrict
access to the GCP console
so that we would allow
only a specific set of
teams or individuals to access
this GCP environment.
We also wanted to enable
this [INAUDIBLE]
single sign-on process
so that our employees can
use their own
enterprise credentials
and don't need to use a
different ID or anything.
And at the same time,
we wanted to enforce
multi-factor
authentication checks
and perform the authorization by
implementing GCP role-based
access controls.
And we wanted to reduce the
burden on our administrators
by not doing things manually
on the GCP platform
and just automating them, and to
minimize the human activity
in the GCP environment by
leveraging the GCP automation
model.
So how do we do this?
The first step of the
process is authentication.
We know that GCP is powered
by the cloud identities that
can be created via G Suite.
So what we did was we
procured G Suite
and we hosted all of
our identities in there.
G Suite also supports
single sign-on capabilities
that can be enabled via SAML.
So we took advantage of that.
Having this SSO enabled,
we were able to delegate all of
our authentication activities
to our own [INAUDIBLE]
IDP platform.
And that also gave
us the flexibility
to enforce additional
security restrictions,
like enforcing MFA
and also making sure
that users are coming from
the appropriate networks
while accessing
these GCP resources.
Additionally, as a
best security practice,
we wanted to enforce the
limits on the session
length of the users.
And G Suite has the
flexibility for that.
So if a user exceeds a
particular configured limit,
they will be forced to
reauthenticate and identify
themselves.
So having all of this
in place,
we were able to secure
the path for employees
to access GCP.
Now let's take a look
at how this flow works.
Employees on the
Credit Karma side
will be accessing the
IDP portal and they
will be authenticating to this
IDP portal with the credentials
that [INAUDIBLE] services, which
is also hosted on our premises.
Once the user is successfully
authenticated to the portal,
our platform is going
to generate a SAML token
and post it to G Suite.
Now, G Suite, which
is acting as a SAML service
provider,
verifies the authenticity
of the token by
checking the certificate
signature and
identity [INAUDIBLE].
And if everything is
well, then G Suite
redirects the user
to the Google Cloud Platform
by using this [INAUDIBLE].
So this way, we were able
to secure the path of access
into Google Cloud
Platform, making
sure the user is going through
all the security checks
that we enforce along the way.
Now you have the authenticated
user on the GCP platform
and the next part is
the authorization.
How do we do this?
So in GCP, when we are trying
to authorize the users, first
of all, we would like to enforce
the concept of least privilege,
which we do by using the predefined
roles or the custom roles.
But when do we use these
predefined roles and when
do we use these custom roles?
Let's take an example.
If a developer is
requesting access
to one of the GCP resources
like a storage bucket,
as Patrik mentioned, we
could assign them
one of three different roles,
like the storage bucket
owner or the writer,
or even the admin role,
so that the developer gets
all the permissions they
need to update objects
inside the bucket or delete
objects inside the bucket.
But these predefined
roles sometimes
give you the
additional privilege
of setting IAM
policies, which
means the developer can delegate
access to other users
as well.
And we don't want them to
do that because
we want to enforce the concept
of segregation of duties:
we have a separate
team that is actually
performing that access
management activity.
And this developer is supposed
to only work on the bucket
by updating objects
inside the bucket
or deleting or creating
new buckets.
So this is where we started
creating the custom roles.
And creating custom roles
is very easy in GCP.
You can just create one
from an existing role
by unchecking the permissions
that you don't need.
Or you can even
create a custom role
by selecting each and
every granular permission
that you need.
And once we had
this exercise done,
of making the existing
roles more granular
or creating all the
required custom roles,
we started
creating the Google Groups.
And as a best
practice, we always
map our roles to
Google Groups
so that we don't need to manage
this access activity in GCP
IAM.
We don't need to change
these policies every time.
We can actually do that
in Google Groups.
And we always try to
avoid assigning these GCP
roles to individuals
in the GCP IAM platform.
And then we grant privileges
only on an as-needed basis.
This is to make
sure that when we
are elevating the
privileges of a developer
for a specific
duration, we don't
keep these privileges
in place forever.
We want to revoke them once
the developer's job is done.
And this is a concept where
we implement this privilege
bracketing.
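Privilege bracketing maps naturally onto a context manager: the grant exists only while the work is in progress. This is an illustrative sketch rather than Credit Karma's tooling. The group is modeled as a plain set, and `temporary_membership` is a hypothetical helper.

```python
from contextlib import contextmanager


@contextmanager
def temporary_membership(members, user):
    """Add user to a group's member set for one block of work, then revoke."""
    added = user not in members
    members.add(user)
    try:
        yield members
    finally:
        if added:  # never remove a membership that existed before the bracket
            members.discard(user)


prod_debuggers = set()

with temporary_membership(prod_debuggers, "dev@example.com"):
    print("dev@example.com" in prod_debuggers)  # elevated while the job runs

print("dev@example.com" in prod_debuggers)  # revoked once the job is done
```

Revocation sits in the `finally` branch, so it happens even if the work inside the block raises.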
And the next thing is
auditing and monitoring.
We use tools
like [INAUDIBLE],
which continuously keep
track of all the policy
changes happening in
GCP IAM and also
notify us of any
violations happening
there so that we can go back
in there and fix them.
This way, again, by
implementing these concepts
and having role-based
access controls in place,
we were able to manage
the users who were
accessing the GCP resources.
And finally, the
GCP automations.
We use these GCP automations
to effectively manage
our infrastructure
resources and projects.
Again, we follow the best
practices recommended by Google
while implementing
these automations.
We started off by creating
a parent org node
at the top, which
represents our enterprise.
And then each and every
department in our enterprise
is represented by
a GCP folder.
Once you get into
the department,
as departments own multiple
applications and products,
we created a separate project
for each and every application
that we host in there.
And finally, each
and every project
is assigned all its
required GCP resources.
So just to sum it up, we
followed these best practices
where we delegate
authentication [INAUDIBLE]
to our IDP platform.
We use the predefined
roles or the custom
roles most of the time and avoid
using the [INAUDIBLE] roles.
And we always map these
roles to Google Groups
and then enforce the concept
of segregation of duties.
And all of these
things together will
help you achieve
least [INAUDIBLE]
in the GCP environment.
So Patrik is going to share a
few more interesting updates
in the next slides.
Thank you so much.
[APPLAUSE]
PATRIK WESTIN: So I promised
a sneak peek of something
that we haven't announced yet.
You're first to hear
it in this room--
IAM Conditions.
And we have some
of the developers
who built it right here,
so a shout-out to them.
So what is IAM Conditions?
It's a new feature that
allows you to use information
from the client, such as
what IP address was used,
and say, oh, if you're coming
from this IP address range,
you can do more powerful things.
You can say, if the resource
name has a specific pattern,
like if it starts with
specific characters, say,
dev or prod, they're treated differently.
And you can combine
multiple of these attributes
with Boolean expressions
into very elaborate things.
So we're going to dig into
some of those examples.
So say that you have
a trusted device
on your corporate network.
You can now grant a policy
with the editor role, which is
powerful, and say, I trust you.
Same user.
If you're working
from your desk,
you can edit and
manage this resource.
Now take the same machine,
but you take it home.
Maybe you work from home.
It's still your trusted device
that the company gave you.
You now get the viewer role.
You can only view this resource.
And if you're going
to an internet cafe,
you still log in.
It's still you, same
correct credentials,
and you will not get
any access at all.
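The three situations in this example reduce to a small decision function. This is a plain-Python illustration of the outcome, not how IAM Conditions are written. The two inputs and the `role_for` helper are hypothetical.

```python
def role_for(trusted_device, network):
    """Pick the role for the same user based on request context."""
    if not trusted_device:
        return None  # e.g. an internet cafe: valid credentials, no access
    if network == "corporate":
        return "roles/editor"  # trusted device at your desk
    return "roles/viewer"  # trusted device taken home


print(role_for(True, "corporate"))
print(role_for(True, "home"))
print(role_for(False, "public"))
```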
Another one that people
are asking for a lot
is expiring policies.
So you can create a policy
that is time bound.
It starts at some point and
it ends at a different point.
So I said that this is
a Boolean expression.
You can express this in
Common Expression Language.
It's a relatively new language
that Google has introduced.
It's optimized to make it
easier to understand,
write, and reason
about your attributes.
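As a sketch of an expiring grant, the binding below carries a condition whose expression is a CEL string. It assumes the `condition` block (`title`, `expression`) and the `request.time` attribute as found in the public IAM Conditions documentation, which may differ from the preview shown in this session. The group address and the `expires_before` helper are placeholders.

```python
from datetime import datetime, timezone


def expires_before(when):
    """Build a CEL expression that is true only before the given instant."""
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return 'request.time < timestamp("{}")'.format(stamp)


binding = {
    "role": "roles/storage.admin",
    "members": ["group:admins@example.com"],
    "condition": {
        "title": "temporary-access",
        "expression": expires_before(datetime(2020, 1, 1, tzinfo=timezone.utc)),
    },
}

print(binding["condition"]["expression"])
```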
And we have also
announced access policies,
which is a different
concept in cloud.
But these access policies have
something called access levels.
These access levels can be
combined with IAM Conditions.
So requests can be classified
with different access levels,
again, depending on how safe
you think this access is.
And then you put those in your
policies, in your bindings.
So what would that look like?
Let's go back to one
of my earlier slides,
where you have a storage
admin, a powerful role.
And you have Mike
and admins again.
But this time, I want to
restrict it to only work
from my corporate network.
I would go and set
up access levels
where I define what that means.
What does corporate
network mean to me?
And now this will grant
storage admin access only
if they meet that requirement.
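Combining a binding with an access level might look like the sketch below. It assumes the `request.auth.access_levels` attribute and the `accessPolicies/.../accessLevels/...` naming from the public documentation. The policy number, level name, member addresses, and the `with_access_level` helper are placeholders.

```python
def with_access_level(binding, access_level):
    """Return a copy of the binding that applies only at the given access level."""
    condition = {
        "title": "corp-network-only",
        "expression": '"{}" in request.auth.access_levels'.format(access_level),
    }
    return dict(binding, condition=condition)


storage_admins = {
    "role": "roles/storage.admin",
    "members": ["user:mike@example.com", "group:admins@example.com"],
}

# Hypothetical access level defining what "corporate network" means.
corp_level = "accessPolicies/123456/accessLevels/corp_network"

restricted = with_access_level(storage_admins, corp_level)
print(restricted["condition"]["expression"])
```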
So we have some key takeaways.
We want to grant roles to
groups, not individual users.
Use the hierarchy.
Think about where
in the hierarchy
do I want to grant
this permission.
Is it a powerful one?
Is it a weaker one?
Use the correct role type
and the correct role,
for that matter.
If you're doing a big
production system,
use the predefined roles
or your custom roles.
Don't put any secrets
in your metadata.
Turn on audit logging for
your sensitive workloads.
Once you grow in
scale, start thinking
about how to automate to
minimize the human errors.
And lastly, bring your
organization structure
to the cloud.
And this means both in
terms of the hierarchy
and in how you create roles.
How do you name your roles?
Name them things that make sense
for you and your workloads.
How do you work in your company?
Do you have a role that
you call data scientist?
Maybe you create a role
named data scientist.
Maybe have a group
for data scientists.
Suddenly, it makes sense, and
it's much easier to manage.
[MUSIC PLAYING]
