>> Thank you.
Good afternoon, everyone, it is really fantastic
to be here today.
Like Meri said, I am Katherine Daniels and
I'm going to be talking about growing an engineering
organization with effective devops.
I am beerops on Twitter.
I complain about beer a lot.
I am the coauthor of Effective DevOps, which
was published with O'Reilly, it's got an
unshaven yak on the cover.
That is our official animal of devops now.
So I've got a fair amount of stuff to cover
today.
I'm going to be covering the four pillars
of effective devops that we go over in our
book, going to be talking about organizational
growth phases and the different challenges
that come with the different phases in an
organization's life cycle and then I'm going
to kind of go into how we approach scaling
engineering at Etsy, a little bit of real-world
story time.
So in our book, Jennifer and I describe what
we call the four pillars of effective devops,
we did a bunch of research and investigation,
formal and informal case studies and we found
that the organizations that were doing devops
most effectively had these kind of four things
in common.
So we start out looking at collaboration,
and when we talk about collaboration, we're
talking about individual people working together.
Usually on the same team with shared interactions
and input, building towards a common goal,
usually the common team goal, what is your
individual team trying to work on.
Which might sound fairly basic, but it's pretty
foundational, because if you don't have people
on the same team who can work together, how
do you expect to have multiple teams, multiple
parts of the org being able to work together
effectively at all?
Next up we have affinity.
And when we talk about affinity we're now
talking about inter-team relationships, relationships
in between different teams, rather than the intrateam
collaboration that we started out with.
So we're trying to now develop empathy and
trust in between different teams, different
parts of the organization, in support now
of shared organizational and business goals.
So this is kind of what people most typically
think of when they think of devops is you
have this affinity between your dev team and
your ops team and they're going to work really
nicely together.
Next up we have tool usage and the way that
we view tools is that we view them as
accelerators of culture, they can enhance
a culture of collaboration and affinity that
we've talked about, or they can, if not used
properly, they can take things in the opposite
direction.
Tool usage can kind of make or break the way
culture kind of goes within an organization.
So we don't view any one particular tool as
being key to doing devops, you don't have
to be in the cloud, you don't have to be using
Docker, Docker, Docker, Docker, tools, it's
more about how you use them, than any particular
tool that you're using.
Tools do not replace culture and tools will
not fix a broken culture.
If you have an organization where people are
refusing to talk to each other, if they don't
trust each other enough to communicate with
one another, switching from IRC or HipChat
to Slack is not going to change that.
Now, don't get me wrong, I love Slack, used
to be a hater, I am now on the train, it's
awesome, you can make emoji for everything,
I love that.
[applause]
But if people aren't going to talk to each
other, using a different chat tool isn't going
to fix that.
So tools can be really easy but they're not
a Band-Aid.
They're not going to help you fix any larger
underlying issues.
And finally we talk about scaling.
What we are talking about is how to apply
the other three pillars, the collaboration,
the affinity and the tooling throughout various
inflection points or stages in an organization's
life cycle.
Another thing that Jennifer and I go over
in the book is what we call the devops compact.
So the four pillars we view as techniques
or principles that you can apply in order
to grow and maintain a devops culture, but
what is it that you're trying to grow or maintain
and the way that we describe this is with
metaphors, because who doesn't love a good
metaphor.
Jennifer and I have both done some rock climbing
at various points in our lives so we decided
to use roped rock climbing as a metaphor.
There are a few different aspects of doing
this kind of climbing that make it much more likely
that you're going to have a successful climb,
where, you know, you get to the top of some
rock and nobody gets hurt.
There have to be shared goals.
Both the climber and the belayer have to
understand what the problem is that you're
trying to solve.
Which route are you going to climb?
You have to have ongoing communication.
They ask, is belay on, and they double-check the knots
on the rope and the climber will tell the
belayer, yup, I'm ready to go, I've got you
and this communication will continue throughout
the climb.
If the climber feels like they're a little
uncertain about what the next move they're
going to do is and they want the belayer to
be extra careful or to take in slack in the
rope, they're going to communicate this as
this is going.
This is ongoing.
They don't just talk at the beginning.
And finally there are dynamic adjustments,
so as the climb is progressing, different
things could happen, you know, there could
be an injury, the weather could all of a sudden
change and you decide, you know, it's not
safe to keep going.
We're going to have to back down early.
And again the communication is going to help.
And the climber and the belayer are both working
together towards this shared goal that they
have defined.
So an organization needs all three of these
things in order to be most successful with
any devops initiatives.
There have to be these shared, clearly defined
goals.
You know, the typical what is wrong with the
dev team and the ops team that made devops
necessary is very conflicting goals, with
the dev team wanting to ship things really
fast and the ops team being oh, God, please
don't touch the servers, they're on fire.
There have to be shared goals, they aren't
going to be in conflict with each other and
everyone has to be on the same page as to
what those are.
There has to be ongoing communication within
teams and on different teams because if you
don't have that, you'll run into the situation
where you know the right hand team doesn't
know what the left-hand team is doing and
you have two teams working on the same thing
or team A thought team B was going to do the
thing and then team B thought team A was
going to do the thing and nobody does the
thing and managers are mad and that's not
great.
There has to be this dynamic understanding.
We're changing goals and priorities on you,
you have to be able to adjust and react to
those.
So when I mentioned scaling, I talked about
the organizational life cycle.
So what even is that?
We've got these principles that can hopefully
make an organization effective throughout
this life cycle, but what does that look like?
So a bunch of people have done a bunch of
research that I don't have time to go into
fully, but there is a graph that looks something
like this, that you can find on the internet.
Organizations start out pretty small, you
know, little startups, they grow, they eventually
hit some kind of maturity, and then one of
two things will happen: They will
either go into a decline where they become
less relevant, they're not shipping as much
as they used to, losing customer market share,
what have you.
This usually starts happening when people
start thinking more about their own individual
goals than the shared organizational goals.
The other thing is that an organization can
go into a state of renewal.
That's usually a little more tricky.
There's a lot of challenges that you have
to overcome in order to get to that point,
and I'll kind of go over what some of those
phases and challenges might look like.
So Larry Greiner originally defined these five
phases of growth that every organization goes
through.
He defined each one as a period of evolution
or slower change followed by a crisis and
then a revolution required to kind of overcome
that and get to the next stage.
So the first stage is growth through creativity.
This is when you have, you know, a very, very
young startup, you know, people in a coffee
shop or a basement somewhere hacking on something
really creative, really informal, trying to
get something out the door, trying to get
their idea to work.
The end of this phase is usually a leadership
crisis where if this is successful and they
manage to make a thing happen at some point
they're going to find the need to have some
sort of formal management structure, because
otherwise it's just going to be chaos all
the time.
In early stages like this with the creativity,
there's usually a lot of collaboration, so
you have these people working really closely
together trying to get their idea to work.
Once there is some sort of formal management,
we move into growth through direction, so
leaders are now working to direct the
company.
You'll have things like budgets and road maps
and spreadsheets.
More formality as an organization.
Now, as things continue to grow there is going
to be the autonomy crisis, where managers
can no longer do everything themselves, team
leads can no longer do everything themselves.
They have to start delegating to other people
and that's going to be kind of hard, when
something is yours and it's your baby, you
don't necessarily want to give it up and let
somebody else touch it.
This is where collaboration can be even more
important because as there is less autonomy
among individuals, fewer people who are able
to do everything themselves, people have to
start being able to really work together.
Next we have growth through delegation: As
the managers and leaders start to delegate
more and more work to other people.
At this point top-level management in the
company is usually looking at the big-picture
strategic stuff and lower down stuff that
is being delegated is more day to day and
tactical.
At this point, there can be the control crisis,
where higher level managers are delegating
more and more and they feel like they don't
have as much control as they used to.
I mentioned when I was talking about decline,
that that can happen when people start to
focus more on their own individual goals,
rather than the shared organizational goals,
and this can kind of happen when you have
this control crisis when people start to feel
like they're losing control and to try to
avoid that they focus on their own goals and
their own power and their own agenda.
So that's something to watch out for.
Organizations can really start to feel affinity
here as now we have more and more teams, more
structure and those different teams are going
to be working together more closely and hopefully
more effectively.
Next is growth through coordination.
[laughter]
As you have all of these different teams,
maybe there's been a few reorgs to get to
this point, these different teams and different
groups have to start to coordinate and work
together.
The problem you can run into here is the red
tape crisis where different teams and different
groups will each have their own way of doing
things and their own tools and you have to
fill out like certainly different forms in
triplicate to get something done.
This is where tool usage, either effective
or not, can make itself known because if you
have all of these different processes trying
to coordinate and different tools that each
team is trying to use, it can really get in
the way of getting stuff done, but if you can
streamline that and make it more effective,
then you can get through this red tape crisis.
So the last phase is growth through collaboration
both at an individual level and team or organizational
levels, as well.
This is generally a pretty good place to be,
when you're collaborating, you've got all
these different groups working together, helping
to make the organization really hit its stride
and really be where it needs to be.
So the thing that Larry realized later is
that organizations can hit a point here that
he called the internal growth crisis where
all this collaboration is happening, it's
awesome, but you've reached the limits of
what you as an individual organization, as
an individual business, can accomplish.
And so what he later described as a last phase
is growth through external collaboration.
External to the organization.
Now, traditionally, this might look like a
merger or an acquisition, you bring in some
other org and take all of their creativity
and their people and this, that, and the other,
but I think this can also look like the principles
that, like, started the devops movement in
the first place.
The idea of working together and sharing our
stories.
The original DevOpsDays was about sharing
stories so that people could learn from each
other, learn from each other's mistakes and
avoid repeating those mistakes.
I mean this is kind of why we go to conferences
like this, to share our stories and to share
our knowledge with each other, and so this
sort of external collaboration can help companies
go, instead of into decline, into a phase
of renewal if they get this outside energy.
So we know that collaboration, whether it's
internal or external is kind of where we want
to get to, but what does that look like in
practice?
It's one thing for me to stand up here and
say, yes, collaboration is awesome, you should
all go do that, go collaborate.
But how do we do that?
How do we actually collaborate in the real
world?
So I'm going to talk now a little bit about
how we approach scaling the engineering organization
at Etsy.
I started at Etsy just about two and a half
years ago, and at that point we had 200 engineers,
these days we're close to about 300 so there's
been a fair amount of growth and with that
has come some challenges.
I'm going to talk through a more recent challenge
that we've had related to my team.
The State of DevOps Report is a report that's
been put out for the past few years, it was
originally done by Puppet Labs, Gene Kim,
Nicole Forsgren, they basically put out some math
to prove that devops is cool and you should
all be doing it.
They said 
high performance is achievable if you architect
with testability and deployability in mind.
Now, they were talking about writing software
and deploying software, but I think the same
thing can actually hold true for organizational
growth, as well.
So at Etsy we are really big on deployability
and testability.
So we're really into that when it comes to
our code, but I think we can apply these same
principles, like I said, to organizational growth.
So testability, this obviously means something
fairly different for code than it does for
people.
Because with people you cannot write unit
tests, you can't write unit tests for a team
and run them against the team and go oh, that's
clearly a bug.
Wish you could, that would make things
a lot easier sometimes, but you can't.
When I talk about testability when it comes
to organizational growth I think what it means
is having a hypothesis that you can test,
which really means identifying the problem
that you are trying to solve.
You're not just going to run into one problem
in the life cycle of your organization.
There's going to be multiple ones and they're
not all going to look alike.
Not every organization is going to run into
these five different phases and the five different
challenges in the same way or at the same
time.
So you have to define the problem that you're
trying to solve.
So last spring, we ran into a problem where
we found that we were having a lack of coverage
in a couple areas adjacent to our web operations
team.
We found that monitoring and infrastructure
provisioning, we have our own data centers,
we're not in the cloud like all those cool
kids.
You can still do devops and not be in the
cloud.
We do it.
We found that these areas were not really
being given enough coverage so things were
getting missed that we didn't really want
to be missed and we decided that this was
something that needed to be addressed.
Our hypothesis was that if we created dedicated
teams to these two areas, a team for monitoring,
and a team for provisioning, that this would,
you know, ensure that they got treated as
high priority.
Because instead of engineers trying to work
on these things, one off or in their spare
time, now there would be a whole team whose
job it is to fix these specific things.
Now, this is not me standing up here saying
that you should go create a dedicated monitoring
team, because that is not everyone's particular
problem right now.
What I do want to do is encourage people to
think about different models that you can
use when you're talking about engineering
support.
Every engineering org at some point is going
to have these different support functions,
like operations, like QA, like developer tools
that are going to be supporting the rest of
engineering org in some ways.
There are different ways you can do this and
which model you choose is going to depend
kind of on which problem you're trying to
solve so a dedicated team can help with areas
like you're not getting enough focus on specific
areas.
If you have engineers embedded in other teams
or if you have a designated model, we used
to do designated ops, that can help
with knowledge base and building relationships
between teams.
Now, just like you wouldn't push a big change
into production without doing some sort of
code review, you don't want to make some big
organizational change without having people
review it, as well.
This is about collaboration and affinity,
this is making sure that changes that are
going to impact lots and lots of people don't
get made in a vacuum.
When we were creating these two new teams
we had various people throughout the organization
review them.
We had directors, we had managers, we had
senior individual contributors, team leads,
working together to make sure that this was
the solution that we thought made the most
sense, that would best solve the problem that
we were trying to address.
Deployability, you have to be able to deploy
changes in a straightforward manner.
You don't want it to be a month-long process
where some ops engineer has to sit in the
data center like moving discs around for weeks
at a time.
That used to be a thing.
That was no fun.
You don't want to do the same thing for organizational
change, as well.
When I was at HP a decade ago, they were going
through a challenging phase where there were
layoffs and management said hey, so some people
are going to get laid off, we know who, we're
not going to tell you who is going to be
affected for six months.
Those were six agonizing months.
Don't do that.
Don't necessarily, you know, rush things out
the door without taking time to review them,
but being able to deploy these changes quickly
is important.
Iterability is really important, too.
So just like you want to be able to roll
a bug fix out really quickly, if you discover
that you've deployed a bug in your code, you
want to be able to do the same thing with
organizational change.
So when my team, the monitoring team, was
first created, the new manager of that team
ended up going on sabbatical relatively soon
after the team had just started which left
the team without a manager.
Now, normally we love our sabbatical program,
it allows people to go take a break, recharge
and come back with new energy, yeah, that's
really great, but a brand new team not having
a manager all of a sudden is not a super-great
position to be in so we had to iterate on that
and get the team to where it needed to be
at that point.
Finally it's important to be able to look
at this sort of organizational change blamelessly,
we do this with our technical problems, you
know, we have a three-armed sweater award that
we talk about that is given as a genuine award,
it's not like a mean thing, every year to the
engineer who broke the site in the most entertaining
way or the way that surprised us the most.
I actually won it last year for accidentally
upgrading Apache everywhere.
That was fun.
It's hanging above my desk right now.
The point of this award is to emphasize that
we want to be learning from things that didn't
go as expected and we want to be able to do
this with organizational change, as well as
with, you know, technical issues.
So as Jessiee was saying earlier there's lots
of ways that you can approach learning from
things that have happened, and it does really
have to be about learning, this is the key
point.
It can't be about blaming and shaming and
well, you know, we didn't say, you know, you
should have known that that manager was going
to go on sabbatical, how dare you.
And we said, well, how can we learn from this?
This lesson isn't going to be applicable to
everyone in every single situation, but it
is important to think about the sharing of
stories and the learning that we can have
when we think about these things.
Overall, there really is no one growth model
that every engineering org can use.
There is no one size fits all solution.
Unfortunately I cannot stand up here and be
like, hey, I have the one answer: It is to
go create a dedicated monitoring team, or
anything like that, though it is pretty cool.
There is no one answer, because the strategies
that you're going to use at any given point
in time depend on your organization.
Different organizations are going to prioritize
different things that other organizations
might not.
Different teams are going to have different
ways of addressing these challenges as they
come internally and as people who are leading
teams, you get to choose, you get to help
guide the direction and the focus that you're
going to have.
There are some common themes that we can pull
out, though.
We can think about testability, making sure
that you know what problem you're trying to
solve, because if you just go in and you say,
well, I need to reorg, because somebody on
a stage said that we need to change and grow
the organization, you're not going to have
a good time.
You have to be able to figure out what exactly
it is that you're trying to accomplish.
Deployability is important, the ability to
make these changes and to roll them out in
the most effective manner.
A lot of this has to do with how quickly
changes are made and how they get communicated.
People don't like to feel that there is somebody
on high saying, I have changed your entire
team, deal with it.
People like to feel informed on what's going
on in their team.
You want to be able to iterate on this.
Because challenges are going to keep coming
up and you can't just make one organizational
change once and be done with it and it's important
to keep in mind that you want to have blamelessness,
and you want to have this focus on learning.
And this brings us back to the four pillars
of effective devops, we have collaboration,
affinity, tools and scaling.
When you're thinking about collaboration,
think about how team changes might, excuse
me, impact this.
There's going to be different goals, different
values, and different working styles as teams
change, as they have different members on
them.
When you're thinking about affinity and you're
thinking about these support models, keep
in mind the problems that you're trying to
solve, because not every problem is going
to be applicable, not every solution is going
to be applicable in every situation.
New teams are going to have a different understanding
of the tools that they use.
And what tools are most important to them.
So you're going to need to adjust and iterate
when it comes to how people are using the
tools that you have within your org.
And again, you're not going to be done with
this.
You're going to have to keep changing and
iterating and growing.
Because orgs begin to grow and change.
They are growing, living things.
So throughout the life cycle it's important
to keep focusing on the problems, identifying
problems, figuring out what you're trying
to accomplish, just like you do with your
code.
If you focus on this continuous learning and
continuous problem solving, you're going to
be enabling your organizations to continue
solving problems for yourselves, for your
engineers, and for your customers.
Thank you.
[applause]
