(upbeat music)
>> Hello, everyone, thank you for joining.
I'm very excited to be part
of DockerCon this year.
Let me start with a quick introduction.
My name is Ashish Sharma.
I work for SS&C Eze as Director
of software engineering.
SS&C Eze is a Boston
based fintech company.
We provide products and services
in the stock market domain
to automate and streamline the
entire investment lifecycle.
Quick stats about the company.
We are in the business for over 25 years,
serving 1900 asset managers worldwide.
As a company, we invest heavily into R&D.
we have 1,050 plus employees,
with offices located across the globe.
Before jumping on the main
topic, I quickly want to
talk about some of the core offerings
to give everyone more
context as to what we do.
Eze Eclipse is our online
investment management platform.
Think of a typical buy-side
investment management firm
like a hedge fund or a mutual fund.
They are constantly running
through this investment
management lifecycle
represented by the circle,
using the capabilities of our platform.
And due to the nature of this business,
we place a very high
emphasis on high resiliency,
high throughput and low latency.
There are a lot of different parts
in the investment lifecycle,
I will briefly touch
upon the very, very core
ones, starting with trading.
This is where the active
stock market trading happens.
And this is a very interesting
engineering problem
for two reasons.
First, the stock market
in general is a vast
and complex domain and second
its execution has to be
very, very performant.
Our user can decide, okay,
you know, today I want to buy,
let's say, 100,000 shares of Google.
And he can enter this
information in our application,
we take that order, we
route it to the brokers.
And then in real time,
we show you the progress
on your order.
It needs to be reliable,
it needs to be performant.
Even a sub-second delay in
routing orders out to the market
can cost our clients a lot of money.
Analytics. This is real time analytics
like a real time view
of your profit and loss.
Our users want to know how much money
they're making or losing as
the market is progressing.
So to put it in the
context, think about this.
The price on the security
can change multiple times
in a second, multiply that
with all the securities
in the market and very
soon, this becomes kind of
a very big data set to deal with.
Very interesting problem to solve.
Compliance, we provide
real time compliance.
You can create rules like,
no matter what, my exposure
in the technology sector
should not go below 20%.
So we'll run these
compliance rules for you
to make sure you're kind of complying
with your investment strategies.
Modeling. This is useful
for idea generation,
you can run large financial models
or you can create
hypothetical scenarios like
hypothetically, let's say I want to invest
$5 million in the pharmaceutical sector.
So, how would this decision
shape my overall portfolio?
So, some very interesting models
and scenarios you can run.
Operations. Start of day,
end of day operations,
we interact with hundreds
of different third parties
on a daily basis.
And each of these third parties
has a variety of different
formats in which they like
to send and receive data.
The fun part about this problem space
is building sophisticated
ETL and data pipelines
to handle all these variations
and make it seamless.
Accounting. This is the
bookkeeping of all the decisions
you've ever made,
in a time-series fashion.
Coming on the main topic, in this talk,
I want to share some elements
of the journey we took
from monolithic to
microservices architecture
and how Docker played a critical role
into making this journey a success.
Microservices and Docker
is a powerful combination.
I will share with you how
it impacted our tech stack,
developer experience, testing
strategies and delivery.
And as I present this to
you from my home office,
there's no question that we are living
in these unprecedented times.
The spread of COVID-19
has affected the world
in so many different ways
and one of these has
been the stock market.
During the very early days of the pandemic,
trading volumes among
our clients were higher
than we had ever recorded as a company.
There's no doubt in my mind that
had we not undertaken this
journey, the Eclipse platform
would not have been
able to seamlessly scale
to meet this unprecedented load.
So, let's see how adoption
of microservices and Docker
has shaped our tech stack.
Our journey started with a simple idea.
We wanted to build an online cloud-native
investment platform.
And when you are in this
very, very early phase
of the adoption curve,
the one on the right,
all you have is an idea.
And you're working to see if this idea
has any kind of potential market.
So at that point,
it makes sense to keep
your tech stack simple.
The very first version of the tech stack
looked something like this.
Our front end was a single page application
written in Angular, we had around 10 to 12
back end services, all written in C#,
deployed as Windows services.
These services communicated
with each other over RabbitMQ
and we used Microsoft SQL
Server for back end storage.
So overall, we kept it very simple.
And this served us
well for a short time.
But as we added more and more features,
it became a monolithic system over time.
It got to a point where we
were experiencing
difficulty scaling it up.
It required coordination across the teams
during deployments.
It impacted our time to
recovery when things went wrong.
There was no proper fault isolation.
So in a way,
we experienced our own
share of traditional issues
that we see with the
monolithic architecture.
And we knew we had to
do something about it.
At the time, microservice architecture
was gaining attention in
the development community
and it had a lot of promises.
And at the same time, some folks
from the organization
also attended DockerCon
and they came back
excited about using Docker.
So we applied both.
And today the tech stack
looks something like this.
We have built polyglot stacks
with loosely coupled services,
written in different languages,
preferred languages are
Node.js, Golang, Python, and .NET Core.
But we encourage our teams
to explore and experiment
with different tech stacks
to solve different problems.
Reason being, for example,
if you're building
an ETL pipeline and using a language
which doesn't have a rich
ecosystem for data manipulation,
you would be burning a
lot of development cycles
trying to fit it in.
Docker is a big part of our ecosystem,
all services and infrastructure components
they run as Docker containers.
We have 140 plus distinct
services running in prod,
with more coming up
every week, every month.
The messaging infrastructure
is now powered by RabbitMQ,
NGINX, Consul,
and Amazon Kinesis.
We have extended our back
end storage to include MySQL
and we heavily rely on Redis for caching.
This is the current state of tech stack
and it's continuously evolving.
So wait, now I have to worry
about microservices
and polyglot stacks both?
Adoption of microservices
architecture alone
is a steep learning curve for everyone.
No question about that, right?
You have to worry about
decomposition patterns,
data sharding, API boundaries,
service to service
communication, there are
a lot of different things
that you have to worry about.
Polyglot stacks come with
their own set of problems.
There are too many options to choose from
in terms of languages and frameworks.
How do you even manage an infrastructure
with so many heterogeneous
elements without going insane?
A word of caution about polyglot stacks.
Maintaining a true
polyglot stack is difficult
and it requires strict discipline.
Our own internal development
culture and discipline
has evolved over time to put us in a spot
where we can better guard
against potential pitfalls,
but it requires very strict discipline.
I will talk about how we
manage the Eclipse infrastructure
with polyglot stacks.
Uniform packaging: all our services
are packaged as Docker containers.
Heterogeneous services
in a uniform package make
it much, much easier
to operate and reason about the system.
It's a huge relief to
our infrastructure team.
Uniform packaging also keeps
the environment definition
file simple, so it's
much, much easier
to create consistent environments.
You can reliably spin up
staging, test, and performance environments
that look exactly identical
to your production environment,
at the click of a button.
Dynamic scaling: once you have an environment
which is simple and consistent,
it further makes it easy
to dynamically scale it up and down.
Currently in Eclipse
we support scaling
based on resource consumption,
say, if your CPU goes
above a certain threshold,
or auto-scaling based on time,
by looking at service usage patterns.
And this has been very
useful in the recent times.
With the COVID-19
impact on the stock market,
our system was able to
dynamically scale up just fine.
It also simplified our CI/CD tool chain.
In the monolithic days,
in the absence of Docker,
we had like 10 different tools for
building, packaging, deployments.
Now it's all simplified,
using just Jenkins as our Docker-based CI.
Another element that helps
you manage the complexity is
consistent adherence to standards.
Standards are important.
They help you keep checks
and boundaries in place
for teams to operate in
a unified and safe way.
And it's easy to do in
monolithic architecture
and mono stack, right?
You just have one language to worry about,
and all the components coexist together,
so you don't have to worry too much
about things like API boundaries.
How do you maintain standards
in a polyglot stack?
We maintain our own set
of internal standards,
which are agnostic to any
framework or language,
and they are held and
enforced at a higher level.
Here are a few categories.
Resource access: these are
the things that happen
within your service, like
accessing a database or a cache,
authentication, authorization.
How do you make sure teams
are using consistent
patterns across the board?
So to solve this,
we have written service
templates in all four languages.
And they come with
predefined patterns
to address all these categories of concerns.
So for example, if there are
like let's say 30 services
in Node.js, they all will
be using the same template
and they remain consistent like that.
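As a rough sketch of the idea, a shared service template can wire up cross-cutting concerns such as auth and route registration once, so each service only supplies its business logic. The names here (`createService`, `withAuth`) are illustrative, not SS&C Eze's actual template API:

```javascript
// Hypothetical sketch of a Node.js service template: every service built
// from it gets the same predefined patterns for configuration and auth,
// so 30 Node.js services all look alike. Names are illustrative only.
function createService(name, { routes = {}, config = process.env } = {}) {
  const registered = {};
  const withAuth = (handler) => (req) => {
    if (!req.user) return { status: 401, body: 'unauthorized' };
    return handler(req);
  };
  for (const [path, handler] of Object.entries(routes)) {
    registered[path] = withAuth(handler); // auth applied uniformly
  }
  return {
    name,
    config,
    handle(path, req) {
      const handler = registered[path];
      if (!handler) return { status: 404, body: 'not found' };
      return handler(req);
    },
  };
}

// Usage: a service only supplies its business logic.
const svc = createService('locates', {
  routes: { '/health': () => ({ status: 200, body: 'ok' }) },
});
```

Because the template owns the cross-cutting pattern, every service built from it stays consistent by construction.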
API boundaries: since
there is no binary sharing
in microservices, services
are talking to each other
over, say, REST or GraphQL,
so this becomes very important.
Your API needs to look and feel consistent
for both internal and external consumers.
We solve this by creating
a set of API standards,
internal API standards that
all services have to follow.
We use Swagger as
an API documentation tool
to give that nice look and feel.
And we also embrace
contract driven development.
And this is a must have discipline.
Workflow can span multiple
services on different teams,
and those different teams
may use different languages.
So that makes contract
driven development
a must have practice.
Bootstrapping: how do
services get configurations
and secrets when they start?
Well, we solve this problem by extracting
this responsibility completely
out of the service itself.
So when a service starts,
it can expect all the
configurations and secrets
to be available to it in
the form of environment variables,
and Docker made this simple.
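A minimal sketch of this bootstrapping pattern, assuming the orchestration layer injects everything as environment variables; the variable names are illustrative:

```javascript
// Hedged sketch: configuration and secrets arrive as environment variables
// (e.g. via the `environment:` section of a compose file), so the service
// simply reads them at startup. Variable names here are assumptions.
function loadConfig(env = process.env) {
  const required = ['DB_HOST', 'DB_PASSWORD', 'RABBITMQ_URL'];
  const missing = required.filter((k) => !env[k]);
  if (missing.length > 0) {
    // Fail fast at startup rather than at first use.
    throw new Error(`missing configuration: ${missing.join(', ')}`);
  }
  return {
    dbHost: env.DB_HOST,
    dbPassword: env.DB_PASSWORD,
    rabbitmqUrl: env.RABBITMQ_URL,
  };
}
```

Failing fast when a variable is absent keeps misconfiguration visible at deploy time instead of surfacing mid-request.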
So all these measures come
together to ensure consistency
between polyglot services
and allow developers
to kind of focus more
on the business logic.
Let's take a look at how adoption
of microservices architecture and Docker
has enabled better developer experience.
And to understand this
better, I will start with
how our developer experience used to be
back in the monolithic
days, then I will show you
how the experience looks today
and how this has been
life changing for us.
So, in monolithic days, let's
say I want to work on a story.
So my developer experience
would be something like this,
come to office, drink my coffee,
the most important thing
for me in the morning,
download the code from the repo, build it,
I will probably drink another coffee,
then set up a database
locally, register services,
configuration, and this is a big step.
So I might end up drinking
three more coffees
while doing this.
Then it would be lunchtime
and hopefully after lunch
maybe I can work on a story.
I personally have spent hours and hours
setting up my development environment.
And I took it even further.
I would not even download the latest code.
I would work off the
old version of the code
for as long as I possibly could.
And working with the non-latest code
actually came with its
own set of problems.
It resulted in more issues when
I finally committed to the mainline.
Today, it's way different.
Like I said earlier, we
have 140 plus microservices
and we didn't want developers
to run all 140 services
locally on their system.
So this forced us to kind of
rethink the developer experience.
And we came up with a hybrid approach.
Now a developer only runs
the service they want to work on,
locally on their laptop,
and everything else, the
UI, the back end services,
the infrastructure
components, they all come
from a shared development environment.
So the developer experience
today looks something like this.
I come to office, start
drinking my coffee,
download the code that I want to work on.
Then I run docker-compose
up and that's it.
I have a working system.
And at that point,
I'm probably still
drinking my first coffee.
And this is huge.
I have a working local
environment in just a few minutes.
Earlier, I would refuse to
download the latest code.
But today, the first thing I
do is to get the latest code.
And getting a simple local environment
only takes a few minutes.
No more working with
the old version of the code.
So let's take a look at how this
hybrid service development
approach works in practice.
In this demo, I will show you
how we set up a local service instance
and use Consul to
register the local service
with a shared development environment.
Okay.
This is one of the microservices
it's written in Node.js.
This service handles
the lifecycle of a stock
that our users can
borrow for short trading.
It's one of the trading strategies.
Here's the docker-compose file.
This is the local Consul
server I was talking about.
And here's the service that
I want to work on locally.
I want to spin this up locally.
And this is the database
that comes with the service.
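As a rough sketch of what a compose file for this hybrid setup might look like (not the actual file from the demo; service names, images, and variables are all assumptions):

```yaml
# Illustrative sketch: a local Consul agent, the one service under
# development, and the database that comes with it.
version: "3.8"
services:
  consul:
    image: consul:1.9
    ports:
      - "8500:8500"
  locates-service:
    build: .
    environment:
      DB_HOST: locates-db
      CONSUL_HTTP_ADDR: consul:8500  # register into the environment
    depends_on:
      - consul
      - locates-db
  locates-db:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: example
```

One `docker-compose up` then brings up all three containers together.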
So here I have the Docker
Desktop dashboard,
there are no containers
running at this time.
So now let's go ahead and
run docker-compose up.
So as you can see on the
right, it started my stack
with all three containers: Consul,
the local service and the database.
So now what I can do is
I can go to my webpage,
which is hosted in a shared
development environment.
I need to log in as myself.
So this is...
Again, this UI is coming
from a shared environment.
And I need to go to Trading and LOCATES.
Okay, so now what I'm trying to do is,
I'm going to make some UI edits.
That should generate a request.
And my hope is that this request
should reach my service
running locally on my laptop.
And we'll see that in the logs.
Let's see here.
I'm seeing some logs here already.
Let me search here for Google.
Okay, here perfect.
Here's a request that I made from the UI.
Great.
I have a local working system
in just a few minutes now.
And this makes it very easy
and straightforward to develop
a service in isolation.
And at the same time,
I'm working with the full
and latest context of the
entire running application.
This has been life changing.
We also migrated from a mono repo,
which is like one gigantic single repo
with all the code, to micro
repos, where each repo
represents a standalone
service or a UI component.
So when a developer wants
to work on a service,
he's only downloading the code for
that service and nothing else.
Docker further makes it very easy
to write self contained
code, meaning the code comes
with its own explicit set of dependencies
and it knows how to go and get them.
It comes with its own SDK
to build and run tests.
It knows how to package itself
into a deployment unit,
which is a container, and we
get all these nice benefits
without really relying on any
system-wide packages.
All you really need is Docker
Desktop and any laptop.
And it's an immutable package,
so it's not going to change between
your dev and test and UAT and PROD.
Gone are the days when
we used to find issues
because of a mismatch in the dependencies
between your local and
any other environment.
It further makes onboarding simple.
So when your code is small,
has a well defined responsibility,
is self contained, and spinning up
a local dev environment
only takes a few minutes,
onboarding a new
developer becomes much, much easier.
Another great benefit we
see is a massive trend
of cross team PRs.
So let's say my service consumes an API,
and this API is owned by a different team.
And if I need like few
more fields from this API.
So what I will do is I
will download the repo,
and I will see what
patterns they are using.
And I will quickly create a PR myself,
all they have to do is just review it.
Here's a chart based on our internal data.
This is a weekly pull request graph.
Team A here is making somewhere
between say 15 to 20 PRs in a week,
and then there's this sudden drop here.
It actually turned out that
Team A was busy making PRs
into Team B's repo so we see
a lot of patterns like this.
Okay, so how has the adoption
of microservices and Docker
impacted our testing strategies?
After we decided to move towards
microservice architecture
for our mainstream development practice,
we knew testing was
going to be challenging.
As we added more and more services,
we observed an interesting trend.
The teams were relying more and more on
integration tests to test the services,
which was not great.
Too many integration tests can slow down
your delivery pipeline.
And they can be very flaky,
meaning they may fail
for very non-obvious reasons.
So we had to shift left
and think about how we
want to test our services.
We enabled our teams to
develop certain services
in isolation, as you
saw in the previous demo
and we wanted to do the same for testing.
How can we test our services in isolation
with more confidence?
So using some of the microservice
architecture principles
and Docker enabled us
to do exactly that.
I will show you how.
The diagram on the left,
you can think of it as a service.
This service exposes
multiple resources as an API
like resource one, two (mumbles) here.
And its implementation
is a deep stack of layers, all
the way down to the database.
The small boxes here
represent unit tests.
We take a small unit of code
and we test it thoroughly.
But unit tests alone don't
give us enough confidence
that when all these units come together,
they are going to work as expected,
and that's why we need service tests
to test the depth of the service.
The long, pink boxes here
can be seen as service tests.
Meaning if I call the
resource one API from outside,
it will exercise all the
layers, including the database
and produce an expected result.
We all love unit tests, right?
They're fast, they're easy,
they're consistent,
they clean up after each run.
And they produce these nice
code coverage reports,
which gives us a lot of
confidence about our code.
Wouldn't it be nice if service tests
were as reliable and
as easy as unit tests?
So what are the challenges
with service tests?
Setting up dependent services and test data.
Do I need to run all the services
that my service depends on?
Maybe those services
depend on more services.
So it makes a deep graph of services.
And do I need to run all the databases
that these services use?
So building this environment
with a deep graph of services
is expensive, right?
And it's unstable, it's going to be flaky.
And even if I managed to do all this,
I know it would be very slow to execute
and I practically cannot
trigger it on every check in.
So let's see how microservice architecture
and Docker helps with these points.
Starting with dependent services.
When you're operating in a
microservice architecture,
as a service, you really
care about two things,
your immediate consumers and
your immediate dependencies
and nothing else, no layers beyond that.
Just your immediate dependencies.
Using this to our advantage, we created
a simple mocking
framework that was inspired
by the web crawler technique.
So as a service, I know
all the APIs I depend on
upfront, all the
APIs I'm going to call
when I execute any workload.
So now what I can do
is tell
this mocking framework
about all these APIs,
and the mocking framework
can go to these APIs, like a crawler,
download the data from those APIs,
and save it in JSON snapshots.
And later, during the test run,
these snapshots can be served as mocks.
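A minimal sketch of this crawler-style mocking idea; the real framework writes the snapshots to JSON files and serves them over HTTP, but the shape of the logic is the same. Function names and the injected fetcher are assumptions, not the actual framework API:

```javascript
// Hedged sketch: given the list of APIs a service depends on (known
// upfront), crawl each one from a shared environment and save the
// response as a snapshot; at test time, replay the snapshots as mocks.
// In practice the fetch would be an async HTTP call and the snapshots
// would be written out as JSON files; this is simplified for clarity.
function downloadSnapshots(apiMappings, fetchJson) {
  const snapshots = {};
  for (const path of apiMappings) {
    snapshots[path] = fetchJson(path); // crawl the real API once
  }
  return snapshots;
}

// At test time, the mock just replays the snapshot for each path.
function makeMock(snapshots) {
  return (path) => {
    if (!(path in snapshots)) throw new Error(`no snapshot for ${path}`);
    return snapshots[path];
  };
}
```

The service under test cannot tell whether a response came from the real dependency or from a replayed snapshot.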
So, my service cannot
really tell the difference
whether the data is coming from a real service
or from the mock service.
So this solves our problem
of deep graph of services.
The next step is to build
a reliable test suite.
Our service test suite
looks something like this.
There are four containers:
the service container
in the middle, the one we want to test,
the database container that
comes with the service,
the mock container, which is going
to serve the JSON files
that we downloaded using the crawler,
and the test client container,
which is going to execute
tests against the service
by invoking API calls.
Okay, so let's see how
this works in action.
In this demo, I will show you
the crawler based mocking
technique to download
the snapshot of the immediate dependencies
and then I will run a set of service test.
Okay.
So, here is the same service again.
I need to run the crawlers
so that I can run the BDD tests.
So let me show you the API mappings.
Okay, so here they are.
So these are the APIs that
my service depends on.
I know them upfront so
I can add them here.
What this framework will do,
is it will go over these APIs
and download the data in JSON files.
Okay, so let me run the download tool now.
So what it's doing behind
the scenes is going to
one of the shared environments,
looping over all those
APIs, and downloading
these files, which I will
show you after it's done.
So, let's see here.
Okay, here's a whole list of
files that are downloaded.
The activity of downloading the snapshot
is kind of independent
of running the tests.
The frequency of download
really depends on the
stability of the dependent API.
Like if it is a brand new integration,
then you might want to
take frequent snapshots
otherwise you can take it
like maybe once a month.
Okay, so now, let's
spin up the test suite.
So what it will do, is it will
create three containers:
the service itself,
the mock container, which will now serve
the JSON files that we just downloaded
as the mock,
and the database container.
Now let's run the test container,
the BDD test container.
Okay, so now you can see on the right
that it started the test client container.
This client container is
making HTTP requests to the service.
The service is then talking
to the mock and the database
in order to process those requests.
And then it's returning results
back to the test client container,
and the test client container is further
validating the results on top of it.
Okay, so it's all...
It's done.
Couple of things to notice
here, let me scroll up.
Okay.
It ran 108 tests in 20 seconds.
Super fast, right?
And then it generates
this nice code coverage report.
So with a little innovation, it is possible
to get code coverage for
all running services.
So what we have been able to do just now
is to test the full depth
of the service in isolation.
And the characteristics of these tests
are in line with the unit test.
They're easy to write, the
adoption of this style of
service tests has been impressive.
Almost all the teams are doing it now.
They are fast since everything is local.
And as we saw it ran
108 tests in 20 seconds.
So not as fast as unit
tests, but relatively fast.
They are consistent, that
means they only break
for real reasons and not
because of some flakiness
in the environment.
Code coverage, with some innovation,
it is possible to get code coverage
from a running service
as you saw in the demo.
The immutability aspect
of Docker container
makes cleanup super easy.
Just restart the containers
after you're done.
And they trigger on every check in,
so our Jenkins CI pipelines
are set up in such a way
that they execute these service
tests upon every check in.
And the same service test suite
can be further used for
other styles of tests.
Here's an example of how we run
chaos test within the same setup.
So this is a chaos scenario
where we want to calculate
the profit and loss based
on the start-of-day price
if the real time API
fails to return the data.
The way to do it is by hijacking
one of the standard HTTP headers
and tagging it with some unique value,
in this case, "price returns 500".
When the service receives this request,
it doesn't do anything with
these headers, all it has to do
is make sure it passes
all these headers
to the downstream calls,
the mock in this case.
Now the mock is a little
bit more intelligent,
and when it sees this
special tag, it will return
a 500 status code instead of
returning the real result.
And that will make the
service fall back
into this chaos mode,
in this case, using the start-of-day price.
So in this way, we are able to test
the resiliency of the system in isolation.
And this same setup can be used
for other types of tests.
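The header-hijacking trick above can be sketched in a few lines; the header name, tag value, and function names are all illustrative assumptions, not the actual implementation:

```javascript
// Hedged sketch of the chaos test: the test client tags a standard HTTP
// header with a unique value; the service passes headers through to
// downstream calls untouched; the mock recognizes the tag and returns a
// 500 instead of the real snapshot, forcing the fallback path.
const CHAOS_HEADER = 'x-request-id'; // illustrative choice of header

function chaosAwareMock(snapshots) {
  return (path, headers = {}) => {
    if ((headers[CHAOS_HEADER] || '').includes('price-returns-500')) {
      return { status: 500, body: null }; // simulated downstream failure
    }
    return { status: 200, body: snapshots[path] };
  };
}

// The service falls back to the start-of-day price when real-time
// pricing fails.
function getPrice(mock, symbol, headers, startOfDayPrice) {
  const res = mock(`/prices/${symbol}`, headers);
  return res.status === 200 ? res.body : startOfDayPrice;
}
```

The same mock serves happy-path tests and chaos tests; only the header on the request changes.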
The adoption of
microservices and the Docker ecosystem
completely changed the
way we deliver software.
And to appreciate this,
I will show how we used to deliver
back in the monolithic days
and compare it with how
we deliver it today.
So back in monolithic days we
had weekend only deployments
and a dedicated team to do it.
It was a painful process.
A dev from each team would be on call
in case something goes wrong.
It required extended
downtime and the roll backs
were very, very painful.
And since it was a
weekend only deployment,
there were a lot of changes
going into it.
So if there was a problem,
we had to roll back all the changes.
Today, we have daily anytime deployments,
we have empowered our development teams
to push the code to production.
They are deploying multiple times in a day
and all deployments are zero downtime.
And the rollbacks are much easier now,
you only roll back the
problematic service.
Here's a quick stat on
the deployment history.
We deployed 554 times in the last 60 days,
with a moving average of 10
to 12 deployments a day
and a max reaching 21-22 deployments,
those tall peaks there.
The gaps here you see are
probably the weekends,
so we are not deploying
on weekends anymore.
I will quickly touch upon
patterns for safe delivery.
And I chose these two,
the dark launch and the
feature toggle, for two reasons.
First, if you are on
the journey of migrating
from monolithic to microservices,
these two can be really good assets.
And second, they don't require
any sophisticated setup,
you can do it with a simple setup.
Dark launch is a practice where...
Imagine if you have a V1 code path,
and you want to migrate to V2.
So the first thing is to clone the input
that's going into V1, then execute both
the V1 and V2 code paths in parallel,
executing V2 with the cloned input.
And then you can compare the
results of V2 against the V1
for correctness and performance.
So this will allow you
to gain more confidence in your new code.
Just make sure V2 code
doesn't mutate the state of the system.
Otherwise you'll have issues.
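The dark launch pattern described above can be sketched as a small wrapper; the names (`darkLaunch`, `onMismatch`) are illustrative, and this assumes, as the talk warns, that the V2 path never mutates system state:

```javascript
// Hedged sketch of a dark launch: clone the input, run V1 and V2 in
// parallel, serve the V1 result to the caller, and report any divergence
// between the two for offline analysis.
function darkLaunch(v1, v2, onMismatch) {
  return (input) => {
    const clonedInput = JSON.parse(JSON.stringify(input)); // clone for V2
    const v1Result = v1(input);
    try {
      const v2Result = v2(clonedInput);
      if (JSON.stringify(v1Result) !== JSON.stringify(v2Result)) {
        onMismatch(input, v1Result, v2Result); // log, don't fail the caller
      }
    } catch (err) {
      onMismatch(input, v1Result, err); // V2 errors never reach the caller
    }
    return v1Result; // callers always see the proven V1 path
  };
}
```

Because callers only ever see V1 output, V2 can be observed in production safely until its results match.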
Feature toggles.
So once you have gained enough confidence
about your V2 version of
your code, you still need
to be careful when rolling
it out in the production
so you can limit the blast
radius by using feature toggles.
So in the workflow, create
a switch based on the toggle:
if the toggle is on,
run the V2 code path,
otherwise continue on V1.
So in case something goes wrong,
you can immediately turn off the toggle,
it will improve your mean time
to recovery significantly.
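The toggle pattern is simple enough to sketch directly; the toggle store and names here are assumptions (a real system would typically read toggles from a config service so they can be flipped without a redeploy):

```javascript
// Hedged sketch of a feature toggle guarding the V2 path: if the toggle
// is on, run V2; otherwise stay on V1. Flipping the toggle off is an
// immediate rollback with no redeploy.
function withToggle(toggles, name, v1, v2) {
  return (input) => (toggles[name] ? v2(input) : v1(input));
}

// Usage: the toggle store is illustrative; in practice it might live in a
// config service such as Consul so it can be changed at runtime.
const toggles = { 'use-v2-pricing': true };
const price = withToggle(toggles, 'use-v2-pricing', () => 'v1', () => 'v2');
```

Turning the toggle off limits the blast radius to the one code path, which is exactly the mean-time-to-recovery win described above.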
I want to wrap this up with key takeaways.
Starting with the tech stack:
microservices with polyglot
stacks are truly magical.
They can further amplify your ability
to react to changing markets,
but they require strict
engineering discipline.
Developer experience,
invest into making your
developer experience as quick
and as smooth as possible.
Your developers will love it.
Testing: testing in a microservice
world can be very hard.
Shift left, think about
how to test your services in isolation.
Delivery: apply standard
patterns for safe delivery,
and your focus should be to improve
your mean time to recovery.
Once you've achieved
this level of maturity
in the microservice architecture,
where you have polyglot stacks,
services can be developed
and tested in isolation,
and teams are empowered to
push code to production,
your software development lifecycle
will absolutely run on steroids.
Thank you.
(upbeat music)
