[MUSIC]
Corissa Koopmans: Welcome to the Azure AD Architecture
Deep Dive Series. I am Corissa Koopmans and I am a
Program Manager on the Azure AD Engineering team at
Microsoft.
Ramiro Calderon: Hello. My name’s Ramiro Calderon. I’m
also a Program Manager on the Azure Active Directory
Engineering team.
Corissa: We are part of the Customer Experience Program
and we help enterprises and businesses from all over the
world to deploy our services and get to the cloud.
We get a lot of questions about how Azure AD works
under the hood, which we will share with you throughout
this architecture series.
In this first video, we’re going to cover password hash sync,
or as you may hear us call it PHS. So, Ramiro, my first
question is why does Microsoft encourage its customers to
enable PHS?
Ramiro: Well, there are a couple of important benefits for
our customers when they enable password hash sync.
So, why don’t we walk through the flows and we’ll discuss
the benefits as we go through them. What do you think?
Corissa: Let’s do this.
Ramiro: Alright. So, the first thing to know is that
password hash sync has two scenarios. One is the actual
synchronization of the password hashes, and the second
part is how to use those hashes to authenticate. You can
only turn on synchronization, and in fact that’s what a lot
of our customers do when they are federated; they just
do that at the beginning and then they feel comfortable
about it and eventually migrate. So, let’s talk about the
synchronization today.
The box here is the on-premises machine running
Azure AD Connect. And one thing that is worth calling out here
is that Azure AD Connect is an umbrella of components that
are installed. And one of them happens to be the synchronization
engine. So, let’s go through the steps.
The first thing the sync engine does is send a replication
request using the same Active Directory replication protocol
to the domain controller.
Corissa: So, a sync engine is like a domain controller.
Shouldn’t we protect it like a DC then?
Ramiro: Exactly. We ask customers to treat
this infrastructure the same way as a domain controller.
Now, for step two, the domain controller responds.
The replication payload with the changes, including
the password hashes, comes back in the original Active Directory
encryption. So, remember, this is the same protocol that
domain controllers use to talk to each other. So, the payload
contains the password hashes in MD4 format.
Corissa: Ramiro, let’s unpack a little bit what a hash is.
Ramiro: Sure. Hashing is a technique that scrambles
a piece of data and generates another value of a fixed
size. The idea is that hashing is one way; that means that
the output cannot be used to derive the original value.
And this is useful for passwords because the system stores
a hash instead of a clear text password. When a user
tries to authenticate, the computer applies the hash function
to the password as the user types it and compares the result
with the stored hash, so it’s very cool.
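
The store-and-compare idea described above can be sketched in a few lines of Python. This is only an illustration of the concept: SHA-256 is chosen here because it ships with the standard library, not because it is what Active Directory uses, and the function names are hypothetical. A single unsalted hash like this would not be adequate for real password storage.

```python
import hashlib
import hmac


def hash_password(password: str) -> bytes:
    """Return a fixed-size digest of the password (illustrative only)."""
    return hashlib.sha256(password.encode("utf-8")).digest()


def verify(candidate: str, stored_hash: bytes) -> bool:
    """Hash what the user typed and compare it with the stored hash."""
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(hash_password(candidate), stored_hash)


# The system keeps only the hash, never the clear text password.
stored = hash_password("correct horse battery staple")
print(len(stored))                                     # digest size in bytes
print(verify("correct horse battery staple", stored))  # matching password
print(verify("wrong guess", stored))                   # non-matching password
```
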
Now, let’s look at step three. AD stores a password hash using
MD4 and that’s a hashing function that has been around for
a long time. The AD Connect agent rehashes this value
with modern crypto. First, we take the 16-byte
value and we expand it, so it’s 64 bytes. Then the
agent creates a random 10 byte salt for each AD hash
and appends it to the original value. Salts make each value
unique even if two users have the exact same password
and that helps defend against attacks such as
rainbow tables. Finally, we hash the resulting value
using a more robust hashing function that’s called
Password-Based Key Derivation Function 2, also known as PBKDF2,
and we use 1,000 iterations of HMAC-SHA256. This
whole process gives us a 32-byte value.
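
Step three can be approximated with Python's standard library. Treat this as a sketch under stated assumptions rather than the agent's actual code: the video does not say how the 16-byte value becomes 64 bytes, so the expansion below (hex string, then UTF-16 encoding) follows Microsoft's public description of the feature, with the byte order and hex letter case assumed. Passing the salt as the PBKDF2 salt parameter is also an assumption, and the function names are hypothetical.

```python
import hashlib
import os
from typing import Optional, Tuple

ITERATIONS = 1000   # iteration count stated in the video
SALT_BYTES = 10     # per-user salt length stated in the video


def expand_ad_hash(ad_hash: bytes) -> bytes:
    """Expand the 16-byte AD hash to 64 bytes (assumed encoding details)."""
    if len(ad_hash) != 16:
        raise ValueError("expected a 16-byte hash")
    # 16 bytes -> 32 hex characters -> 64 bytes as UTF-16 (2 bytes per char).
    return ad_hash.hex().encode("utf-16-le")


def rehash_ad_hash(ad_hash: bytes,
                   salt: Optional[bytes] = None,
                   iterations: int = ITERATIONS) -> Tuple[bytes, bytes]:
    """Return (derived 32-byte value, salt) for one AD password hash."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)  # fresh random salt per hash
    derived = hashlib.pbkdf2_hmac(
        "sha256", expand_ad_hash(ad_hash), salt, iterations, dklen=32
    )
    return derived, salt


# Placeholder input; a real AD hash would come from the replication payload.
derived, salt = rehash_ad_hash(bytes(16))
print(len(derived), len(salt))
```

Because the salt is random per hash, two users with the same password end up with different derived values, which is the property the salt is there to provide.
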
Step four, the Azure AD Connect engine sends data to the
cloud. One thing that’s worth noting here is that when we
say Azure AD, there are a lot of back end components that
enable different scenarios. Here, we happen to have this
blue box that’s the back end service that talks to Azure AD
Connect. Now, the sync engine concatenates the 32-byte value
that we just saw before, the per user salt, and the number
of iterations, and sends that new value over TLS to the back
end service. Then, the Azure AD Connect back end saves
the value it received in the core store over here in
step number 5. The core store is designed to encrypt at
rest the specific attribute that holds the credential and
on top of that, the machines here around this core store
component are also protected with disk-level encryption
using BitLocker.
Corissa: So, the sync happens every 30 minutes by default.
Does this mean that our customers have to wait 30 minutes
for a password update?
Ramiro: That’s a good clarification, Corissa. The entire
password hash sync flow runs every two minutes, separately
from the 30-minute directory sync cycle.
Corissa: How does password hash sync help us detect threats?
Ramiro: Well, once a customer’s enabled PHS, we can detect
leaked credentials, and here’s how it works. When bad actors
get a hold of legitimate username and password combinations, they
usually share those credentials on the dark web,
Pastebin, or similar sites. Microsoft monitors
multiple sources to get a hold of those credentials. And
this includes partnering with researchers, law enforcement,
and other teams within Microsoft. So, when we get a feed
with compromised credentials, the system checks whether there
are matches among our customers’ users by applying this same
hashing that we discussed in step three to the raw data.
Unfortunately, we as humans tend to reuse the same
passwords across multiple sites, so it is very likely that
compromised passwords are also the passwords that are
used in Azure AD. And if there are any matches, this means
that the user’s credentials are compromised. Azure AD generates
a risk detection in the customer’s tenant, so administrators can
take corrective action. Moreover, customers using Azure
AD Identity Protection can go further and automate the
response by creating policies that require users to
reset their passwords right away.
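
The matching idea Ramiro describes, applying the step-three hashing to leaked raw credentials and comparing against what is stored, might look roughly like the Python sketch below. It is not Microsoft's detection pipeline. All names are hypothetical, the in-memory `directory` dictionary stands in for the core store, and `stand_in_ad_hash` is a placeholder for the AD-style hash of the password (MD4 is not reliably available in Python's hashlib, so a truncated SHA-256 is used purely for illustration). The 64-byte expansion details are assumed as well.

```python
import hashlib
import hmac
from typing import Dict, Iterable, List, Tuple

# username -> (stored 32-byte value, per-user salt, iteration count)
Directory = Dict[str, Tuple[bytes, bytes, int]]


def stand_in_ad_hash(password: str) -> bytes:
    """Placeholder for the 16-byte AD password hash (NOT real MD4)."""
    return hashlib.sha256(password.encode("utf-16-le")).digest()[:16]


def rehash(ad_hash: bytes, salt: bytes, iterations: int) -> bytes:
    """Re-apply the salted PBKDF2 step to a 16-byte hash (assumed details)."""
    expanded = ad_hash.hex().encode("utf-16-le")  # 16 bytes -> 64 bytes
    return hashlib.pbkdf2_hmac("sha256", expanded, salt, iterations, dklen=32)


def find_compromised(leaked: Iterable[Tuple[str, str]],
                     directory: Directory) -> List[str]:
    """Return usernames whose leaked password matches the stored value."""
    hits = []
    for username, password in leaked:
        record = directory.get(username)
        if record is None:
            continue  # leaked account is not one of our users
        stored, salt, iterations = record
        candidate = rehash(stand_in_ad_hash(password), salt, iterations)
        if hmac.compare_digest(candidate, stored):
            hits.append(username)
    return hits


# Build a one-user directory, then check a hypothetical leaked feed.
salt = bytes(10)
directory = {
    "alice": (rehash(stand_in_ad_hash("Summer2024!"), salt, 1000), salt, 1000)
}
feed = [("alice", "Summer2024!"), ("bob", "hunter2"), ("alice", "wrong")]
print(find_compromised(feed, directory))
```

The stored salt and iteration count are what make this comparison possible: without them, the service could not reproduce the same derived value from a leaked clear text password.
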
Corissa: Got it. So, this is what I’m taking away for my
customers. Password hash sync can be thought of in two
ways, as a synchronization method and as a form of
authentication. Customers can synchronize first and then
decide later that they want to authenticate. This feature is
great for all Active Directory customers regardless of the
type of authentication method that they are using,
whether it’s federation, PHS, or PTA. And last, PHS adds
a layer of security because it detects leaked credentials.
Ramiro: You got it.
Corissa: We hope you found this video useful. We’ll be
adding videos on different topics like authentication,
provisioning, governance, and many more. If you want
to get a copy of the diagrams we use today or want to
give us feedback and help us figure out what to present
in the future, please follow the link on the screen.
[MUSIC]
