>> Good afternoon, how are we? We
can do better than that. Maybe
get a beer. So how many people
made it over to the CTF? Go
check it out. Look at it. It's
fun. And we've heard a little
bit about the CTF, and you will
talk more about this? Good
stuff. Let's give these guys a
big round of applause.
[ applause ] >> All right. I
sound really loud. Welcome to
our talk. We are here to talk
about some work we've been doing
in the context of our CTF team
and security research, and in
the context of the CGC and
open source. We are here to talk
about angr, and we will get to
that later. First, who are we?
I'm Zardus and this is Fish. We
are from Shellphish. We are a
CTF team. We are playing the CTF
right now on the Defcon floor
there. We are fighting hard. As
you will see, this talk is full
of live demos. We are going to
try some CTF challenges and show
how to approach a challenge with
this framework. You will see
everything melting down
completely and a lot of fun
stuff. The point is, the same
thing is happening on the CTF
floor. We come in badass and
prepared, and then the game
starts. Anyway, we are
Shellphish. And we are also from
UC Santa Barbara, and there we
have an awesome computer
security lab, and that's where
angr was created. The major
contributors are me and Fish,
Andrew Dutcher. Amazing dude.
Nezorg, aka John. He's the
leetest one of us. And Chris,
very creative. And Chris, our
postdoc. This is the angr team.
As for me, I've been coming to
Defcon since Defcon 9, back in
the park. It was awesome then
and it's awesome now. It's been
a lifelong dream to come here
and speak. And now that I'm
here, it's awesome to be in
front of you guys. I'm a student
in Santa Barbara, and I'm
actually there because of CTF.
Shellphish is mostly from Santa
Barbara, and I joined them for a
Defcon qualifier, got pulled in,
and got pulled into the lab.
That was pretty cool. And I'll
let Fish here introduce himself.
>> All right. You guys can hear
me? Awesome. Thanks.
I'm Fish, obviously. This is not
my real name. I don't think you
guys can read my real name if
you are not from China. Anyone
from China who can read it?
Probably not. You guys are
honest. I'm a Ph.D. student from
Santa Barbara. Super famous from
[ inaudible ] work. >> Super
famous. >> And I've been playing
Defcon CTF; this is my 4th year.
But I've been playing CTF for 6
years. I'm a reverser. Yesterday
I just solved my first one.
First in four years. So I'm very
happy. To solve that challenge I
didn't really sleep last night,
so if I talk some nonsense
today... That's a little bit
about me, and I will hand it
back to Yan. >> So this is our
itinerary today. And we might
fail completely. First, why did
we build it? Spoiler alert:
we're releasing it. Then we will
talk about what we designed it
to do. We will talk about the
different parts of our analysis
system. We will talk about
applications of it, show off
solving, or assisting with, some
CTF challenges, and then the
open-source release. >> Any of
you guys playing CTF right now?
Raise your hand. A couple of
you.
Okay. So you don't want to leave
before the talk ends. >> Unless
you are not Shellphish; then you
should leave before we get to
the live example. Okay. Let's
jump into that. Why angr? Why
did we build a binary analysis
platform mostly from scratch
instead of using the ones
available out there right now?
In fact there are tons of them
out there. I went through to see
what our competition is, kind
of. There are enough to fill an
entire slide. When we started
angr two years ago there were
not this many. But even now
everyone is staring at binaries
in IDA. It came out in 2005 or
so.
IDA is kind of the de facto
binary analysis tool that
everyone uses. So, of course,
our long-term goal is to, you
know, replace IDA. Working with
IDA quite a lot, it's sometimes
frustrating. We have moments of
"why is it doing this, why is it
designed that way?" The truth is
designing an analysis tool is
extremely difficult. We are
nowhere near replacing IDA, but
there are things we do that IDA
does not, and the way we do
them, no other software out
there is capable of. We will add
one more name to the list, and
hopefully you will find it
useful. At least we do. So we
are pretty excited. So let's
talk about the fundamentals of
angr. What did we design it to
be like? The idea is we are all
Python users, mostly, in the
lab. Show of hands: who uses
Python as a primary hacking
language? So angr is written in
Python and for Python, working
on binaries. That gives us
flexibility and lets us explore
analyses in a powerful and
unique way. We will show how to
use angr to quickly script
symbolic execution, the finding
of ROP gadgets, and so forth.
And, of course, a core
requirement of any modern
analysis system is that it has
to work on platforms other than
Linux x86. By leveraging an
intermediate representation
called VEX, we support the
64-bit and 32-bit versions of
all major architectures. And
[ inaudible ] we did SPARC. We
spent a couple of days hacking
SPARC support into this and it's
almost there. It's pretty
extensible. This is what a user
using angr might see. You just
import it, open up an example
binary, and then you go into all
of the analyses and all of the
things that angr offers, later
in the demo. angr has several
different components. There's a
binary loader that's general. We
can load PE files. We can't do
much with PE files yet, but we
can load them and start
executing until we hit some
environmental interactions. We
load Linux binaries and so
forth, and we even support
[ inaudible ], so if you have a
dump off of an IoT device, angr
will tell you where you can load
it in memory and start analyzing
it. There are the static
analyses Fish will talk about,
and a symbolic execution engine,
and the symbolic execution
engine is capable of identifying
unsafe situations and reversing
what inputs need to be given to
drive a program down a specific
path. So let's dive in. We will
start with symbolic execution.
Whoa. And we are done.
[ laughter in the room ]
It's been nice talking to you
guys. I'm sure this is one of
many such situations. Awesome.
And now it's no longer full
screen. Awesome. And, okay, we
are almost there. Boom!
[ applause ] Thank you. That was
the first demo. So, we start
with symbolic execution.
It's a subject that has been
around a little while and is
gaining more prominence. I don't
know if you were at yesterday's
talk on symbolic execution with
another analysis system. This is
kind of an analogous thing. What
symbolic execution does is
answer the question, "how do I
trigger a certain path or a
certain condition?" So you might
imagine a binary that does
something when you give it a
certain input, like a crackme
CTF challenge, which we will
look at later. How would you
interact with that? You can just
give it input. You say, "here's
a guess, is it good?" and it
will tell you no. Most likely
you are not going to guess the
flag. Or you can do some sort of
static analysis: open IDA,
looking at the binary and
clicking here and there at
random. You can do it this way,
but it's not going to give you
an answer because it's not
precise enough.
We will talk about static
analysis later on. For now, we
need symbolic execution. We
interpret an application, and as
we interpret it we track
constraints on symbolic
variables. When the required
condition is triggered and we
see a path that we like, we
concretize the input, the
variables, to identify a
possible concrete input. A quick
example of this: if you have
constraints on a symbolic x, you
can do a constraint solve and
come up with a number, 42. In
this case it's super trivial,
but in general it's an
NP-complete problem and it's
kind of a pain in the ass. You
start a constraint solve and it
may never finish. That's one of
the challenges in symbolic
execution. Let's go through an
example program. angr analyzes
binaries, not Python, but Python
is more approachable. Bonus
points if you can catch the
vulnerability in this program,
by the way.
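
The slide with the example program is not captured in this transcript. As a minimal sketch, assuming the program reads a number x, branches on x > 10, and then branches once more, the walkthrough that follows can be mimicked in plain Python. The second threshold (100), the printed strings other than "two", and the helper names `program`, `explore`, and `solve` are assumptions made for illustration. The brute-force `solve` only stands in for a real constraint solver and is not how angr works internally.

```python
# Toy model of symbolic execution: path constraints are plain Python
# predicates, and the "solver" is brute force over a small range.

def program(x):
    """Hypothetical stand-in for the program on the slide."""
    if x > 10:          # threshold taken from the talk
        if x < 100:     # assumed second branch
            return "two"
        return "lots"
    return "one"


def explore():
    """Fork a state at every branch; map each output to its path constraints."""
    unconstrained = []  # x starts out symbolic, with no constraints
    taken = unconstrained + [lambda x: x > 10]
    not_taken = unconstrained + [lambda x: not x > 10]
    # Only the state that took the first branch reaches the second one.
    return {
        "one": not_taken,
        "two": taken + [lambda x: x < 100],
        "lots": taken + [lambda x: not x < 100],
    }


def solve(constraints, domain=range(-1000, 1000)):
    """Concretize: every value in `domain` that satisfies all constraints."""
    return [x for x in domain if all(check(x) for check in constraints)]


paths = explore()
answers = solve(paths["two"])
print(len(paths), "final states")
print("42 is a solution:", 42 in answers, "| 99 is a solution:", 99 in answers)
# Replay one solution concretely, as the talk describes.
print("replaying", answers[0], "gives", program(answers[0]))
```

Each entry in `paths` corresponds to one of the states the speaker counts, and the number of entries grows with every branch, which is the state-explosion problem the talk returns to later.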
First, what we do with symbolic
execution is go line by line and
hit this input. You see it in
blue, and we execute it. That
input is a symbolic variable x.
x is unbounded. It has no known
constraints on it.
Then we continue executing, and
we hit this branch, and what it
does is split into two
possibilities. One of the
possibilities is that x was
greater than ten and that branch
was taken; otherwise it's the
inverse of that, x is not
greater than ten. And so we
continue executing. Now we have
two states. Keep that in mind.
In symbolic execution there are
multiple states, and we will see
why that's a problem. So now we
have two states. And one branch,
one state, is not done yet, and
it splits as well according to
the different possibilities.
Then suppose you want to answer
the question of what it takes to
print "two" in this scenario. In
order to print "two", we have
our constraints, we have the
state whose path made it there,
and we do a constraint solve.
The constraint solver tells you
that you can put in 99 or 42 and
so forth. You give that
dynamically to the program,
launch it, and see the expected
output there. So let's do a demo
of a very simple binary that has
this toy backdoor that we want
to detect with symbolic
execution. This one is on me,
because I stupidly did a git
pull right before doing this,
and so I don't think it works
anymore. So we might have to
switch to Fish's laptop. Can
everyone see that? Awesome.
So we launch it. Okay. We will
switch to Fish's laptop for this
demo. This is what live demos
are all about: Python
exceptions. There you go. So
Fish uses Windows. I know, it's
embarrassing. Bear with us. >>
You know, in this kind of
situation Windows is never
[ inaudible ] [ laughter in the
room ] >> At least it's a Linux
VM. >> angr currently supports
Linux. In the future it will run
on Windows. >> Allegedly. So
this is angr management, angr's
GUI to do symbolic analysis and
static analysis. We will look at
this toy binary we have that's
nice for testing, and explain
what symbolic execution means.
Let's look at it in IDA first.
As I said, of course, everyone
uses IDA. Oh, the source code.
Yeah. We do. Great.
All right. So we will just look
at it. This binary is a binary
that asks... we do have the
source code. All right. So here
it is. >> Is it readable?
>> Awesome. All right. Guys, if
you can't read this, then we are
screwed. So it's a very simple
binary. It has a username and a
password. It takes the username
and password as input. It calls
the authenticate function, and
if that returns one it says
you've been accepted. The
authenticate function has a
backdoor. If you pass this
string compare, then you will
authenticate automatically. So
it's possible to detect this
automatically in angr
management. So here's the GUI.
Over here we have the display of
what paths are currently active
in the analysis. We can run
multiple analyses at the same
time and never run just one. We
can step these paths and we can
look at what's currently present
in their registers. Is there a
way you can scroll somehow? So
this is what's currently on the
stack. Then we can take that
path and step it. Let's execute
until it branches. So here we
have a path that branched. It
branched for a reason, and that
reason is that there's user
input that's a symbolic variable
that can be concretized to
anything. And here we can
actually look at it. You can
look at the user input and we
see that... Fish, I can't use
your mouse. Oh, I'm touching...
but you can see that on one hand
the user can input any password
and it does one thing, and if
the user inputs the sneaky one
it does another. If we look at
standard output instead and we
keep stepping, there, you'll see
that when the user inputs
"SOSNEAKY" it immediately trusts
him and lets him in. So this is
an example of how symbolic
execution can help us analyze
binaries, and we will go into
more complex ones for CTF
challenges. So let's... there.
Oh, come on. Great. That was my
temporary Defcon password. It's
gone! Great. Yeah. Let's keep
going on yours. People tell me
not to use Linux for
presentations but I don't
believe them. I think it's just
fine. But it's dark magic.
So, oh well, it's you. So, on
with that: static analyses. >> I
just figured out Yan has taken
too much of my time, so I will
keep it simple and talk just a
little bit about it. If you are
interested, you should come to
my lab and become a Ph.D.
student. So let's start. Before
we know a binary, we all need to
know its control flow. In IDA,
the first thing you see is a
pop-up box. You click okay, and
you will see the control flow
graph. We do the same. In angr
management, that's our GUI, we
will show you the graph of every
single function, very similar to
IDA. What's the difference? Ours
is more accurate and more
adjustable, and as a result it's
much slower.
That is because we support
multiple options, like context
sensitivity levels, and
techniques like backward
slicing, et cetera, to
automatically resolve some stuff
that's hard to resolve normally
or statically, for example jump
targets or virtual pointer
tables. In comparison, IDA's CFG
is faster. This is how we create
a CFG in angr. First line,
import. Second line, create the
project with the binary name.
Third line, you see ".CFG".
Press enter and it will give you
the CFG. We want to see how many
basic blocks there are. There
are 78. So if you want a faster
CFG and you don't want to buy
IDA, you can check this out.
It's a fast mode of CFG
generation that doesn't do any
symbolic solving. There's also
BoyScout.
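
For readers new to the terms, basic blocks and CFG edges can be recovered from straight-line code with the classic "leader" algorithm. The sketch below does this for a made-up seven-instruction program. The instruction format, the mnemonics, and the name `build_cfg` are assumptions for illustration, and this is not angr's implementation. Every jump here has a static target, which real binaries do not guarantee; that gap is what the heavier techniques mentioned above are for.

```python
# Toy instruction stream: (address, mnemonic, jump_target_or_None).
# "jmp" always jumps, "jcc" jumps or falls through, "ret" ends the function.
PROGRAM = [
    (0, "mov", None),
    (1, "cmp", None),
    (2, "jcc", 5),
    (3, "add", None),
    (4, "jmp", 6),
    (5, "sub", None),
    (6, "ret", None),
]


def build_cfg(program):
    """Split `program` into basic blocks and return (blocks, edges)."""
    addrs = [addr for addr, _, _ in program]

    # Leaders start a block: the entry point, every jump target, and every
    # instruction that directly follows a jump or return.
    leaders = {addrs[0]}
    for i, (_, op, target) in enumerate(program):
        if op in ("jmp", "jcc", "ret"):
            if target is not None:
                leaders.add(target)
            if i + 1 < len(program):
                leaders.add(addrs[i + 1])

    # Group instructions into blocks keyed by their leader address.
    blocks = {}
    current = None
    for insn in program:
        if insn[0] in leaders:
            current = insn[0]
            blocks[current] = []
        blocks[current].append(insn)

    # Edges come from the last instruction of each block.
    edges = set()
    starts = sorted(blocks)
    for idx, start in enumerate(starts):
        _, op, target = blocks[start][-1]
        following = starts[idx + 1] if idx + 1 < len(starts) else None
        if op in ("jmp", "jcc"):
            edges.add((start, target))
        if op not in ("jmp", "ret") and following is not None:
            edges.add((start, following))  # fall-through edge
    return blocks, edges


blocks, edges = build_cfg(PROGRAM)
print(len(blocks), "basic blocks")
for src, dst in sorted(edges):
    print(f"block {src} -> block {dst}")
```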
All right. Another static
analysis routine in angr is
value set analysis. This is a
kind of abstract interpretation.
In case some people haven't
heard of that, it's a kind of
static analysis that executes
part of the program. Say there's
a loop: it will go around the
loop maybe three times, figure
out the semantics of the
program, and only execute that
part of the program. That gives
us the possibility of
enumerating the state space,
because we are not executing all
of the program.
We are exhausting the state
space. On top of that we can
have variable recovery. And on
top of that we can build memory
and type inference. Credit goes
to the author of this paper. I
tried so hard to read your name.
He's the creator of VSA, value
set analysis. Here's an example
of what value set analysis looks
like. Here's a piece of x86-64
assembly. You have 5 seconds to
read it. Okay. Great. I think if
you know assembly, you will
understand this program. So what
is rbx in the yellow square?
With symbolic execution, it will
just keep executing. The problem
is that at every iteration of
the loop it will branch out.
[ inaudible ] 25 thousand
different states. If you are
using simple static analysis,
rbx can be anything, because we
are not following every single
branch.
With range analysis we can
actually tell rbx is less than
1025. Is that good enough? We
try to do better. In value set
analysis, this is one of the
types of values that the
analysis uses. It's called a
strided interval. A set of
numbers can be described by its
bounds, the stride between each
single value, and their size. So
here the interval can be
concretized, and it means nine
different values, starting at
0x100, each 0x4 apart. That's
what stride means. What is rbx
in the little square? We take
the loop once, and rbx can be
from zero to 4. Second
iteration, it can be from zero
to 8. And next, zero to -- and
after that, the loop is not
terminating. What do we do? Do
we keep looping forever? No. We
widen, and rbx goes to infinity.
After that, zero to infinity is
not accurate enough, so we
perform a narrowing. It becomes
zero to 1024 with that. In this
case it's pretty accurate. We
extended the original value set
analysis with two different
improvements. The first one we
named limited relation analysis.
In this case, normal VSA will be
able to tell that the bound of
rax should be 5. For rcx, they
don't do any relation tracking.
They don't know that.
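
Stepping back to the strided-interval loop for a moment: the sequence described there (zero to 4, zero to 8, widen to infinity, narrow to 1024) can be reproduced with a very small interval class. This is a sketch under stated assumptions, not angr's VSA: the class and method names are hypothetical, the loop is assumed to add 4 to rbx while rbx is below 1024, and the upper bound 0x120 in the first example is derived from "nine values, 0x4 apart, starting at 0x100" rather than read off the slide.

```python
from dataclasses import dataclass

INF = float("inf")


@dataclass(frozen=True)
class StridedInterval:
    stride: int
    lo: int
    hi: float  # an int, or INF after widening

    def values(self):
        """Concretize: list every value the interval stands for."""
        assert self.hi != INF, "cannot enumerate an unbounded interval"
        return list(range(self.lo, int(self.hi) + 1, self.stride))

    def add(self, k):
        """Abstract version of `reg += k`."""
        return StridedInterval(self.stride, self.lo + k, self.hi + k)

    def join(self, other):
        """Smallest interval covering both (same stride assumed)."""
        return StridedInterval(
            self.stride, min(self.lo, other.lo), max(self.hi, other.hi)
        )

    def widen(self, newer):
        """If the upper bound is still growing, jump straight to infinity."""
        hi = self.hi if newer.hi <= self.hi else INF
        return StridedInterval(self.stride, self.lo, hi)

    def narrow(self, bound):
        """Clamp using a bound learned from the loop's exit condition."""
        return StridedInterval(self.stride, self.lo, min(self.hi, bound))


# The slide's example interval: nine values, 0x4 apart, starting at 0x100.
example = StridedInterval(4, 0x100, 0x120)
print(len(example.values()), "values")

# Assumed loop: rbx = 0; while rbx < 1024: rbx += 4
rbx = StridedInterval(4, 0, 0)
history = []
for _ in range(2):                     # analyze two iterations precisely
    rbx = rbx.join(rbx.add(4))
    history.append((rbx.lo, rbx.hi))
rbx = rbx.widen(rbx.join(rbx.add(4)))  # still growing, so give up: [0, inf]
widened_hi = rbx.hi
rbx = rbx.narrow(1024)                 # recover precision from the loop bound
print(history, widened_hi, (rbx.lo, rbx.hi))
```

Widening trades precision for termination, and narrowing wins some of that precision back; the stride is what keeps the result tighter than a plain range, since only multiples of 4 remain.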
We are doing a limited amount of
variable relation tracking, and
in this case we are able to tell
that rcx equals rax plus 1, and
so the bound of rcx is 6. The
other improvement: we made our
VSA signedness-agnostic. We
included another analysis called
wrapped interval analysis. The
credit goes to this guy,
published in 2012. With that the
precision is greatly improved.
And now I'll give it back to
Yan, and we will talk about
applications and real demos. >>
All that technical talk, or
theoretical talk, is maybe a
little boring, but it's
necessary to get us into the
actual angr applications. Here
we will demo the things that we
do, and you can do, with angr.
First we will demo a ROP gadget
finder. ROPgadget or xrop will
tell you there's a gadget and
its instructions. This one tells
you what the gadget does, and
you can filter it down later.
And, of course, it's implemented
in angr and it's super easy to
use. So here's the example. We
load a CTF binary called
nuclear. It's not from this
Defcon CTF but from a different
CTF in the past, and we analyze
it. We want all the ROP gadgets:
find them and print them out. So
let's do that.
Because it does semantic
analysis, it's a little bit
slow. It takes 20 seconds, maybe
a little bit more, for this guy.
Right now angr is analyzing
every basic block and figuring
out its semantics: what
registers it touches, how much
it changes the stack by, and
where it writes to in relation
to the various registers that it
uses. So here's an example
gadget. It's a gadget at 0x4040c
in the binary. It changes the
stack by 0x14, it pops rbx and
rbp, and it does a memory write
to this address. This is
actually on the stack, so it
does a memory write onto the
stack. And it does a memory read
from an address that depends on
rbp. This gives a lot of
information. In fact, our next
step is to implement a ROP
compiler based on this. We were
hoping to have this, but it's
not quite ready. Stay tuned.
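
To show why a semantic description is more useful than a list of instructions, here is a small sketch of gadget records that can be filtered by effect. The `Gadget` fields and the `select` helper are hypothetical and are not angr's data model. The first record mirrors the gadget described in the talk, with the address as transcribed; the other two are invented so the filter has something to choose from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gadget:
    addr: int
    stack_change: int
    popped_regs: frozenset = frozenset()
    mem_writes: int = 0
    mem_reads: int = 0
    read_depends_on: frozenset = frozenset()


GADGETS = [
    # As described in the talk: pops rbx and rbp, writes to the stack,
    # and reads from an address that depends on rbp.
    Gadget(0x4040C, 0x14, frozenset({"rbx", "rbp"}),
           mem_writes=1, mem_reads=1, read_depends_on=frozenset({"rbp"})),
    # Invented entries, for illustration only.
    Gadget(0x401000, 0x10, frozenset({"rdi"})),
    Gadget(0x401020, 0x10, frozenset({"rbx"})),
]


def select(gadgets, pops, allow_memory=False):
    """Gadgets that pop every register in `pops`, optionally side-effect free."""
    return [
        g for g in gadgets
        if set(pops) <= g.popped_regs
        and (allow_memory or (g.mem_writes == 0 and g.mem_reads == 0))
    ]


clean = select(GADGETS, {"rbx"})
any_rbx = select(GADGETS, {"rbx"}, allow_memory=True)
print([hex(g.addr) for g in clean])
print([hex(g.addr) for g in any_rbx])
```

A ROP compiler of the kind mentioned above would presumably chain queries like `select` to reach a desired register state, which is only possible once each gadget's effects are known.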
Another thing that I'll demo is
how to solve a crypto challenge
in angr. This is more of a
crackme, but it's a cool little
demo of angr's abilities. The
challenge is from a WhiteHat
CTF. It's a CTF that happened
last month, and I was looking at
the challenges later to find
some crypto, and found this. I
figured it would be a good
example for you guys. This
challenge takes input on the
command line and, in standard
crackme fashion, tells you if
you are right or wrong. In
general, when we try to guess,
we are wrong. Let's open it up
in IDA. We start looking around,
and the binary is really big.
So, of course, we can start
drilling down into parts of the
binary, figuring out what it
might do.
We can decompile it and try to
figure it out. One of the first
things we see immediately is
that it does something, and if
this returns zero, it says
please check again. All right.
Let's look at it. It does some
complicated stuff. But there's
an equals-equals eight. This is
part of the process for solving
the challenge. So I quit out of
IDA and went to angr. I wrote
a little... whoa, a little bit
of code. It's heavily commented
in our examples repository. We
open the binary. angr's symbolic
execution has trouble with
certain kinds of code,
especially in static binaries,
so I hook some of it with Python
replacements to help it along.
And then basically I ran it. I
created a path group, which is
the single entry point to the
symbolic execution engine, and I
told it to go and find this
point. This is the point where
it passes this stage and says
the input is okay. After that it
does some more processing, and
that is a pain in the butt to
get through. So let's look at
where we are at this point. It
turns out that at that point the
possible key space, the keys
that make it to that point, is
much reduced, and after that it
is brute-forceable. So here I
get the possible values from
angr. And, of course, how to do
this is all in the docs, or look
at this example. Then, with a
very fancy progress bar, I test
every possible value until I get
the right one, allowing me to
solve this crypto challenge. So
let's see how this goes. Run it
here. Here angr is stepping
through the binary. It's at this
point where the input is tested,
I guess. And now it's just
trying to reduce the set of
possibilities, which went from 8
bytes down to 6 thousand or so
possibilities. This is an
example guess. It's iterating
through the possible keys that
can even make it through this
point, trying to find one that
says success. It should find it
at the 80% mark. I'm surprised
that it hasn't crashed yet.
Boom. We found it. The flag is
this. Actually, this is the
input. If you run the binary,
boom. So angr is very useful for
these sorts of challenges.
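
The two-stage approach in that demo (let the solver shrink the key space at an early checkpoint, then brute-force whatever survives against the hard part) can be sketched without any binary at all. Everything below is a toy: the 4-byte key, the early constraints, and the SHA-256 comparison are assumptions standing in for the real challenge, and `candidates` simply constructs the keys that a solver would have enumerated.

```python
import hashlib
from string import ascii_lowercase

# Toy target. In a real crackme the answer is of course not in the script.
SECRET_DIGEST = hashlib.sha256(b"fiji").hexdigest()


def expensive_check(key: bytes) -> bool:
    """Stand-in for the stage that is painful to execute symbolically."""
    return hashlib.sha256(key).hexdigest() == SECRET_DIGEST


def candidates():
    """Keys that pass the cheap early checks.

    Assumed constraints: 4 lowercase bytes, key[0] == 'f', key[1] == key[3].
    In the talk this list comes from the constraint solver instead.
    """
    letters = ascii_lowercase.encode()
    for a in letters:
        for b in letters:
            yield bytes([ord("f"), a, b, a])


def brute_force(keys):
    """Try each surviving key against the expensive stage."""
    tried = 0
    for key in keys:
        tried += 1
        if expensive_check(key):
            return key, tried
    return None, tried


found, tried = brute_force(candidates())
print(found, "found after", tried, "of", 26 * 26, "candidates")
```

The full toy key space is 26**4 keys, while only 26**2 survive the early constraints; the demo's drop from 8 unconstrained bytes to about 6 thousand candidates is the same effect at a much larger scale.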
I'll pass it on to Fish to look
at a real-world case, the CTF
that's happening now, and how to
use angr for that. >> So one of
angr's abilities is to load up a
binary and execute an arbitrary
part of the code in it. I had
some demos for it that I
prepared before Defcon, but
yesterday when I was playing the
Defcon CTF there was another
challenge. This is a good one to
talk about for angr. >> Cover
your ears, please. >> rxc is a
64-bit binary. It's big, and
reversing it is hard. We spent a
long time reversing it. Before
that, we got some suspected ROP
chains. What does a ROP chain
do? I mean, we can definitely
hire a bunch of monkeys to
figure it out, but we have angr.
>> The monkeys we hire would be
ourselves. >> So this is our ROP
chain execution program, called
derop. You pass it the ROP
chain. It loads the binary rxc,
creates a state, dumps the chain
onto the stack, and then
executes it. I use explore,
execute, and return. Let's run
this program. Python, rop chain
2 dot py. We run the first ROP
chain. Bummer. It's called r.
>> Very descriptive variable
name. >> There's an
unconstrained path. >> Of
course, this is all -- >>
[ inaudible ] >> This is
technical. You can read the
documentation to see what's
going on. >> This is the exact
path that the ROP chain is
following. And now, of course,
you have the ability to read
every single state at every
point in the chain. The next
example is for the same binary.
In this binary there's a really
interesting function. It does
some encryption. And later on we
figured out it's TEA. We didn't
want to reimplement it when we
were writing exploits --
(inaudible) -- what do we do
about it? Luckily we have
Python. Great.
So there's another program I
wrote. It's small, called
[ inaudible ]. What it does is
take in some data and encrypt it
with the exact program, the
exact function in that rxc
program, the exact encryption
function. It has 30-something
lines of code, and then you
don't need to understand the
encryption function anymore. You
start Python and it
automatically does the
encryption for you. Let's try
it. Python, [ inaudible ] dot
py, 8 bytes. And then you get
the encrypted data, and it all
works. >> Whoa.
There's also binary diffing, but
in the interest of time you can
check this out on your own, and
we will briefly talk about the
CGC. You know, it's the Cyber
Grand Challenge. That's the
machine, one of the machines
that will be running the finals,
where machines will battle each
other for hacking supremacy next
Defcon. Shellphish accepted this
challenge and we managed to
qualify. There has been a lot of
presenting. Go back. This is a
very clever set of slides.
Shellphish participated in this
challenge, and we qualified,
going from just another CTF team
to one of the richest CTF teams
in the world, along with the
others who qualified. For the
CGC we built a cyber reasoning
system that exploits binaries
and patches them. It is complex,
and angr actually sits at the
core of every component, which
is pretty cool. So check out the
system. It's a real-world system
with real-world uses and we love
it. And it's open source.
Special thanks to our
professors, and to DARPA, with
the two different projects angr
was developed for. And, of
course, to all of the
contributors to angr that we've
gone over. You can pull it from
GitHub, or go to angr.io.
Subscribe to our mailing list;
we welcome questions. We are
hoping to make this the next
generation binary analysis tool,
and we hope to work with you to
do it. angr is two years old
now, with almost 60 thousand
lines of code and about 6
thousand commits, and we would
love all of you to work on it
with us. Any questions? I guess
no questions. [ applause ]
Thanks. >> Thank you guys.
