
English: 
So cryptography is the idea of
encrypting a message so that
although everyone knows the message has been sent
they can't actually find out what it means.
Whereas in steganography we're
trying to hide the fact that we've sent a message at all.
So a classic example would be if I was
writing you a letter and then I wrote in
invisible ink a whole different letter
between the lines or on the other side
or something like that... And only you knew that that was going to be there.
So you get home and everyone else maybe looks at the letter
and thinks "it's not very interesting at all".
And then of course you can uncover the secret message.
Today we'll talk a bit about "digital image steganography"
because obviously there's a huge scope for hiding things in digital images:
Images can be megabytes or more
and you can hide files of megabytes or more in them.
But of course as the amount of steganography in images increases,
so do the attempts to try and find it.
So there are lots of statistical approaches to try to find these things as well.
Perhaps the simplest form of steganography in an image is

"least significant bit steganography"
So if we've got a bitmap of any kind (a PNG or BMP)
then we can change the lowest bits to be our message
and you'll have an almost imperceptible change
on the actual way the image looks.
It's a bit like if I change the number 800,351
if I changed the 1 or the 51 on that
it's not gonna have a massive effect...
- That's exactly right, the number's so big that
in the grand scheme of things it makes no difference.
So generally speaking we'll change (in an image) in every single byte
we'll change the last bit or maybe the last two bits
if we really try to cram in a lot of data.
Every byte is eight bits, we take the last two and change that to our message
in the hope no one's going to notice.
So, for every byte (that's every 8-bits) six of them are the regular image
and two of them are our secret message,
so a quarter of our image is now secret message.
So if we have a normal pixel, it's going to be 4 bytes long

(so that's one byte) so for each byte
we're talking about the last two bits in that byte.
So that could be a 1; we can
change it to 1, change it to 0 or leave them both the same.
And what we do is we read
off our message
so let's say our message we're trying to
encrypt is 10 11 01
Okay? We get to the first byte and we say:
well this is great,
our last two bits are already 1 and 0 so we don't need to change anything at all
so that byte stays as it is.
So we go to the next byte, so this will be maybe red
and this might be green in our pixel.
Okay? The last two bits of this byte are 0 and 1,
the two we're trying to put in from our message are 1 and 1,
so we change this one to a 1.
So by changing that second least
significant bit from 0 to 1
we've just increased this value by two
and we're talking about one channel
in a huge image - changing by two levels is probably not going to be too noticeable.
If we start changing the most significant bits then that might be a problem.
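
The walk-through above can be sketched in a few lines of Python. This is only an illustration of the idea, not the program used in the video: the function names and the three sample byte values are made up, and the message is passed as a string of bits.

```python
def embed_two_bits(carrier: bytes, message_bits: str) -> bytes:
    """Write message_bits, two at a time, into the two lowest bits of each byte."""
    if len(message_bits) % 2:
        raise ValueError("message_bits must have an even length")
    if len(message_bits) // 2 > len(carrier):
        raise ValueError("message does not fit in the carrier")
    out = bytearray(carrier)
    for i in range(0, len(message_bits), 2):
        pair = int(message_bits[i:i + 2], 2)            # value 0..3
        out[i // 2] = (out[i // 2] & 0b11111100) | pair  # keep top six bits
    return bytes(out)


def extract_two_bits(carrier: bytes, n_bits: int) -> str:
    """Read back n_bits by taking the two lowest bits of each byte in order."""
    return "".join(format(b & 0b11, "02b") for b in carrier[: n_bits // 2])


# Hypothetical channel values: the first ends in 10, the second in 01.
pixels = bytes([0b10110110, 0b01100101, 0b11001000])
stego = embed_two_bits(pixels, "101101")

print(list(pixels))                 # original channel values
print(list(stego))                  # first byte unchanged, second goes up by two
print(extract_two_bits(stego, 6))   # the message bits read back out
```

The first byte already ends in 10, so it is left alone; the second ends in 01 and has to become 11, which raises its value by two, as described above.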

All right, so I've written a program to do this
and I've tried to hide a rather large file
inside another rather large image
OK, so this is a nice picture of a tree
It's about 3 (and a bit) megapixels in
size.
So this is the original image of our tree
and that is the steganographic image.
First one, into the second one.
- It's not changing!
- It is changing.
When you only change the two least significant bits
of an 8 bits per channel image
you're not going to see a huge amount of detail.
If you actually do a subtraction on the images, you can see a difference
but in general it's going to be pretty imperceptible.
The really good thing to do would be to never release the original source image.
I can tell that something has changed because I've got the original and
the new steganographic image with me.
But if I just sent out an image of my dog
and I never sent out the original that the camera took
no one's going to know that it's been
imperceptibly changed
because they haven't got a reference.
If you take a public domain image and change it,
it's gonna be easy to look for the original source.
- [...]
- Exactly
The other thing is, it'll work better on photographs where
there's a lot of variation (in the intensity levels anyway).

So this steganographic image
has the entire works of Shakespeare buried in it
which comes to (when it's zipped up) about...
1.5 MB, something like that.
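
As a rough check that a file of that size fits, here is the capacity arithmetic as a small Python sketch. The assumptions are mine, not from the video: exactly 3,000,000 pixels, three colour channels per pixel, and two hidden bits per channel.

```python
def lsb_capacity_bytes(pixels: int, channels: int = 3, bits_per_channel: int = 2) -> int:
    """Number of whole payload bytes that fit when hiding bits_per_channel in every channel."""
    return pixels * channels * bits_per_channel // 8


capacity = lsb_capacity_bytes(3_000_000)   # assumed image size
print(capacity)                            # 2250000 bytes, about 2.25 MB
```

Under those assumptions there is room for about 2.25 MB, so a 1.5 MB zip file fits with space to spare.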
This kind of simple steganography can be detected.
This image here is an image that I've created
by taking only the last two bits of each channel.
I've scrapped all the other information.
If a pixel has a value of 0, it's black,
if it has a value of 3 it's white, and then it ranges in-between.
And you can see that it's a tree there, so you can see even in the last two bits that there's a tree
and the sky is particularly bland.
So if you look instead at the steganographic image
I've applied the same filter to that
and you can see that the amount of noise has increased massively
because that noise is all hidden in those 2 least significant bits.
So you can see if you compare the bits
from one image to the other, you can see a difference and so
hiding a message in the least significant bits is fairly obvious
particularly if you have the original for comparison.
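
The "last two bits only" view described above can be sketched like this. It is a minimal illustration working on a flat list of channel values; the function name and sample values are hypothetical, and a real tool would apply the same mapping to every channel of every pixel.

```python
def low_bits_view(channel_values, bits=2):
    """Keep only the lowest `bits` bits of each value and stretch them to 0..255."""
    mask = (1 << bits) - 1                       # 0b11 for two bits
    return [(v & mask) * 255 // mask for v in channel_values]


# Hypothetical channel values whose last two bits are 00, 01, 10 and 11.
sample = [0b10110100, 0b01100101, 0b11001010, 0b00000111]
view = low_bits_view(sample)
print(view)   # 0 is black, 255 is white, the other two values sit in between
```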
So this is the difference between those two images
and I've massively scaled up the difference
I mean it looks very gray.

these white pixels and black pixels are
values of plus or minus 3 intensity changes.
So we're still talking very
small differences over the image
and it's very evenly distributed
it's all sort of spread noisily throughout the image
- Yes so you can't actually tell that there's a tree there now.
- No, you can't tell there's a tree
there. Which could be a clue!
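
A scaled-up difference image like the one discussed above could be produced along these lines. This is a sketch under my own assumptions: values are compared one channel at a time, changes are clamped to plus or minus 3, and "no change" is mapped to mid-grey.

```python
def scaled_difference(original, stego, max_change=3):
    """Map each per-value change in [-max_change, +max_change] onto 0..255."""
    out = []
    for a, b in zip(original, stego):
        d = max(-max_change, min(max_change, b - a))       # clamp the change
        out.append((d + max_change) * 255 // (2 * max_change))
    return out


# Hypothetical values: one drops by 3, one is unchanged, one rises by 3.
diff = scaled_difference([100, 100, 100], [97, 100, 103])
print(diff)   # black, mid-grey, white
```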
Perhaps a more sophisticated method of hiding something in an image would be to hide it
inside the Discrete Cosine Transform coefficients of the JPEG.
So we talked a little bit about the DCT
and how we convert an image into a series of cosine waves.
And we have coefficients saying
how much of each of those waves we have.
If you change those coefficients instead of changing the raw pixel values,
you have a much less predictable effect on the image:
if you change the value of one of the large alternating current (AC) coefficients
from 202 to 201
you're gonna have a very imperceptible difference and it's going to happen over
that entire 8x8 block
so you're not going to be able to see
the clear sort of steganographic noise
that we just looked at on that tree.
A common algorithm that we see in use is called JSteg.

So I see what you did there.
And what JSteg does is it goes in and if it can it will cram
DCT coefficients full of as much data as it can.
And what it does is: any coefficients that aren't 0 or 1 (because changing those might be a little obvious),
so usually the low frequency ones, might change up or down
and you can see again that the difference is almost imperceptible.
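
The coefficient-level idea described above can be sketched as follows. This is a simplified illustration in the spirit of JSteg, not the tool itself: it assumes the quantized DCT coefficients are already available as a plain list of integers (a real tool reads and writes them inside the JPEG file), and the function names and sample values are hypothetical.

```python
def jsteg_like_embed(coeffs, bits):
    """Replace the lowest bit of each usable coefficient, skipping values 0 and 1."""
    out, queue = [], list(bits)
    for c in coeffs:
        if c not in (0, 1) and queue:
            c = (c & ~1) | queue.pop(0)   # also works for negative integers in Python
        out.append(c)
    if queue:
        raise ValueError("not enough usable coefficients for the message")
    return out


def jsteg_like_extract(coeffs, n_bits):
    """Read the lowest bit of each usable coefficient, in order."""
    return [c & 1 for c in coeffs if c not in (0, 1)][:n_bits]


# Hypothetical quantized coefficients from one block, mostly zeros.
block = [202, 0, -3, 1, 5, 0, 0, -1, 2, 0]
hidden = jsteg_like_embed(block, [1, 0, 0, 1, 1])
print(hidden)                          # each changed value moves by at most one
print(jsteg_like_extract(hidden, 5))   # the five message bits
```

Because values are only ever swapped within pairs such as 2 and 3, a usable coefficient never turns into 0 or 1, so the reader skips exactly the same positions as the writer.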
So here's a picture of a panda and what I've done here:
I couldn't cram in as much information as before so it's just much better than this one.
So there is the original and a
steganographic image.
And I've looked at these and I've [...] a bit of difference
and you can see that again
it's very, very, very slight so these
pixels again have only changed by
plus or minus 3 maybe one maybe two.
- So that's just zoomed in on the...
- That's zoomed in on the difference right there
so you can see that, yes, the pictures have changed, but have not changed by a lot.
And the other crucial thing about hiding your message in the DCT coefficients:

the JPEG has already completely messed up the least significant bits of the image.
So if you do an image like I did where
we're looking at just the bits,
we will no longer be able to see a tree, we'll just be able to see very general JPEG noise
and that will be exactly the same in our steganographic image,
so you can't do what they call a visual attack
by looking and seeing if there is a
steganographic message hidden inside,
because there's no real change.
So this is the original and I'm only
showing here the two least significant bits.
And you can see that they form into
little blocks
[...] blocks is the 8x8 DCT blocks.
And this is the steganographic data so
you can see that the blocks have changed,
but the distribution of noise
throughout the image hasn't changed at all,
so it's very difficult to see 
there's a message buried in there
And if the message took up only a certain amount of the image
it's hard to see where in this image the
message is.
You could be trying to read off every DCT coefficient
when in fact only some of them have a message in.
- If you were sending this to someone as a message...
how would they get it out?

- OK, so in general you would also encrypt the message
because, you know, you better be safe than sorry, why not use encryption.
So we encrypt our message, we put it into DCT coefficients
or in the least significant bits, and
then we send it off to someone.
Now, they're going to have to have known the process that we used because if they don't,
they're gonna be looking in the wrong
place, so they'll know
that we used JSteg or F5 or one of the other DCT steganography tools and
they'll basically run the program,
they'll type in their decryption key
which will actually remove the
encryption and then out comes the message.
When JSteg was invented, it was robust to a visual attack
so you couldn't look at it and go: "well that clearly has been altered".
So they had to try and come up with (researchers had to try and come up with) some other way of
detecting that an image has had a JSteg message buried in it
and what in fact happens is that
the coefficients change ever so slightly.

Because we're applying quantization to our DCT coefficients, most of them will be set to zero.
OK? And JSteg won't put anything in there, because it's too obvious;
it'll only put them in a few at the top corner that are big,
and you'll find that there's a subtle imbalance produced in where your coefficients are
so you expect most of your coefficients to be 0 and then a fair few of them to be -1 or 1
and -2 and 2 to be very close to
zero.
And in fact you start to get a few 3s and 4s that you weren't expecting
and the distribution of these numbers goes off a little bit and you can start to predict
that the JSteg file has been buried inside.
What's more is that this happens in each 8x8 block, so you can actually do this test
on every block and find out which blocks have messages in them, which blocks don't.
And you might find for example that the first 60% of the file has a message in
and then abruptly stops and that's a blatant clue
that we've got something that isn't taking up the whole image.
It has just been written sequentially into the file.

So if we take the frequency of the
number of occurrences of each
DCT coefficient - so nought (0) is
going to be the most common,
then maybe -1 and 1, and we plot those in a graph with
frequency on the Y-axis and the DCT coefficient on the X-axis
we get what's called a histogram and that's simply a plot
of the frequency of
occurrences of various things.
So you can do a histogram on an image, but you can also do a histogram on these
DCT coefficients and find out whether
they've been imperceptibly changed.
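
Counting how often each coefficient value occurs takes only a few lines. This sketch again assumes the coefficients are available as a plain list of integers; the sample values are made up, and a real analysis would compare the counts against what an untouched JPEG is expected to look like.

```python
from collections import Counter


def coefficient_histogram(coeffs):
    """Count the occurrences of each coefficient value."""
    return Counter(coeffs)


# Hypothetical coefficients: mostly zeros, a few small values either side.
hist = coefficient_histogram([0, 0, 0, 0, 1, -1, 0, 2, -1, 0])

# Coefficient value on one axis, frequency shown as a bar of '#' characters.
for value in sorted(hist):
    print(f"{value:>3} {'#' * hist[value]}")
```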
Once people started routinely detecting JSteg, other people came along and
decided well that's, you know, it's too
obvious, so let's try and make it be more subtle.
So what they did was they
wrote DCT steganography approaches
where they pay attention to the statistics of the coefficients
and try to keep them balanced.
So if you put in a 1
you try to take one out somewhere else so that you keep the histogram
and the probabilities of these coefficients occurring the same.
And that makes it much harder to use your standard histogram analysis technique

to find out whether there's anything in the image.
But now what they can do with the power of machine learning is:
Take let's say a thousand images, 10 of which may or may not have
something buried inside and a
classifier will pretty well find out which ones they are.
You just have to have a lot of positive and negative samples to throw at it.
- It all sounds wonderful but you know, [???]
- Well, yeah, so aside from spies, I should say I'm not using these techniques.
you know everyone's watching....
- [???] are going through your Instagram
- Exactly, so I think one of the most common uses is digital watermarking.
So in normal steganography what we want to do is
try and hide a message as well as
possible.
And then all that really matters is the person on the other side can get it
and no one else really notices.
In watermarking what we want to try and do is fingerprint the file so that we know
where it came from, we know it's ours,
maybe for copyright reasons or to
trace who's been distributing illegal material.

And the key to a watermark is, instead of it being as much payload as possible
so instead of trying to cram the entire
works of Shakespeare into an image
what you should be doing is just a small... let's say a small logo or a small piece of text
repeated over and over so that if
the image gets cropped or re-compressed, it's still there.
You can imagine that stock photo companies might do this
to try and make sure that people
aren't distributing their files elsewhere.
And you can imagine that
they would trawl through the web
looking for steganographic images
embedded in their particular way.
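
The "small mark repeated over and over" idea can be sketched with a majority vote. This is a toy illustration, not how any real watermarking product works: it hides one bit per value in the lowest bit, and it assumes the reader knows the mark length and that the surviving data still starts on a mark boundary. All names and values are hypothetical.

```python
def embed_repeated(carrier, mark_bits):
    """Tile mark_bits across the carrier, one bit in the lowest bit of each value."""
    return [(v & ~1) | mark_bits[i % len(mark_bits)] for i, v in enumerate(carrier)]


def recover_by_vote(carrier, mark_len):
    """Recover each mark bit by majority vote over all of its copies."""
    ones = [0] * mark_len
    total = [0] * mark_len
    for i, v in enumerate(carrier):
        ones[i % mark_len] += v & 1
        total[i % mark_len] += 1
    return [1 if 2 * o > t else 0 for o, t in zip(ones, total)]


mark = [1, 0, 1, 1, 0]
marked = embed_repeated(list(range(100, 140)), mark)   # eight copies of the mark
marked[3] ^= 1                                         # simulate damage to two values
marked[17] ^= 1
print(recover_by_vote(marked, len(mark)))              # full, slightly damaged data
print(recover_by_vote(marked[10:30], len(mark)))       # an aligned crop
```

Because every bit of the mark is stored many times, a few damaged copies or a cropped section still leave enough votes to recover it.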
Another case you might find if you were distributing pre-release DVDs of a film
and then it gets leaked onto the internet...
If there's steganographic data on the source buried in, you'll be able to see who it was that leaked it.
- Each file could be tailored...
- Each file could be tailored with
the person they originally sent that file to
and then when that particular file finds its way onto the Internet, that person is going to be in trouble.

What was vital in recreating this image is now gone and we're not going to get it back
and in fact that's exactly what you do
see so if we show the actual output here
we can see [...] is kind of visible but
it's been completely dwarfed by
all this random noise that's been added...

