Alright.
Welcome back, everybody.
We are getting very close to the end of the term,
believe it or not.
Project 3 is due soon.
It's one of our last projects, so we're almost there.
Remember, last time we were talking about
file systems.
Okay, so one of the things that we were talking about was the notion of delayed writes.
The first thing to point out is that a lot of file systems delay their writes; they don't push them out to the disk
right away.
In particular, if you open a file and write to it, the data is actually in a cache in memory,
and it's only later that it gets flushed out to disk.
So this is something to be aware of: if you think you've got your data
on disk because you've opened, written, and closed the file, it may or may not actually be there.
Now, this is an advantage, because it allows the disk scheduler to combine blocks that are together
on the same track,
potentially get higher performance, and potentially delete files in temporary space before they're ever
actually written to disk.
The disadvantage, of course, is that if the system crashes
before the data has been flushed out, you lose your data.
All right, and that was what led us to talk about, for instance, logging systems,
journaling systems, etc. Were there any questions on delayed writes?
Okay, we good on this?
And so that number, by the way,
is not necessarily fixed, but, you know, Unix
does sort of wait about 30 seconds before things actually get flushed.
So that is a window of vulnerability in a typical system.
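For a program that cannot tolerate that window, the usual approach is to ask the operating system to flush explicitly. The following is a minimal sketch in Python, not something shown in the lecture; the function name `write_durably` is made up for illustration, and whether the data truly reaches the platter still depends on the OS and the drive.

```python
import os

def write_durably(path: str, data: bytes) -> None:
    """Write data and ask the OS to push it to the device before returning."""
    with open(path, "wb") as f:
        f.write(data)          # data sits in Python's user-space buffer
        f.flush()              # hand the buffer to the OS (still only in its cache)
        os.fsync(f.fileno())   # ask the OS to write its cached copy to the device
```

Without the `os.fsync` call, the write completes as soon as the data is in the in-memory cache, which is exactly the delayed-write behavior described above.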
Question? Yes.
Even though it's not...
Yeah, good question. So even though it's not on the disk, the user does see it as if it's on the disk, and the reason is that all
reads and writes on the local
machine, at least, go through the same cache.
Now, where this gets a little dicey is if you're talking about something like NFS, where it's not all on
the same disk,
and then you've got to be careful.
NFS actually tries to flush things out a little bit more frequently.
In fact, the original NFS
spec had no caching on the local client; all the data always had to go directly back to the server first,
and you got a penalty as a result.
Good question. Any other questions?
Okay.
The other thing we talked about was some data
preservation technologies like RAID 5.
I actually talked about both RAID 1 and RAID 5. RAID 1, if you remember, is mirroring,
where you actually have two disks and all the data goes to both disks.
The good thing about that is it's simple
and it's relatively fast.
The bad thing about it is the high overhead. Basically, if you want a terabyte of storage, you need to
buy two 1 TB drives to do mirroring.
RAID 5
is sort of a trade-off between performance and storage overhead.
So here I'm giving an example where for every four data
disks there's one parity disk.
So, for instance, if you want 4 TB of storage you'd actually need five 1 TB drives.
This is pretty common.
It's a little bit slower
in some instances than mirroring, but it can also be faster if you're reading and writing a lot of data.
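The overhead comparison is simple arithmetic, sketched here in Python. The function name and parameters are made up for illustration; mirroring is treated as one data disk plus one redundant copy.

```python
def raw_capacity_needed(usable_tb: float, data_disks: int, redundant_disks: int) -> float:
    """Raw storage to buy for a given usable capacity and data/redundancy ratio."""
    return usable_tb * (data_disks + redundant_disks) / data_disks

# Mirroring (RAID 1): every data disk has one full copy.
mirrored = raw_capacity_needed(1, data_disks=1, redundant_disks=1)

# RAID 5 as in the example: one parity disk for every four data disks.
raid5 = raw_capacity_needed(4, data_disks=4, redundant_disks=1)
```

Under these assumptions mirroring costs twice the usable capacity, while the 4+1 layout costs 1.25 times.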
The basic idea is that each of these represents a disk block, so we have disks going down and disk
blocks going across.
What we do is take D0, D1, D2, and D3, which are real data, XOR them together, and we get a parity
byte, or parity block, excuse me.
And what do I mean by XORing blocks together? I mean you take the 4K block,
stretch it out as bytes,
stretch it out as bits, and you XOR all of them together.
And that basically
gives you a block-sized item over here.
Okay, everybody with me on that?
And the thing about XOR is it has this nice property, for instance, that although we generated
P0 from the four original data disks, we can actually regain any of the missing data, say D3 if
it died, by XORing
the remaining disks together with P0, and the net result is we get our data back.
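The parity computation and the recovery step can be sketched in a few lines of Python. This is an illustration only: the helper name `xor_blocks` is made up, and the tiny 4-byte blocks stand in for 4K disk blocks.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Four data blocks (stand-ins for D0..D3) and their parity block P0.
d0, d1, d2, d3 = b"AAAA", b"BBBB", b"CCCC", b"DDDD"
p0 = xor_blocks([d0, d1, d2, d3])

# If D3 is lost, XORing the surviving data blocks with P0 rebuilds it.
rebuilt_d3 = xor_blocks([d0, d1, d2, p0])
```

This works because XORing a value with itself gives zero, so every surviving block cancels out of the parity and only the missing block remains.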
Now,
what's interesting about this is:
is the XOR able to figure out which disk failed? Is there anything in the XOR itself that
figures out which disk failed, or does that have to come from outside?
It comes from outside, right?
How do you know that D3 is bad? Maybe D3 is good.
The only way you know D3 is bad is because the disk controller itself said it was bad.
Okay, so in the pantheon of error correction codes, if we were to go down this path, you'd find out that the RAID
5 parity code actually
can only correct an error if it's got outside information; otherwise it's more of an error detection code.
So we could
detect that some disk is bad by XORing all of these together and not getting zero.
But
that doesn't tell us which disk is bad, so we're actually relying on the disk controller itself to say
this disk is bad.
And why can it do that? Well, there are error correction codes that are actually written on the disk with the bytes in the block. So
this particular code crucially
relies on something else telling you which block is bad.
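A small Python sketch makes the detection-versus-location point concrete. The names `syndrome` and `flip_first_bit` are made up for illustration; the idea is only that the same corruption on two different disks produces the same nonzero XOR.

```python
from functools import reduce

def syndrome(blocks):
    """XOR every block in the stripe, parity included; all zeros means consistent."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def flip_first_bit(block):
    """Simulate silent corruption by flipping one bit of a block."""
    return bytes([block[0] ^ 0x01]) + block[1:]

data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
stripe = data + [syndrome(data)]          # last entry is the parity block

# Corrupt disk 1 in one copy of the stripe and disk 3 in another.
bad_disk1 = stripe[:1] + [flip_first_bit(stripe[1])] + stripe[2:]
bad_disk3 = stripe[:3] + [flip_first_bit(stripe[3])] + stripe[4:]
```

Both corrupted stripes give the same nonzero syndrome, so the parity alone says that something is wrong but not where; identifying the bad disk has to come from the controller.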
Okay, are there any other questions?
Okay.
This is generalizable, by the way. RAID 5 is actually a
simplistic Reed-Solomon code where there's one redundant block, so you can extend this arbitrarily much,
well, not quite arbitrarily, but quite a bit,
with Reed-Solomon coding in general, and you could put on a bunch of redundant blocks if you wanted. I can put
another
two: I can put two redundant blocks, and that's called RAID 6,
in which case I can allow any two disks to fail before I lose my data.
And it turns out in today's systems, if you have a terabyte or two or three or four or twelve,
you should make sure to use RAID 6, because...
Can anybody tell me what the biggest advantage of being able to handle two disk failures is,
rather than just one?
Yeah.
If one goes out...
Well, when one goes out you can keep running the machine, in theory, with RAID 5 as well.
Yeah, go ahead.
There you go.
The really important part here is that if a disk fails, I'm rebuilding, and another disk fails, then I'm in trouble.
So basically, by having two
parity blocks, we can basically
make sure we don't lose our data even if
another disk goes out while we're still repairing. So that's kind of why it's important.
And most controllers offer you what's called RAID 6. That wasn't actually in the original set of RAID
algorithms that were looked at by Patterson, but it's kind of called RAID 6,
sort of informally.
Okay, so today I want to
finish up on file systems a little bit by talking about authorization.
We will finish this up when we get into cryptography in a week or so, and then we'll pop our way on to networking;
that's our next major topic.
Okay
So,
authorization
is the generic term for who can do what.
Okay, and
how do you decide who is authorized to access a part of your system?
Well,
you can talk about, for instance, this access control matrix idea, where you have a set of items,
I'm going to call them domains for a moment,
on one side and a set of objects on the other. Each domain is a potential user, so it could be a person,
it could be a group of people, whatever, and the objects are the things you're controlling access to.
An empty spot in this basically means no access, and otherwise, maybe, "read" says that domain 1 can
access object F1 as long as it only tries to read.
Okay, so authorization can be reduced to a matrix like this,
in principle.
The resources are across the top; these could be almost anything you can imagine having controlled access to.
This could be a huge database, and what we're really talking about is a column for every item
in the database, if you wanted to go that fine.
And then the domains are, like I said, users, groups of users, etc.
In practice, the problem with this thing is it's huge and sparse,
so most of the entries are empty.
Most of the domains have nothing to do with most of the files.
And as a result the table is a very bad way to talk about authorization.
I mean, it's
kind of,
maybe, a good way to introduce it, but it isn't practical.
There is no table like this in any system you're likely to use.
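To see why nobody stores the full table, here is a minimal Python sketch that keeps only the non-empty cells. The domain and object names (`D1`, `F1`, and so on) and the function `allowed` are illustrative assumptions, not part of any real system.

```python
from typing import Dict, Set, Tuple

# Only non-empty cells are stored; a missing key means "no access".
matrix: Dict[Tuple[str, str], Set[str]] = {
    ("D1", "F1"): {"read"},
    ("D2", "F2"): {"read", "write"},
}

def allowed(domain: str, obj: str, op: str) -> bool:
    """Look up one cell of the (conceptual) access control matrix."""
    return op in matrix.get((domain, obj), set())
```

With thousands of domains and millions of objects, almost every cell would be empty, so any practical scheme has to store the information some other way.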
Okay.
So
we kind of have two implementation choices here, where what we're trying to do is
deal with the fact that the table is huge and sparse. The most common one, which you're all familiar with, is an access control list.
This is the notion that the permissions are stored with the object,
and the permissions say who's allowed to access what.
So, for instance, Unix, as you're well aware, limits each file to read, write, or execute
permissions for the owner, group, and world domains.
Okay, and access control lists make it
very easy to change an object's permissions, just by adding
something to its access control list.
You go to this thing associated with a file and you change the access control list.
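The Unix owner/group/world scheme can be read straight out of a file's mode bits. A minimal Python sketch using the standard `stat` constants follows; the table `BITS` and the function `permits` are names made up for illustration.

```python
import stat

# Map (domain, operation) to the corresponding Unix mode bit.
BITS = {
    ("owner", "read"): stat.S_IRUSR, ("owner", "write"): stat.S_IWUSR, ("owner", "execute"): stat.S_IXUSR,
    ("group", "read"): stat.S_IRGRP, ("group", "write"): stat.S_IWGRP, ("group", "execute"): stat.S_IXGRP,
    ("world", "read"): stat.S_IROTH, ("world", "write"): stat.S_IWOTH, ("world", "execute"): stat.S_IXOTH,
}

def permits(mode: int, who: str, op: str) -> bool:
    """True if the mode bits grant `op` to `who` (owner, group, or world)."""
    return bool(mode & BITS[(who, op)])
```

For example, a mode of `0o640` grants read and write to the owner, read to the group, and nothing to the world. Changing the permissions is just changing those bits stored with the file.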
Okay, and
there are
a couple of interesting things about this. As I mentioned, the access control list is actually stored with the object.
The second is that when you go to change the access control list, somebody is sort of verifying your identity somehow.
So do you actually have permission to change the ACL on a file?
It would be very bad if,
okay, this is a particular file I don't have permission to, but I can go change the access control list so that I do have permission.
Okay, that would be bad.
And so part of the management of the access control list is verifying the identity of people, and we'll actually talk a bit more about
identity in a week or two.
There aren't too many weeks left.
Very soon.
But
let's assume for the moment that we actually have the notion of identity.
The other thing I would note: has anybody here ever used
AFS, which is the Andrew File System, or DFS?
Any of you? Okay, a few.
Anybody ever used Windows?
Okay, yeah, you can raise your hand without shame, too.
So
Windows is the one you're most likely to be familiar with, but several other Unix-like file
systems have this property too, where the access control lists are actually fairly sophisticated.
Right, if you go to a file and you right-click and you go to the permissions, you can arbitrarily give
any of the users in your domain different permissions.
So unlike in Unix, where you're forced to sort of have owner, group, world
permissions,
in
Windows (NTFS), in the Andrew File System and the systems derived from it, and in a few other more sophisticated systems, you can
actually have arbitrarily complex access control lists. And so that's kind of
a more interesting model; they're sort of closer to being able to implement the table.
Now the other
thing, other than an access control list, is what I would call a capability list.
And in this case
it's sort of up to the process, or the user, or the domain to track what they have access to.
It's a little different. Remember, in the access control list case the files know
who has access to them.
In the capability list case I, as a user, keep track of what I have access to.
The easiest analogy here is a keyring, right? I've got my keychain, and those keys are my capabilities,
the keys for the rooms I can get into.
Okay.
And, you know, a simple example is a page table: each process has the list of pages that it has access to,
right in the page table.
Okay.
Now,
this one is
quite familiar to you because you've been implementing it, and it's basically got an intermediary, which is the operating system,
that makes sure that you don't change the capability list in a way that's not proper.
The operating system makes sure that that keyring, which is the page table in this case, is properly managed, so it isn't quite
like a keychain.
What's more like a keychain is a modern system where you're actually given a cryptographic token, which is a piece of
opaque data that's been encrypted in some way.
If you have that data, then you're allowed to access some service by giving it out.
Somebody says, well, prove who you are and that you have access. Well, here's my
cryptographic key.
Okay.
And we'll talk more about how to construct those toward the end.
Anybody here taking 161? Okay, a couple of you.
Alright, so you're sort of more familiar with those ideas, but we'll make sure everybody has some idea about them.
Are there any questions about these two options?
You could try to think of this as implementing the rows and the columns of that sparse table
directly: access control lists are the columns and capability lists are the rows. But anyway.
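One way to picture the two options is as two ways of slicing the same sparse matrix: by object or by domain. This Python sketch is illustrative only; the function names and the sample entries are assumptions.

```python
from collections import defaultdict

def access_control_lists(matrix):
    """Group the matrix by object: each object lists who may do what to it."""
    acls = defaultdict(dict)
    for (domain, obj), rights in matrix.items():
        acls[obj][domain] = set(rights)
    return dict(acls)

def capability_lists(matrix):
    """Group the matrix by domain: each domain lists what it may touch."""
    caps = defaultdict(dict)
    for (domain, obj), rights in matrix.items():
        caps[domain][obj] = set(rights)
    return dict(caps)

matrix = {
    ("D1", "F1"): {"read"},
    ("D2", "F1"): {"read", "write"},
    ("D2", "F2"): {"execute"},
}
```

Both functions carry exactly the same information; what differs is who holds it, the object or the domain, and that is what makes revocation easy in one case and hard in the other.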
So
we can also combine things together. For instance,
users in a typical Unix system actually have capabilities, which we might call groups or roles,
and then on the objects you have access control lists, which include
groups. So another way to look at Unix
is
that you have the capability, which is your group, and you get that
sort of authorized when you log in, and then that group
is looked up in the access control list.
The nice thing about that is it's much easier to say, well, this file is accessible by everybody in group X,
and then it's all up to you to get group permission,
and then suddenly you'll have permission to whatever was in that group. It's a little bit easier to manage by using this hybrid approach.
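The hybrid scheme can be sketched as group membership held by the user and a group-keyed list held by the file. The user names, group names, and the function `can_access` below are hypothetical, chosen only to illustrate the lookup.

```python
# Groups held by each user (the "capability" side).
user_groups = {"alice": {"staff"}, "bob": set()}

# Per-file access control list, keyed by group rather than by user.
file_acl = {"notes.txt": {"staff": {"read"}}}

def can_access(user: str, obj: str, op: str) -> bool:
    """A user may perform op if any of their groups is granted op in the file's ACL."""
    groups = user_groups.get(user, set())
    acl = file_acl.get(obj, {})
    return any(op in acl.get(group, set()) for group in groups)
```

Adding a user to the `staff` group grants access to every file whose list mentions that group, without touching any of the files.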
So how do you revoke authorization?
Anytime you talk about authorization there's always a question of revocation. Simple example:
you've got a roommate
who you had a falling out with. You kick them out,
throw their couch out the window,
and now you've got to revoke their access to your place.
So how do you do that?
Well, yeah, you've got to call the locksmith.
Okay, although I actually learned recently that there are new smart-key approaches that are kind of cool, where with
the master key
you can go and put the lock in a learning mode and then stick a brand new key in and change the lock yourself.
Several of the major lock manufacturers are going down that route.
But anyway, you've got to change the key.
Right, you've got to revoke the key, and the problem with a cryptographic approach
is that revoking the key may or may not be easy to do.
In an access control list, for example, it is kind of easy, because what you do is remove the person from the access control
list, and the next time they try to get entry it says,
sorry, you can't enter.
Now, what that is basically equivalent to, I guess, is you've got,
you know, the guy with the clipboard of who's allowed to be in your house, and he stands outside waiting for everybody:
you're not on the list, I'm sorry.
Okay, that would be the equivalent of an access control list.
It takes effect immediately,
because the access control list is checked every time you get in.
And it's a little harder to do with capabilities, as I mentioned. The simple thing is
you change the key.
It's not so bad on a single machine, because all the keys are probably kept somewhere in a well-known place, and so you can go off,
in essence, and remove the key
from the person's keyring
on the machine, and as a result not actually have to change the key.
Okay, that would be, I guess, you tackle your roommate on the way out and remove the key.
Okay.
In a distributed system that gets hard, because basically these are really little cryptographic pieces of
information, and they might be stored somewhere that's not even accessible on the net, or maybe they're stored in their phone or
something, and it's very hard to reach out and remove them. And so that's when you've got to start thinking about changing locks.
Okay, and I will say there's a whole topic that you can get into on revoking capabilities, which is kind of interesting.
I'll just give you a few. For instance, you could take that capability, which is like a key but it's a piece of cryptographic
data, and you might put an expiration date in it.
What that says is, yeah, I'm going to kick you out, but
you may temporarily still get to have access; after the expiration date your key's no longer good.
Okay.
You can put some sort of epoch number in the capability, and what happens is, if you've got to get rid of a bunch of people's keys, you just
increment the epoch number and their keys no longer work.
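Expiration dates and epoch numbers can both be sketched with a token that the service authenticates before trusting. This Python example is an assumption-laden illustration, not the construction the course will present: it uses an HMAC tag (the token is authenticated, not encrypted), and the names `SECRET`, `issue`, and `check` are made up.

```python
import hashlib
import hmac
import json

SECRET = b"hypothetical server-side secret"   # known only to the service

def issue(user: str, resource: str, expires_at: int, epoch: int):
    """Create a capability: a claims blob plus a tag only the service can compute."""
    body = json.dumps({"user": user, "resource": resource,
                       "expires_at": expires_at, "epoch": epoch},
                      sort_keys=True).encode()
    tag = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return body, tag

def check(body: bytes, tag: str, now: int, current_epoch: int) -> bool:
    """Accept only untampered, unexpired capabilities from the current epoch."""
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return False
    claims = json.loads(body)
    return now < claims["expires_at"] and claims["epoch"] == current_epoch
```

Revoking everyone at once is then a matter of incrementing `current_epoch` on the service; revoking one holder still means waiting for the expiration date or using one of the other mechanisms.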
You can have back pointers.
This works well on a single system, where you can somehow point at all the capabilities and remove them from
the users, but it's not so good in a distributed sense.
You can have a revocation list.
Okay, this one
is kind of common with
public keys. So the idea might be... how many people have their own public key that they use to sign messages with?
Let me start with: how many people have their own public key?
How many people sign messages with them?
Good.
Okay.
What happens if somebody gets
the private key associated with the public key?
Okay, so now we've got a problem, because suddenly that person who found your private key can send messages as if they're from you,
and everybody believes they're from you because
they're signed.
Right? Yes?
Yeah, so now you get into a funny case where I use my key to sign a revocation, and then the only problem with that is you've got to
make sure that anybody who's likely to look at something signed by me is likely to see the revocation list,
which is a little tricky.
So
basically public keys are tricky, and websites
use public keys to verify that they are who they say they are.
So when you're going to your bank,
in theory you get the right
signed
certificate that says this is truly your bank.
What's the chance that that is a forged certificate? Well,
it depends on whether the bank's private key was recently
compromised.
There was a
case in the last, we'll say, five years
in which the Microsoft update website
signing key was compromised,
okay, which means that people could actually
give out software that was in theory updates, in theory signed by Microsoft,
and there's no way to know that they weren't official.
Okay.
This is an interesting problem.
Alright, any questions?
We'll talk more about this when we talk about public keys, but it is a problem.
Alright.
Let's talk a little bit about...
let's change topics now.
What's the difference between a centralized and a distributed system?
Well, you've been living in an era where everything is distributed, so you probably haven't really thought much about this.
But in a centralized system
perhaps there's a single central server somewhere that everybody accesses; that's sort of the client-server model.
And major functions are all performed by
some single computer.
So this is a
pretty good model for, I don't know,
banking on a website,
maybe for your storage that you have.
Fairly centralized.
Here's a
total alternative, which is the distributed system. This is Kazaa
or
Napster
or any other peer-to-peer system where there's no center.
Okay.
Physically separate computers; they're all kind of the same in some sense.
And
maybe they're in the same room, maybe they're not.
Oftentimes they might be called a cluster if they're in the same room; if they're distributed across the internet at large they might
be considered a peer-to-peer system.
Okay.
And so this
model
is potentially more powerful than this model, because you can involve a lot more things; there are sort of n-squared
connections here.
It also has some potential confusion, like what's the most up-to-date version of the data,
whereas in a centralized case you know exactly what the most up-to-date version is: it's what's on the central server.
So
why might we want to actually go all the way toward a distributed system?
Well, one, you might argue it's much cheaper to put together a whole bunch of computers
to get a big system than it is to build a single big computer.
Okay.
It might be argued that it's much easier to add capacity incrementally.
How many of you have done web searches on Google?
Okay.
Hopefully everybody said yes. Is that a distributed or a centralized
system?
It turns out it's actually quite distributed.
The view that Google has is sort of that the building is the computer.
Okay, and there might be hundreds or thousands
of machines in a given building, and when you do a query against Google
you're not necessarily going to any one computer; you're going to a distributed system.
Okay, and the reason that that's such a powerful model is they don't have to build
a super-duper supercomputer
to handle all the requests that are coming in across the net. What they do is just put a bunch of machines together, and as they
need more capacity they add more machines.
So it's great from an incremental scaling standpoint.
Some other things that have been talked about with peer-to-peer systems:
for instance, users that might be part of this system
have complete control over their own domain, so each machine is their own.
Okay, so
although peer-to-peer systems got most of their fame
for, you know, people distributing music, possibly illegally...
Yes, go ahead.
Actually, the question is:
do you have a bunch of independent connections to every computer, or do you have a huge pipe? And I would say it's either
extreme, or both, or in the middle.
It depends on the structure.
In fact, often in a peer-to-peer system what you've got is a set of neighbors that you have direct pipes to, and then you
forward through neighbors to get to other people.
So that's kind of a combination of the big pipe and the very distributed.
Anyway, in a system like this,
one of the things that peer-to-peer systems actually
were targeting, which was kind of a good thing, was anonymity:
the ability for somebody in a country that may censor
their news or whatever to actually get
their story out,
okay, without being traced
back to the particular source of the story.
So that level of anonymity in countries that sort of have less
freedom of the press is actually a good thing, and you can get that in a distributed system much more easily than you can with a server, where you
kind of have to give your identity when you submit your article.
Okay.
So
the promise of distributed systems is things like higher availability.
Why? Well, because even though one machine might be down,
there are many left,
so the system as a whole is available.
Higher durability:
well,
I can store data all over the place and therefore it's harder to destroy the data.
More security:
okay, maybe this one's a stretch, but in theory
the things you're protecting are smaller and more focused, and you can do a better job on the things that need really high
security rather than trying to apply the same security across all the data.
So maybe you believe that, maybe you don't.
What's actually happened has been a little more disappointing, and having been in the peer-to-peer distributed
computing domain for a decade or so now,
what we've often found is
you often have worse availability.
And there's this famous Lamport quote
that says a distributed system is one where I can't do work because some machine I've never heard of isn't working.
Anybody had that happen to them?
Where you sort of figure out that there's some DNS server down somewhere, or some, you know, other service somewhere
that's down, and as a result, even though most of what you need is up, you can't get to it.
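The gap between the promise and the reality shows up in a back-of-the-envelope calculation. This Python sketch assumes machines fail independently and each is up with probability `p`; both function names are made up for illustration.

```python
def any_one_suffices(p: float, n: int) -> float:
    """Availability when the work can be done by any one of n replicas."""
    return 1 - (1 - p) ** n

def all_required(p: float, n: int) -> float:
    """Availability when the work needs every one of n machines to be up."""
    return p ** n
```

With `p = 0.9` and three machines, the replicated case is available 99.9% of the time, while the case that depends on all three (say, a file server, a DNS server, and an authentication server) is available only about 72.9% of the time. Adding machines helps only if they are alternatives rather than dependencies.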
The reliability has been worse in many cases, because you can lose data if any machine crashes, because the metadata
wasn't properly distributed.
And of course we all know security has often been worse:
the more
points of failure there are, the easier it is to break in.
So
that's unfortunate, but I actually think these are all
still recoverable. These are kind of historical artifacts, and things have been getting better.
The other thing is that coordination is much more difficult, and this is where things get more interesting, I think,
because if you have multiple copies of shared data
you have to somehow coordinate to figure out what the most up-to-date one is.
So imagine you've got some document, you're changing it, and you're pushing all your changes out to this big distributed system.
Which one is the latest?
You know, what happens if the one machine that kind of knows what order you pushed these changes out in crashes?
Do the other machines know enough to reconstruct it?
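The problem can be made concrete with a toy model. This sketch assumes each replica stores a version number alongside the value, which is one possible convention and not something the lecture specifies; the function name `latest` and the replica names are hypothetical.

```python
def latest(replicas, reachable):
    """Return the highest-versioned (version, value) pair among reachable replicas."""
    visible = [replicas[name] for name in reachable if name in replicas]
    return max(visible, key=lambda pair: pair[0]) if visible else None

# Replica "a" received the newest update; "b" and "c" are one version behind.
replicas = {"a": (3, "draft v3"), "b": (2, "draft v2"), "c": (2, "draft v2")}
```

With all three replicas reachable, a reader sees version 3. If `a` crashes, the reader sees version 2 and has no way to tell that a newer version ever existed, which is the coordination question raised above.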
Okay.
And so the reason I really got into peer-to-peer and distributed systems was because there are some very interesting
coordination questions about how to sort of verify what the most up-to-date data is, and I'll say a little bit about that in a lecture,
probably one of our last three lectures.
Okay, questions?
The only thing I really wanted to do here is to give you a notion that there's something much wider than simple client-server.
You can go much farther than that, and we'll talk about some of these components as we go.
But before we do that we're going to have to really talk more about how to connect things together, and that's going to lead us down this discussion of networking, and that's
going to be our next major topic.
But are there any other questions on this brief intro to distributed systems before we go on?
So the question here is: can you have a set of main coordinating
computers? Yes. So there are different architectures that
have been developed,
and I would say everything from here,
through a spectrum, to here has been in use.
And by making sure that there's some subset of the machines that have control over, say, commit order of
data,
you can
then be much more likely to be able to recover that order.
What do we use at Berkeley? We pretty much use the client-server model at Berkeley.
I mean, there's not a lot of...
your data that you store on the file server is not stored in a distributed fashion.
You know,
there's been a push to sort of use Gmail and a few other things like that that are a little more distributed, but most of the stuff you access
is probably client-server.
I mean,
there are a few specialized instances, but...
Alright.
Any other questions?
So
let's just say what our goals are here for a distributed system, because we're going to need to approach these with the network. One of them is transparency.
This is the ability of the system to somehow hide the complexity behind an interface.
It sounds a lot like my first lecture in this class, right, where we started by describing virtual machines as a way of hiding complexity.
The same is true with distributed systems.
And so some of the transparencies we might want: one, for instance, location transparency,
would basically say, well, it may be the fact that my data is stored all over the place, but I don't have to care where it's stored.
Okay.
Migration transparency might say, well,
in order to make sure my data has enough copies or has high availability or whatever, the system can
transparently move my data around, but I don't have to worry about whether it's in the middle of migrating while I'm reading it. Or
did it migrate to Ohio yesterday? That's more of a location transparency.
But that's all hidden from me.
Replication transparency:
maybe my data
is given more or fewer copies depending on how it's stored.
If it's stored on very flaky servers maybe there are more copies; if it's stored on stable servers there are fewer copies. I shouldn't care,
unless I really want to know.
Concurrency: I shouldn't have to care how many users are on the system.
Parallelism: I shouldn't have to care whether my job gets sped up because they're using parallel machines somewhere.
Fault tolerance:
okay,
the system should be able to hide whatever it has to do in terms of reconstructing lost data, etc., behind
its interfaces, so I don't care.
Okay.
So,
of course, ultimately to get this transparency we've got to have a good networking layer, and so we're going to start bottom-up here
now and talk about communication. Okay, any questions on transparencies?
How many of these transparencies do you think actually exist in, say, data
storage?
Anybody used web storage? You guys store your photos on photo sites or whatever?
Dropbox
is another one that's got some transparency, right?
You know, with the web photo storage sites, you have to go to a particular URL, and you could probably find out where
the data actually is if you wanted to, but it's sort of transparent.
So some of that does exist,
in primitive fashion.
So,
some administrivia. The final exam
is almost exactly a month away.
It'll be a month away tomorrow.
I actually probably could find out what room it's in, but I haven't quite done that yet. I'll do that for you next time.
All material
from the course
is fair game, except I'm going to have a special lecture on the Monday of RRR week, which you don't have to worry about.
That'll be just for fun.
Two sheets of notes, both sides.
No sample finals are up, because I don't do that, but, you know, the midterm twos are up, so you could
take a look at what they have looked like in the past, and you know what midterm ones look like. You could go over those and get a good flavor for what
the final might look like.
Okay.
There is a lecture on the
Wednesday before Thanksgiving,
just so you know.
So we're getting down to six lectures.
And
this slide was actually from last year, when RRR week was new, but you get that extra week of studying.
We lost extra lectures, but that's alright.
So the final lecture,
the optional lecture: I haven't seen anybody send me any topics, so if I don't get any good topics I won't do the final lecture.
People need to send me
things they might want to hear about. Yes?
Fine, but send it to me in an email, otherwise...
Well, that's a good topic; I'll be happy to talk about that.
That's a great example of a topic. So people should send me topics so that I have something to put
together for a talk. Actually, send me
enough topics that I can choose,
filter a little.
Shakespeare: to be or not to be, that is the question.
Filter, filter, filter. Okay.
Okay I want beside check before you then
Examples: real-time, secure hardware, quantum computing, whatever. Okay, any questions?
So this would be the Monday after classes end, 12/6.
So let's talk about some definitions, since we're piling into the network topic here now.
A network
is a physical connection that allows two computers to communicate. That's a pretty general notion.
But what are
some words that we'd like here? Well, one of them is a packet.
A packet basically is just a unit of transfer
that
is atomic, and it's sent from one
part of the system to another.
It's up to the network to basically carry packets from one CPU to another. I'm going to have to talk about things like:
how do you turn packets into useful messages, because packets are often smaller than messages? How do you deal with lost packets? How do you
route packets? Etc. These are all topics that are going to be dealt with fairly quickly in a couple of lectures.
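Turning a message into packets and back can be sketched as numbering the pieces and sorting on arrival. This is a minimal Python illustration, not a real protocol; the function names and the `max_payload` parameter are assumptions.

```python
def to_packets(message: bytes, max_payload: int):
    """Split a message into (sequence number, payload) packets of bounded size."""
    return [(seq, message[offset:offset + max_payload])
            for seq, offset in enumerate(range(0, len(message), max_payload))]

def from_packets(packets) -> bytes:
    """Rebuild the message, tolerating packets that arrived out of order."""
    return b"".join(payload for _, payload in sorted(packets))
```

The sequence number is what lets the receiver put the pieces back in order; dealing with pieces that never arrive needs more machinery on top of this.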
A protocol is an agreement between two parties on how to exchange packets.
Okay, what is
the process whereby I hand you a packet that's part of a bigger entity, and how do you tell me whether
you got it or not? Okay, that's going to be a protocol.
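One very simple agreement of this kind is "send a packet, wait to hear that it arrived, and resend if you don't". The Python sketch below simulates that idea under stated assumptions: the channel is a made-up function that returns True when an acknowledgement comes back, and `make_lossy_channel` and `send_reliably` are hypothetical names.

```python
def make_lossy_channel(drop_once):
    """Hypothetical channel that loses the first transmission of the listed sequence numbers."""
    already_dropped = set()
    def channel(seq, payload):
        if seq in drop_once and seq not in already_dropped:
            already_dropped.add(seq)
            return False   # the packet (or its acknowledgement) was lost
        return True        # the acknowledgement came back
    return channel

def send_reliably(packets, channel, max_tries=3):
    """Send each packet, retrying until acknowledged; return total transmissions."""
    transmissions = 0
    for seq, payload in packets:
        for _ in range(max_tries):
            transmissions += 1
            if channel(seq, payload):
                break
        else:
            raise RuntimeError(f"gave up on packet {seq}")
    return transmissions
```

Sending three packets over a channel that drops packet 1 once takes four transmissions instead of three. Real protocols add timeouts, windows, and duplicate detection on top of this basic exchange.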
And a good example of a protocol, of course, that you're all familiar with is IP.
TCP, on top of IP,
is another protocol.
And UDP. These are the three that we'll definitely talk about, but we'll talk about a couple of other ones at a little lower level
in a moment.
Are there any questions? These are pretty simple,
straightforward definitions.
Okay, so how big is a packet?
When you think of packets, do you think
megabytes?
Kilobytes?
Okay, maybe.
Go ahead.
Okay, a frame, yeah, but how big is a frame? That is usually the lowest-level transmission
chunk. How big?
1500 bytes is a good number.
Okay, that's typically the MTU, the maximum transmission unit, in a TCP/IP network.
Usually in slow networks they get smaller,
and in satellite networks it can get much bigger.
So,
okay,
good.
So let's start with a particularly simple type of network which is a broadcast network
And the idea is broadcaster doesn't necessarily mean through the air it's basically just anything where there's a Sheriff
Communication media
And the simplest example would you've already seen in this class is a bus
So boss is a broadcast media where are you have a processor on the bus
And maybe memory on the bus
And the reason it's called the broadcast media is
Everybody can hear everything that's being said
So when the processor talks everyone listens or when an IO device talks maybe everybody listen
Okay so that's where it becomes a broadcast
Originally ethernet was very much a broadcast media
Okay so when I was a grad student I remember there were
These trays that were actually in the ceiling and they went around
The whole floor
And there was this
10BASE-T cable
Which is coaxial cable that went
Kind of in a loop through all the offices, and there was actually a drop like this, they had tees in the office, and that would plug into the computer
Okay, so in principle every machine on the floor
Heard every communication that was on the floor
Broadcast
Question yes
Okay, that's a good question, so if you apply encryption
And I'm assuming you mean that not everybody's got the keys so they can't all understand it, that's usually still considered a broadcast because the
Physical media is broadcast
But that's a good point, I mean one way to protect your communications is to encrypt them and we'll talk more about that later
But from the standpoint of the lowest physical level it's truly the case that everybody hears every message
Now
What are some things that people can tell me about a broadcast media that's probably bad
Yes
Yes
We just had one, in fact we had two people speaking at the same time, yes. So what do you have to do about a conflict
Yeah
So first of all, right, we have to detect the conflicts
And then we're going to have to do something about it, right? So the fact that we're broadcasting automatically causes a conflict issue
Now there's a special type of broadcast that's often called the master-slave
I didn't come up with that but
Situation where there's only one talker
Okay, or there's only one controller of the bus and they get to say when everybody else talks. In that case conflict is
Centrally managed
Even though the media itself is broadcast
Okay
So good examples of broadcast media
You know, wireless cell phones, GSM
EDGE, CDMA
EV-DO, these are all
You know, all the different wireless standards out there are all broadcast networks
Including
802.11
Right, so when you go into a cafe and you're drinking your coffee
And you're on the net, in fact your packets are broadcasting to every other laptop in that cafe
Okay, and somebody who's got the right software installed on that laptop can look at everything you're broadcasting
Okay
Keep that in mind
Make sure that you're always using secure protocols okay
So
Let's take a look at an example of how a broadcast media might work
So when you broadcast the packet and you really only want to talk to one person
So why do you use the broadcast media? Because it's convenient. But you only want to talk to one person, what happens
The packet literally goes everywhere
Right, so we need to have some mechanism for filtering out
Messages we don't want to see. So here's a good example where
The message actually has a header on it which says who the destination is, so the sender here may be ID 3, the destination
Is ID 2
And you broadcast it, what happens? It goes to everybody
But people who are not the destination
Throw out the packet and only the one who's supposed to get it actually keeps it
Okay, now that seems simple but I thought I'd point that out
So usually in order to turn a broadcast media into something that I can do
Reliable point-to-point communication with, I've got to have a filtering mechanism and I've got to have some way of saying who it is
I'm interested in sending to
And notice that that's very different from a wire, right? A wire's got two ends and, you know, what I send on one end gets delivered on the other
End of the wire
Okay, there's no need for a header there in order to get to the other end of the wire
Here we actually have to specify
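To make the filtering idea concrete, here is a minimal sketch in Python. It is only an illustration of the header check described above, not how any real network card is programmed. The names Packet, Station, and broadcast are made up for this example, and the IDs 1 through 4 are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    src: int        # sender ID carried in the header
    dst: int        # destination ID carried in the header
    payload: bytes  # the actual message body


class Station:
    """One listener on the shared medium (hypothetical helper class)."""

    def __init__(self, station_id: int):
        self.station_id = station_id
        self.inbox = []  # packets this station decided to keep

    def on_broadcast(self, packet: Packet) -> None:
        # Every station hears every packet; keep it only if the header names us.
        if packet.dst == self.station_id:
            self.inbox.append(packet)


def broadcast(stations, packet):
    # The shared medium delivers the packet to everyone, wanted or not.
    for station in stations:
        station.on_broadcast(packet)


stations = [Station(i) for i in (1, 2, 3, 4)]
broadcast(stations, Packet(src=3, dst=2, payload=b"hello"))
kept_by = [s.station_id for s in stations if s.inbox]
```

After the broadcast, kept_by lists only the station whose ID matched the destination in the header. Everyone else heard the packet and threw it away.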
Okay, yeah, and in Ethernet
Typically this kind of check is done directly in hardware, so it's not up to the operating system, you don't get an interrupt for every
Packet
Alright
Unless
What
What can you do with Ethernet
Yes
Yeah, you can put it in what's called promiscuous mode
Which is the mode in which
It will receive every packet coming in off the net regardless of whether it's supposed to receive it
Okay and this is
Obviously something you can do in a broadcast media, with the right software you just program the hardware to take it all in
The reason you might not want to actually take it all in is from a
Performance standpoint, because if there are interrupts happening for every packet that comes in your system gets slow, because you really
Only want to receive what you want to receive unless you're snooping
Speaking of snooping
10 years ago when I first got here
It was already the case that some significantly large fraction of all of the computers on the Berkeley campus were in
Promiscuous mode because of viruses or whatever and were busy snooping packets
And so it was already too late
At that point
To use telnet or some of these other protocols which put your passwords in the clear because people would snoop them
Okay, so make sure that you're never actually typing your password in the clear, you're always using SSH
Etc, okay, never use, yes, go ahead
Yeah
So the ID in the case of Ethernet is actually the MAC address, that's correct, which is a 48-bit
ID, that's correct
So
Basically
You can think of what I just said here as a form of layering, and we'll get to more layering in a second, but
What I've done is I've taken the basic broadcast protocol which is capable of delivering a message to everybody listening
And by layering a header on top of it and an interpretation of that header, suddenly I've got point-to-point communication
Okay, most of the layering that people talk about in networking is all about
Starting with some low-level media and working your way up until you can send a message all the way
Say from here to Europe, conveniently and
Reliably
Okay
Questions
So how do network card manufacturers make sure all the MACs are unique? Well
It turns out that
There are bits that are assigned based on manufacturer ID, so in theory people have a unique piece of the space
Now most
Routers you buy today allow you to go ahead and set the MAC, so it's not clear
You know, you still want to make sure they're all unique from a reliability standpoint, but in most cases you're allowed to fake any
MAC you might want if you know what you're doing
But that's okay
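As a rough illustration of how the 48-bit MAC space is carved up, here is a small Python sketch. It assumes the standard layout, where the upper 24 bits are the manufacturer-assigned prefix and the lower 24 bits are picked by the manufacturer per card. The function name parse_mac and the sample addresses are made up for this example.

```python
def parse_mac(mac: str):
    """Split a colon-separated MAC address into its two halves.

    Returns (manufacturer_prefix, device_part, locally_administered).
    """
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("a MAC address is 48 bits, i.e. 6 octets")
    prefix = octets[:3]   # upper 24 bits: assigned per manufacturer
    device = octets[3:]   # lower 24 bits: chosen by that manufacturer
    # By convention, bit 0x02 of the first octet marks an address that was
    # set locally (for example by software) instead of burned in at the factory.
    locally_administered = bool(octets[0] & 0x02)
    return prefix.hex(":"), device.hex(":"), locally_administered


# Both addresses below are invented purely for illustration.
factory_style = parse_mac("00:1a:2b:3c:4d:5e")
overridden_style = parse_mac("02:00:00:00:00:01")
```

The flag in the third position is only a convention. Nothing stops software from setting an address that looks factory-assigned, which is why uniqueness is not guaranteed in practice.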
So let's
Talk a bit about
This arbitration process which is how do you handle collisions
And arbitration is the act of negotiating a shared media
And basically, what happens if two senders try to talk at the same time
And
Here's a case of concurrent activity but there's no longer shared memory that you can do a lock
In
So your notion that, you know, you guys have been programming concurrent systems where you can grab a lock of some sort, that's now been thrown
Out the door because there are no locks
So what do you do? And we talked a little bit about this
Let me tell you about one of the first broadcast networks, called the Aloha network
Which was packet radio within Hawaii
Okay
And
It was a blind broadcast with a checksum at the end, so the idea is you go ahead and you start speaking
While you're drinking your drink with a little umbrella in it and
You put a checksum on there and if it was received correctly by the other side
They would acknowledge that they got it and all is well, and if you didn't get an acknowledgement back you figured there was a collision
And you
Retransmit
Okay
Now couple of things here first of all
You need that checksum anyway because there's lots of reasons for garbled messages, not just
That two people were speaking at the same time, like for instance an airplane flew directly overhead and garbled your message
Okay so the checksum
Is a way of
Seeing whether the message was correctly received and we're using it both for reliable
Transmission and to detect collision
And how many of you know what a checksum is
Okay, somebody tell me what a checksum is
Yes
Sure, give me an example of computing one
And you don't have to be cryptographically secure, give me a simple one
Yeah, so we could take all the bytes
Maybe with a rotation on every byte, XOR them together, and when we're done we might have a byte or 16 bits
And the nice property of that checksum is if any bits get garbled, or not too many of them, in the main body and you
Try to compute the checksum again, the two checksums, the one that you started with and the one you got at the reception, don't match and you figure there's a problem
Very simple
What gets
More tricky is if you've got somebody who's actively trying to defeat your checksum and that's where you want to have
Cryptographic hashes and I'll mention that in a week or two
Suffice it to say that the computation to compute the checksum is
More complicated in a way that's very hard for somebody to fake
Okay, but that simple checksum I just gave you where you take the bytes and you XOR them together is a good way to think about this to start with
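Here is one way the simple rotate-and-XOR checksum could look in Python. This is a sketch of the idea only. The one-bit rotation, the 8-bit width, and the names rotl8, xor_checksum, and looks_intact are choices made for this example, not something fixed by any protocol.

```python
def rotl8(value: int, amount: int) -> int:
    """Rotate an 8-bit value left by `amount` bits."""
    amount %= 8
    return ((value << amount) | (value >> (8 - amount))) & 0xFF


def xor_checksum(data: bytes) -> int:
    """Fold all the bytes into one byte: rotate the running value, then XOR."""
    checksum = 0
    for byte in data:
        checksum = rotl8(checksum, 1) ^ byte
    return checksum


def looks_intact(data: bytes, received_checksum: int) -> bool:
    # The receiver recomputes the checksum and compares it with the one sent.
    return xor_checksum(data) == received_checksum


message = b"hello"
sent_checksum = xor_checksum(message)
garbled = b"hellp"  # last byte damaged in transit
```

The receiver accepts the original message and rejects the garbled one. Someone who wants to fool this check can do it easily, which is the reason cryptographic hashes exist.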
Now
The sender waits for a while, and if it doesn't receive an acknowledgement it retransmits, so this is a very simple protocol and it worked
Pretty well
And how do we handle collision? Well, if two senders try to send at the same time you end up with a garbled checksum, so then
You try to resend later
Now
Here's where things get a little dicey
Okay so I've detected Collision now what
Yeah
Yeah, so the important thing is if I run into you in the hall, ouch
And then we back up
And then we walk again at exactly the same time and we run into each other in the hall again
Ouch
Nobody gets through the hallway, okay? So we've got to add something to the system to make sure that people don't retry at exactly
The same time
And that's where randomness is important
Okay, so all of these protocols have this property of needing to back off
And to do it randomly, so that's important. There's a different property which is kind of interesting here, which is
There's sort of a bad positive feedback loop
Once you've got a collision
People start resending
Which puts more traffic in the system, which causes more collisions, which causes more resending
Okay so
Problem
Right so
The more people that talk, the more collisions there are, the more retransmission, the less likely people get a message through
So this sounds like this particular type of protocol hits a limit pretty quickly in
What it can deal with, okay
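To put a rough number on that limit, here is a small Python sketch. It uses the classic textbook model of a blind-broadcast (pure ALOHA) channel, which is an assumption brought in for illustration and was not derived in this lecture. The model assumes attempts arrive as a Poisson process with an average of G attempts per packet time, and a packet gets through only if nobody else starts within one packet time before or after it. Under those assumptions the useful throughput is G * exp(-2G). The function name is made up.

```python
import math


def pure_aloha_throughput(offered_load: float) -> float:
    """Fraction of channel time carrying successful packets.

    offered_load is G, the average number of transmission attempts
    (new packets plus retransmissions) per packet time. The formula
    assumes Poisson arrivals, which is a modeling assumption.
    """
    return offered_load * math.exp(-2.0 * offered_load)


# Throughput rises with load, peaks at G = 0.5, then collapses as
# retransmissions pile on top of each other.
loads = [0.1, 0.25, 0.5, 1.0, 2.0]
curve = [(g, pure_aloha_throughput(g)) for g in loads]
peak = pure_aloha_throughput(0.5)  # equals 1 / (2e), a bit over 18%
```

Under this model, pushing more traffic into the channel past the peak gets fewer messages through, which is the feedback loop described above.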
The other problem
Is
Also unfortunate: I started talking in the clear and then you just started talking over me
Okay, now the problem with that is it may be that I got 1/2 or 3/4 or, you know, a large fraction of the
Message through
But the fact that you talked over the very end of it means that it's going to look like a bad transmission and the best I can do is retransmit it
Okay, so the fact
That I just start talking even if somebody has already been successfully talking
Is kind of a bad
Consequence
Okay it's just not well design from that standpoint
So all of these sort of problems led to
The CSMA/CD algorithm, which is exactly what's used in Ethernet
Okay
Carrier sense multiple access, collision detection
And so this first showed up in the early 80s
And it's a practical local area network, still used in its various forms today
And so whenever you plug your laptop in you're pretty much using an Ethernet connection
And it uses wire instead of radio but it was still a broadcast network, and as I mentioned with
The 10BASE-T connections in the offices
It's very much a broadcast network
So the key advance to make this work was this CSMA/CD, which is carrier sense
Multiple access, collision detection
And the carrier sense is
Don't start talking
Unless there's no carrier
Which is basically, I listen first, if there's nobody speaking then I talk
Collision detection is exactly what we said before, if I
Listen to my packets go through and if I hear them being trampled I know that there's a collision and I need to abort
And retry
And how do we have a good backoff scheme? Well, this is where we have to introduce the right level of randomness
And
So obviously we don't wait the same amount of time, we've got to put randomness in there and so ultimately what we had
Was an adaptive randomized waiting strategy which, I'll show you
Is the following
First time you pick a random wait time with some initial mean
If you get another collision you pick
A random wait time with twice the mean, then four times the mean, then eight times the mean
So it's an exponential backoff
Okay
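The doubling rule can be sketched in a few lines of Python. This is an illustration under assumptions: the wait is drawn uniformly between zero and twice the current mean (so the average comes out to that mean), and time is in arbitrary units. Real Ethernet picks a whole number of slot times and caps the doubling, which this sketch leaves out. The function names are made up.

```python
import random


def mean_wait(collisions: int, initial_mean: float = 1.0) -> float:
    """Mean wait after the given number of collisions on the same packet.

    collisions=1 gives the initial mean, and each further collision doubles it.
    """
    return initial_mean * 2 ** (collisions - 1)


def backoff_wait(collisions: int, initial_mean: float = 1.0, rng=random) -> float:
    """Pick a random wait whose average is mean_wait(collisions)."""
    mean = mean_wait(collisions, initial_mean)
    # Uniform on [0, 2 * mean] has mean `mean`; the distribution is an assumption.
    return rng.uniform(0.0, 2.0 * mean)


means = [mean_wait(k) for k in (1, 2, 3, 4)]   # initial, twice, four, eight times
sample = backoff_wait(3, rng=random.Random(0))  # some value between 0 and 8
```

Passing a seeded random.Random as rng makes a run repeatable, which helps when testing. Two senders that collide should of course use independent random sources, otherwise they would pick the same wait and collide again.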
The randomness is very important to decoupling the senders, and making the backoff exponential lets the media
Adapt to the number of senders
Why should that be so? Why does exponential backoff
Allow the media to adapt to the number of senders in the system
Yes
Yeah, so we're sacrificing throughput for the ability to get messages through, I will agree with that
And notice if I'm exponentially backing off what I'm doing is I'm reducing my own rate
So that the aggregate rate of all of the senders that happen to be in there can get closer to 100%
Right, so if I have four senders I want to back off more than if I only have two senders
Okay that's just because there's four times as many messages or twice as many messages in the second case
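To watch the adaptation happen, here is a toy Python simulation. Everything about it is a simplifying assumption: time is divided into slots, every sender has exactly one packet and first tries in slot 0, two or more talkers in a slot always collide, and after its k-th collision a sender waits a random number of slots drawn from a window of size 2^k. It leaves out carrier sense entirely. The function name simulate and its parameters are made up.

```python
import random


def simulate(num_senders: int, seed: int = 0, max_slots: int = 100_000):
    """Return (slots_used, total_collisions, senders_still_waiting)."""
    rng = random.Random(seed)
    next_attempt = [0] * num_senders   # everyone starts by talking in slot 0
    collisions = [0] * num_senders     # collisions suffered by each sender
    pending = set(range(num_senders))  # senders that have not got through yet
    slot = 0
    while pending and slot < max_slots:
        talkers = [s for s in sorted(pending) if next_attempt[s] == slot]
        if len(talkers) == 1:
            # Exactly one talker: the packet gets through.
            pending.remove(talkers[0])
        elif len(talkers) > 1:
            # Collision: each talker doubles its window and picks a new slot.
            for s in talkers:
                collisions[s] += 1
                next_attempt[s] = slot + 1 + rng.randrange(2 ** collisions[s])
        slot += 1
    return slot, sum(collisions), len(pending)


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, simulate(n, seed=1))
```

With more senders, each one tends to rack up more collisions and so ends up with a larger window. In this toy model, that is how each sender's own rate drops as the crowd grows.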
Okay
Questions
Alright, so this is about our break time, let's take about a three or
Four minute break and come back and continue
