I'm going to talk about why
web links break
Now, I'm sure this is something that
everybody is familiar with and that has annoyed everybody
If you're browsing the web and you're doing your clicky stuff
You follow a link, follow another link, follow another link
and then all of a sudden
it doesn't work, it breaks.
You get the dreaded 404 error message
It's at best an annoyance
but it can be a major problem
because in the case of old websites
that aren't continually being maintained and updated
it could make them completely unusable.
I've certainly, quite often, seen websites
that might have useful information on them
but none of the links work
everything gives you a 404 error
The reason it's 404 is that it comes from
the protocol underpinning the World Wide Web
It's one of the error messages
and it just means that there's a missing file
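As a small aside, here is a rough sketch of where that number lives. It only assumes Python's standard library, which carries its own copy of the protocol's table of status codes; nothing here talks to a real server.

```python
from http import HTTPStatus

# 404 is one entry in the protocol's table of status codes
status = HTTPStatus(404)

print(status.value)   # 404
print(status.phrase)  # Not Found

# Codes from 400 to 499 mean the problem is with what was asked for,
# for example an address that no longer leads anywhere
print(400 <= status.value < 500)  # True
```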
At one level that might seem quite trivial
somebody has deleted the page you're pointed at
But information systems don't necessarily have to behave like that
If you are browsing through the files on your PC
using Windows Explorer or Mac Finder or whatever,
and you move a file
you don't suddenly start getting error messages
because you've moved a file
Whereas you do on the Web
and this comes down to quite a fundamental
way in which the Web is designed
Really, the problem is
that links on the web
aren't actually links
at all, they're misnamed.
Because if we think about it for a moment
a link, the word link,
is a metaphor in design and it's taken from the links of a chain
and a link
implies that two things are actually attached: if you move one, the other comes with it
you can't separate them
That's not the way the Web behaves
What you've actually got on the Web is not links at all
but pointers
You've got a pointer from one document to another
Or in fact you don't even have that
you have a pointer from one document to where you hope another is going to be
and that's why things break
Let's take a webpage. I'm going to just take my personal webpage
Webpages are identified by URLs
the URL is this
Now, if I want to put in a link into that page
I would just put in
'<a' which stands for anchor
because this is the anchoring end of the link
'href='
like that
and href just stands for hypertext reference
it's referencing this webpage
then I can just put in the text for the link, so
'click here'
and just finish that
So that now is the HTML [code] to create a link
but, you can see what it is
it's a hypertext reference
to this webpage
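To make that concrete, here is a minimal sketch that reads an anchor like the one just described and pulls out its two parts. The address http://example.org/home.html is a made-up stand-in, not the URL written down in the video, and the class name AnchorParser is just an illustrative choice.

```python
from html.parser import HTMLParser

# Hypothetical address, standing in for the real URL
html = '<a href="http://example.org/home.html">click here</a>'


class AnchorParser(HTMLParser):
    """Collects (href, link text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        # 'a' is the anchor; 'href' holds the hypertext reference
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Text between <a ...> and </a> is what the reader clicks on
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None


parser = AnchorParser()
parser.feed(html)
parser.close()
print(parser.links)
```

All the anchor carries is an address written out as text, plus the words to click on. Nothing in it belongs to the page at the other end.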
There is no way that this webpage
has any knowledge of the fact that it's being referenced
If I put a link in to your webpage
Your webpage doesn't know about it
and neither do you unless I tell you
as far as the mechanics of the Web are concerned
it's purely
the href
it's a pointer
to where something
possibly is
because it's up to me when I put it in to actually get that right
and if it was right at the time
but then you changed the name of your document
then all of a sudden it's wrong
This is why the Web is very brittle
it's why it can break
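A toy model of that, as a sketch only: the "web" here is just a Python dictionary, and the addresses and the fetch function are invented for illustration.

```python
# A toy "web": addresses mapped to page contents (all names are made up)
site = {"/papers/links.html": "Why web links break"}

# My page just stores the address as text; nothing ties it to the target
my_link = "/papers/links.html"


def fetch(address):
    """Return (status, body), roughly the way a server answers a request."""
    if address in site:
        return 200, site[address]
    return 404, "Not Found"


before = fetch(my_link)

# The owner renames the document; my page is never told
site["/research/links.html"] = site.pop("/papers/links.html")

after = fetch(my_link)

print(before)  # (200, 'Why web links break')
print(after)   # (404, 'Not Found')
```

The document still exists after the rename. It is only the pointer that has gone stale.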
Now, I mean there are ways around that
but they're not particularly easy
a lot of the web these days isn't just HTML documents
a lot of the web is data driven and there's databases underpinning it
and if you're careful enough with the design of your system
you can have a system that manages its own links internally
that will be fairly careful not to break
but that is quite hard to do
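One way such a system might manage its own links, sketched under heavy assumptions: every document gets a permanent ID, links are stored as IDs, and the address is only looked up when the page is produced. The tables and function names below are hypothetical, not taken from any particular system.

```python
# Hypothetical content store: each document has a permanent ID, and its
# current address is looked up only when a link is rendered
addresses = {1: "/about.html", 2: "/papers/links.html"}

# Internal links are stored as IDs, not as addresses
page_links = {1: [2]}  # document 1 links to document 2


def render_links(doc_id):
    """Turn stored IDs into hrefs using the addresses as they are right now."""
    return [addresses[target] for target in page_links.get(doc_id, [])]


def move(doc_id, new_address):
    """Moving a document updates one table entry; no stored link changes."""
    addresses[doc_id] = new_address


before = render_links(1)
move(2, "/research/links.html")
after = render_links(1)

print(before)  # ['/papers/links.html']
print(after)   # ['/research/links.html']
```

This only protects links inside the one system. Anybody outside who copied the old address is still pointing at the old place.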
another thing is that
even if you do that
the moment you have a link that goes out to somebody else's webpage
you want to link to one of my pages or you want to link to something on the BBC News site or what-have-you
then the moment that somebody else
outside of your control
changes things
it's gonna break
that is
a fundamental problem
with the way that the Web works
because any solution
is retro-fitted
to the way that the web works
and so I'm afraid, although they are
very, very annoying,
404s are going to be with us for the foreseeable future
and they're an annoyance you have to live with
if you use the Web
people could create new material and new content by quoting original sources
and taking the original material and putting it into a new context
but an interesting part of this is Ted also envisioned a micro charging
