So the next big category of data
that we'll talk about briefly
here is graph data.
The classic example of graph
data, of course, is the
World Wide Web, where HTML pages
that link to one another
form a graph.
A graph is defined by nodes, which
are the vertices in our graph,
so every webpage
is a node, and by
a set of edges, which point
from one node to another.
And those edges can be
one-directional like here,
or they can be
bidirectional, here.
And then in addition
to having edges and nodes,
the edges, in some
graphs, have weights.
So in this case,
if it's a set of HTML websites,
the weight might be a count of the
number of times that
this website here links
to this website here.
So it links five times here,
but only two times here.
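To make the weighted, directed picture concrete, here is a minimal sketch
in Python of one way you might store it: a dictionary of dictionaries.
The page names and the helper functions are made up for illustration;
only the link counts of five and two come from the slide.

```python
# Hypothetical pages: each key is a node, each inner key is the target
# of a directed edge, and each inner value is the edge weight
# (the number of links from the source page to the target page).
links = {
    "page_a": {"page_b": 5},               # page_a links to page_b five times
    "page_b": {"page_a": 2, "page_c": 1},  # page_b links back only twice
    "page_c": {},                          # page_c has no outgoing links
}


def edge_weight(graph, source, target):
    """Return the weight of the edge source -> target, or 0 if there is none."""
    return graph.get(source, {}).get(target, 0)


def is_bidirectional(graph, a, b):
    """True when there is an edge in both directions between a and b."""
    return edge_weight(graph, a, b) > 0 and edge_weight(graph, b, a) > 0


print(edge_weight(links, "page_a", "page_b"))       # weight in one direction
print(edge_weight(links, "page_b", "page_a"))       # weight in the other
print(is_bidirectional(links, "page_b", "page_c"))  # one-directional edge
```

Notice that the two directions are stored separately, which is why the
same pair of pages can carry a different weight each way.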
We won't talk about
graph data in great detail,
because it's sort of
its own sub-problem
that we don't have a
lot of time to cover,
but it's good to be aware of.
When you're dealing
with graph data,
you have to put a lot
of thought into how
you capture the relationships
between the nodes, how you
encode your edges and vertices.
You don't get the
same kind of neat structure
where there are
n attributes that
can be
represented by n columns,
right?
Each vertex can have
any number of edges
coming out of it, anywhere
from 0 to, theoretically,
an unbounded number.
So when you're
doing that sort of analysis,
you have to handle
it differently.
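One common way to handle that, sketched here in Python under some
assumptions, is to store one row per edge rather than one row per node.
Then the table always has three columns (source, target, weight) no
matter how many edges a vertex has. The page names, the weights, and the
function names below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical edge list: one (source, target, weight) row per edge.
edges = [
    ("page_a", "page_b", 5),
    ("page_b", "page_a", 2),
    ("page_b", "page_c", 1),
    ("page_b", "page_d", 3),
]


def build_adjacency(edge_rows):
    """Group the edge rows by source node into a dict of dicts."""
    adjacency = defaultdict(dict)
    for source, target, weight in edge_rows:
        adjacency[source][target] = weight
    return dict(adjacency)


def out_degree(edge_rows):
    """Count the edges leaving each node; nodes with none get 0."""
    degree = defaultdict(int)
    for source, target, _weight in edge_rows:
        degree[source] += 1
        degree[target] += 0  # register the target so sink nodes show up
    return dict(degree)


print(out_degree(edges))
```

Here page_b has three outgoing edges and page_c has none, but both fit
in the same three-column layout.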
The last big category
of data is ordered data.
Now, ordered data is data
where the data objects have
to be ordered in some way.
So in the case of
a genomic sequence,
for instance, the
ordering of
our nucleotides
here, GGTTCC, et cetera,
is important, right?
The fact that we
have GGTTCC here
is different than if we
had CCTT and then GG.
Those
are fundamentally different
sequences, so we
have to encode them
in some way that
preserves that ordering.
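Here is a small Python sketch of why the ordering matters. The two
sequences from the example contain exactly the same letters, so an
encoding that only counts letters cannot tell them apart, while an
ordered encoding can. The integer codes and the `encode` function are
assumptions made for illustration, not a standard.

```python
from collections import Counter

seq_a = "GGTTCC"
seq_b = "CCTTGG"

# An arbitrary, illustrative mapping from nucleotide letter to integer.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(sequence):
    """Encode a sequence as a list of integers, keeping the original order."""
    return [CODES[letter] for letter in sequence]


# Counting letters throws the order away: both sequences look identical.
same_composition = Counter(seq_a) == Counter(seq_b)

# The ordered encoding keeps them distinct.
same_encoding = encode(seq_a) == encode(seq_b)

print(same_composition, same_encoding)
```
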
Another example, and sort
of your classic example
of ordered data is
spatial and temporal data.
So this little GIF
here represents
the average monthly
temperature
of both land and ocean
over the course of a year.
So in this case, the
spatial aspect of the data
is important.
Where we are in
the world certainly
matters when we're
looking at a data object.
And in this case, if we
were getting this data,
every row in, say, a
database table
might have a location
associated with it and a time,
and there is an implicit
ordering there, especially
to the time, but
also the location.
So when we're
handling ordered data,
we have to be very
careful about it.
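As a rough Python sketch of what being careful can mean: suppose the rows
arrive out of order and we want the month-to-month change in temperature.
The readings, the field names, and the `monthly_changes` function below
are all made up for illustration. The point is only that the result
depends on sorting by time first.

```python
from datetime import date

# Made-up readings for a single location, deliberately out of time order.
rows = [
    {"lat": 42.4, "lon": -71.1, "when": date(2020, 3, 1), "temp_c": 4.0},
    {"lat": 42.4, "lon": -71.1, "when": date(2020, 1, 1), "temp_c": -2.0},
    {"lat": 42.4, "lon": -71.1, "when": date(2020, 2, 1), "temp_c": 0.5},
]


def monthly_changes(records):
    """Sort the records by time, then difference consecutive temperatures."""
    ordered = sorted(records, key=lambda r: r["when"])
    return [
        later["temp_c"] - earlier["temp_c"]
        for earlier, later in zip(ordered, ordered[1:])
    ]


print(monthly_changes(rows))  # → [2.5, 3.5]
```

If we skipped the sort and differenced the rows as they arrived, we would
get a drop of six degrees followed by a rise, which is not what happened.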
And this is very important,
because, of course,
anytime you're thinking about
doing any kind of sensing
in any kind of sensing
material or anything like that,
you get time series data.
It's the most common
type of ordered data,
and we'll talk a lot during the
boot camp,
especially during the back half
of the boot camp,
about how we
handle time series data.
