Hi there, I am Matt Williams, and I am one
of the evangelists here at Datadog.
Welcome to the next video in our series called
Datadog 101.
In the previous videos, we looked at working
with the different components of the application
and bringing data into Datadog with the agent
and integrations.
In this video, we look at the different ways
you can make use of the data you are collecting.
I cover creating dashboards, the graphing
options available, how to share dashboards,
and much more.
In the first video, I talked briefly about the
two types of dashboards:
screen boards and time boards.
The first dashboard we will create is a time
board.
The most common type of visualization is a
Time Series Line Graph.
Just drag and drop the Time series widget
onto the dashboard to add it.
Timeboards have a very specific structure,
so you can only add the widget to the grid
of graphs.
If you want to have more flexibility with
the placement of graphs, you will love the
screen boards that we cover later.
So a time series line graph displays a line
for each group of data.
By default, all hosts that report the metric
are combined into a single line.
You can choose any metric collected by your
organization here, as long as the metric has
seen new data in the last 24 hours.
The next dropdown lets you select the subset
of hosts from which to collect data.
In a previous video, you learned about tagging
agents and integrations.
One of the tags I mentioned was role:database.
Here you could choose that same tag and see
metrics only from the hosts running a database.
Next, you can choose whether you want to show
averages, maximums, minimums, or sums of the
values of metrics.
This selection determines how you are going
to aggregate data.
I will go into more detail about aggregation
in an upcoming video.
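As a rough illustration of what this selection means, here is a minimal
Python sketch that combines values from several hosts into one point per
timestamp. The host names and numbers are made up, and the aggregation the
product actually performs is more involved than this.

```python
from statistics import mean

# Hypothetical CPU readings: one list of values per host, aligned by timestamp.
readings = {
    "web-1": [10.0, 20.0, 30.0],
    "web-2": [30.0, 40.0, 50.0],
}

# The four choices in the dropdown, expressed as plain Python functions.
AGGREGATORS = {"avg": mean, "max": max, "min": min, "sum": sum}


def aggregate(series_by_host, how):
    """Combine all hosts into a single line: one value per timestamp."""
    combine = AGGREGATORS[how]
    return [combine(values) for values in zip(*series_by_host.values())]


print(aggregate(readings, "avg"))
print(aggregate(readings, "sum"))
```

The same input gives a different line depending on the choice, which is
why the selection matters once more than one host reports the metric.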
After that, choose how you want to group metrics.
The dropdown lets you choose any of the tags
that have been assigned, either automatically
or by you in configuration files.
For instance, choosing host will give you
a line for each host.
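Taken together, the metric, scope, aggregation, and grouping choices
correspond to a single query string. The helper below is a hypothetical
sketch of how those pieces fit together. The function name and the metric
system.cpu.user are assumptions for illustration, so check the JSON tab of
your own graph for the exact query it generates.

```python
def build_query(metric, aggregator="avg", scope="*", group_by=()):
    """Assemble a query string from the editor's dropdown choices."""
    query = f"{aggregator}:{metric}{{{scope}}}"
    if group_by:
        query += " by {" + ",".join(group_by) + "}"
    return query


# All hosts combined into one line (the default described above).
print(build_query("system.cpu.user"))
# Only database hosts, with one line per host.
print(build_query("system.cpu.user", scope="role:database", group_by=["host"]))
```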
To the right of the grouping, there is a plus
sign where you can add other functions.
I go into all of the math functions in a separate
video later on in this series.
Below that you can change the styling of the
graph lines.
I will go over the main styles soon, like line
vs. bar vs. area, but you can also choose
the color scheme, the line style, and
the stroke thickness.
Sometimes you want to show multiple metrics
on a single graph, or combine metrics mathematically.
To add another metric, click the Add Metric
button below.
If you click the Advanced link to the right,
you can combine metrics using a variety of
mathematical functions.
It can be useful to overlay events on top
of the graph as well.
If this were a graph of CPU utilization of
Docker hosts, then overlaying Docker containers
as they are created and destroyed could be
helpful.
Next to that you can add a marker.
Markers are great to see exactly where the
ideal range is for a metric.
When looking at a large collection of graphs,
it can be easier to understand when something
is good or bad with these markers, especially
when it's on a big screen on the wall.
If you do not title the graph, the system
assigns one to it.
However, it is almost always better to give
the graph a name that makes more sense in
your environment.
Appropriate graph names are useful for those
who don’t spend as much time looking at
your dashboards.
Before we leave this editor, let's take a
look at three other tabs and links.
Immediately to the left of the Edit tab is
JSON.
When you have a bit more expertise in the
product, take a look at creating graphs using
the JSON editor where you have access to features
not available in the GUI like Y axis configuration.
To learn more about those options, see the
Graph Primer link to the left.
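To give a feel for what the JSON tab holds, here is a small Python sketch
that builds a graph definition and prints it as JSON. The field names
(viz, requests, q, type, yaxis) are an assumption about the definition's
shape and are not confirmed by this video, so compare the output against
what the JSON tab shows for one of your own graphs.

```python
import json

# Assumed shape of a timeseries graph definition; verify against the JSON tab.
graph_definition = {
    "viz": "timeseries",
    "requests": [
        {"q": "avg:system.cpu.user{role:database} by {host}", "type": "line"}
    ],
    # Y axis settings are the kind of option described as JSON-only.
    "yaxis": {"min": 0, "max": 100},
}

print(json.dumps(graph_definition, indent=2))
```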
In between JSON and the Graph Primer link
is the Share tab.
This section allows you to create an embed
code to insert this graph into any web page
that accepts an iframe.
Since the embedded version does not inherit
any timeframe and sizing data from the parent
page, you need to set those here.
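Because nothing is inherited from the parent page, the embed code has to
carry its own sizing. The sketch below only illustrates that idea: the URL
is a placeholder and the helper name is hypothetical. The real embed code
is the one generated for you in the Share tab.

```python
from html import escape


def iframe_snippet(src, width=600, height=300):
    """Return an iframe tag with explicit sizing, since nothing is inherited."""
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        f'width="{width}" height="{height}" frameborder="0"></iframe>'
    )


# Placeholder URL: copy the real one from the Share tab.
print(iframe_snippet("https://example.com/embed?token=PLACEHOLDER"))
```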
When you complete making your changes, click
on Save and then create your next graph.
As you are building out your time boards,
you may wonder when one visualization is better
than another.
When is the best time to use a line graph
like the one I just created?
Line graphs are great when reporting a single
metric from different scopes because you can
easily spot outliers.
Metrics from a single source or an aggregate
are also great because you can easily see
the evolution over time.
Related metrics with the same units are also
interesting so that you can spot correlations.
Line graphs also pair well with markers, which
make a clearly acceptable range easy to identify.
But when you start reporting from a lot of
different hosts individually and the metrics
are highly variable, the line graph can get
messy.
Related to the Line Graph is the stacked area
graph.
In fact, this is achieved just by changing
the style of the Line Graph.
Stacked Areas are great when you need to see
the sum as well as the contribution of metrics
in a single graph.
For example, use them to compare load balancer
requests per AZ, or to break total CPU down
into user, system, idle, and stolen.
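The way a stacked area shows both the sum and each contribution can be
sketched in a few lines of Python. The CPU numbers here are invented for
the example. Each layer is drawn on top of the running total of the layers
below it, so the top edge of the last layer is the overall sum.

```python
# Hypothetical CPU breakdown (percent) at three timestamps.
cpu = {
    "user": [30, 40, 50],
    "system": [10, 10, 20],
    "idle": [60, 50, 30],
}


def stack(series):
    """Return the upper edge of each layer; the last layer's edge is the total."""
    running = [0] * len(next(iter(series.values())))
    edges = {}
    for name, values in series.items():
        running = [r + v for r, v in zip(running, values)]
        edges[name] = running
    return edges


edges = stack(cpu)
print(edges["system"])  # top of the system layer, sitting on the user layer
print(edges["idle"])    # top of the stack: the total at each timestamp
```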
One more style of graph available for time
series graphs is the bar graph.
Just choose to display bars instead of lines
or areas.
Bars are excellent for representing counts.
Unlike gauge metrics, which represent an instantaneous
value, count metrics only make sense when
paired with a time interval (e.g., 13 server
errors in the past five minutes).
Bar graphs require no interpolation to connect
one interval to the next, making them especially
useful for representing sparse metrics.
Like area graphs, they naturally accommodate
stacking and summing of metrics.
These are great when showing sparse metrics
like the number of blocked tasks in Cassandra's
internal queues or counts like failed jobs
by data center in 4-hour intervals.
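To make the point about counts and time intervals concrete, here is a
minimal Python sketch that buckets event timestamps into fixed intervals.
The timestamps are made up, and the five-minute width is just the example
interval mentioned above.

```python
from collections import Counter


def count_per_interval(timestamps, interval_seconds=300):
    """Bucket event timestamps (in seconds) into fixed-width intervals."""
    buckets = Counter(
        (ts // interval_seconds) * interval_seconds for ts in timestamps
    )
    return dict(sorted(buckets.items()))


# Hypothetical error timestamps, in seconds since the start of the window.
errors = [12, 45, 290, 310, 900, 905, 1199]
print(count_per_interval(errors))
```

Notice that the interval starting at 600 seconds has no entry at all. A
bar graph simply leaves that slot empty, where a line graph would have to
draw something between the neighboring points.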
Heat maps are also available in Timeboards.
Heat maps show the distribution of values
for a metric evolving over time.
Each column in the chart represents a distribution
of values during a particular time slice.
Each cell's shading in that column corresponds
to the number of entities reporting that value
during that period.
They are great to use when reporting a single
metric by a large number of groups, such as
showing web latency when you have a large
number of web servers.
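As a sketch of what sits behind each column of a heat map, the Python
below bins per-server values for each time slice and counts how many
servers land in each bin. The latency numbers and the 100 ms bin width
are assumptions chosen for the example.

```python
from collections import Counter


def heat_map(samples, bin_width):
    """samples holds one list of per-host values per time slice.

    Returns one Counter per slice, mapping a bin's lower bound to the
    number of hosts that reported a value in that bin.
    """
    return [
        Counter((value // bin_width) * bin_width for value in slice_values)
        for slice_values in samples
    ]


# Hypothetical web latency (ms) from five servers at two time slices.
latency = [
    [110, 120, 130, 310, 115],
    [210, 220, 130, 320, 225],
]
columns = heat_map(latency, bin_width=100)
for column in columns:
    print(dict(sorted(column.items())))
```

A larger count in a cell would be drawn with darker shading, so the shift
of most servers from the 100 ms bin to the 200 ms bin shows up at a glance.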
Another type of graph is the distribution.
Distribution graphs show a histogram of a
metric's value across a segment of your infrastructure.
Each bar represents a range of binned values,
and its height corresponds to the number of
entities reporting values in that range.
Distribution graphs are closely related to
heat maps.
The key difference between the two is that
heat maps show change over time, whereas distributions
are a summary of a time window.
Like heat maps, distributions visualize large
numbers of entities reporting a particular
metric, so they are often used to graph metrics
at the individual host or container level.
Distribution graphs are great if you want
to convey general health or status at a glance
or to see variations across members of a group.
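The difference from a heat map is that there is only one set of bins for
the whole time window. A minimal Python sketch, using invented per-host
latency values and an assumed bin width of 50 ms:

```python
def distribution(values, bin_width):
    """Count how many entities fall into each bin over one time window."""
    counts = {}
    for value in values:
        lower = (value // bin_width) * bin_width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical average latency (ms) per host over the whole window.
per_host_latency = [95, 110, 120, 130, 145, 310]
bin_width = 50
histogram = distribution(per_host_latency, bin_width)
for lower, count in histogram.items():
    print(f"{lower:>4}-{lower + bin_width - 1:<4} {'#' * count}")
```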
Toplists are ordered lists that allow you
to rank hosts, clusters, or any other segment
of your infrastructure by their metric values.
Because they are so easy to interpret, toplists
are useful in high-level status boards.
Compared to single-value summaries, toplists
have an additional layer of aggregation across
space, in that the value of the metric query
is broken out by a group.
Each group can be a single host or an aggregation
of related hosts.
Toplists are perfect when you want to spot
outliers, underperformers, or resource hogs,
or to convey KPIs in an easy to read format.
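The ranking behind a toplist can be sketched as: reduce each group to one
number, then sort. The example below uses made-up memory figures and
averages over the window, which is only one of the possible reductions.

```python
from statistics import mean


def toplist(series_by_group, limit=3):
    """Rank groups by their average value over the window, highest first."""
    ranked = sorted(
        ((group, mean(values)) for group, values in series_by_group.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:limit]


# Hypothetical memory usage (percent) per host.
usage = {
    "db-1": [70, 80],
    "db-2": [95, 97],
    "web-1": [40, 42],
    "cache-1": [60, 61],
}
for group, value in toplist(usage):
    print(group, value)
```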
Change graphs compare a metric’s current
value against its value at a point in the
past.
The key difference between change graphs and
other visualizations is that change graphs
take two different timeframes as parameters:
one for the size of the evaluation window
and one to set the lookback window.
Change graphs can be used to separate metric
trends from periodic baselines, like database
write throughput compared to the same time
last week.
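The two timeframes can be illustrated with a short Python sketch. It is a
simplified model with one data point per day, invented throughput numbers,
and hypothetical parameter names: window is how many recent points to
evaluate, and lookback is how far back the comparison window sits.

```python
from statistics import mean


def change(series, window, lookback):
    """Compare the mean of the last `window` points with the mean of the
    same-sized window that ended `lookback` points earlier."""
    current = mean(series[-window:])
    previous = mean(series[-window - lookback:-lookback])
    difference = current - previous
    return difference, difference / previous * 100


# Hypothetical write throughput, one point per day, two weeks of data.
writes = [100, 110, 120, 115, 105, 60, 50, 130, 140, 150, 145, 135, 80, 75]
absolute, percent = change(writes, window=1, lookback=7)
print(absolute, percent)
```

Comparing against the same day last week keeps the regular weekend dip
from looking like a problem, which is the periodic baseline idea above.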
We saw host maps in the first video, and you
can add a smaller version of those to your
dashboards as well.
Finally, there are query values.
These display the current value of a given
metric query, with conditional formatting
(such as a green/yellow/red background) to
convey whether or not the value is in the
expected range.
The value displayed by a single-value summary
need not represent an instantaneous measurement.
The widget can display the latest value reported,
or an aggregate computed from all query values
across the time window.
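Here is a minimal Python sketch of that behavior: reduce the window to
one number, then choose a background color from thresholds. The threshold
values, the color names, and the function name are assumptions for the
example, not settings taken from the product.

```python
from statistics import mean


def query_value(points, aggregator="last", warn=70, critical=90):
    """Reduce a window of points to one number and pick a background color."""
    value = points[-1] if aggregator == "last" else mean(points)
    if value >= critical:
        color = "red"
    elif value >= warn:
        color = "yellow"
    else:
        color = "green"
    return value, color


# Hypothetical disk usage (percent) over the time window.
disk = [60, 65, 72, 95]
print(query_value(disk, "last"))
print(query_value(disk, "avg"))
```

The same data can land in a different color depending on whether the
widget shows the latest value or an aggregate over the window.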
Earlier in this video, you saw that you could
overlay events on any graph when you are
configuring it.
But you can also do this at analysis time.
If you have an idea of what might have caused
a problem, enter the source or a few words
about it at the top and the events that match
the search are displayed here on the left.
As you hover over each event, you see them
highlighted on the graphs on the right.
When you find something, zoom in to get more
detail.
When you zoom in on any graph, the entire
dashboard updates to reflect the new time
span.
You can use the forward and rewind buttons
at the top to jump ahead and back by the chosen
time period.
Or select one of the time periods from the
dropdown.
Now that you have found something interesting,
annotate it to start the conversation with
your colleagues.
You can have that conversation in HipChat,
Slack, VictorOps, PagerDuty,
or even our event stream.
I will go into more detail about these integrations
in the upcoming notification video.
Now let's switch over to the other dashboard
type, the screen board.
As I mentioned before, screen boards are excellent
general status boards and are perfect for
viewing the overall health and status of a
service or entire infrastructure.
There are a few visualizations available in
screen boards that are not in time boards.
To create a screen board, just create a new
dashboard and choose Screen board.
Screenboards share a lot of the visualizations
available to time boards.
However, the biggest differences are that
you can place and size the graphs anywhere
you like, and you can control the period shown
on a per graph basis.
The first of the visualizations that are only
available on screen boards is the event stream.
The screen board event stream is just like
the event stream page we saw in the first video.
The event timeline is just like the aggregated
section at the top of the event stream page.
Again, we saw that in the first video.
Alert graph and values show the current information
from any configured monitor or alert.
The check status illustrates the status for
any configured check.
The iframe widget lets you embed any iframe
from another web page into this dashboard.
Image, note, and free text are exactly what
they sound like, allowing you to show an image
and different types of text on your screen
board.
Now you know what all the dashboard widgets
do and how to create a dashboard.
One of the benefits of screen boards that
you do not get with time boards is the ability
to create a public URL of the page.
You can then share this URL with others.
The page is a read-only view, and since it's
a screen board, you can add descriptions to
the page explaining how the end user should
interpret the board.
At this point, I want to go back to a more
advanced topic with regards to creating dashboards:
templating.
Let's say you want to have a dashboard that
shows basic stats about your instances,
and you also want a copy of the same dashboard
that focuses on the different roles that you
use, like the database, cache, and web server.
You can create a single dashboard and leverage
Template Variables.
To create a template variable, go to the gear
icon and choose to edit template variables.
Now enter a variable name, such as role.
For Role Group, I choose the role tag because
I have assigned role:database and so forth
to each of my instances.
Since I want to default to all of them, I
can leave the default at *. In each chart,
change the From box to $role and click done.
Now whenever you choose a role, all graphs
that are configured this way update to show
only metrics from the selected role.
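Conceptually, a template variable is a placeholder that gets swapped for
the selected tag in every query that uses it. The Python below is only a
sketch of that substitution, with a hypothetical helper name and example
queries. It is not how the product implements it.

```python
def apply_template(query, variables):
    """Replace each $name in a query with the tag selected on the dashboard."""
    # Substitute longer names first so $role_type is not clobbered by $role.
    for name in sorted(variables, key=len, reverse=True):
        query = query.replace(f"${name}", variables[name])
    return query


template = "avg:system.cpu.user{$role} by {host}"

# "*" mirrors the default in the text: no role filter, so all hosts match.
print(apply_template(template, {"role": "*"}))
# Choosing a role narrows every graph that references $role.
print(apply_template(template, {"role": "role:database"}))
```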
The next couple of videos go into more depth
about two other concepts important to building
great dashboards.
The first will look at the various functions
available.
The second will cover how we aggregate the
data you are looking at in each graph.
That brings us to the end of this video.
In future Datadog 101 videos, I cover the
functions, aggregation methods, monitoring
and maybe a few other things as we go along.
I hope you found this session useful, and
if you have any comments, please let us know.
Thanks so much for watching.
Goodbye.
