Hello, I’m Brian Gaines, a
developer in the Advanced
Analytics division
in R&D at SAS.
In this video, I am
going to show you
how you can use the
tasks in SAS Studio
to quickly develop
code for estimating
a fully connected neural network
with the power of SAS Viya.
The structure of
a neural network
is inspired by the
human brain, in that it
contains multiple
layers of neurons
and can learn complex
relationships through training.
The input layer contains
your data set of interest,
where each unit or
neuron corresponds
to a different variable or
feature in your data set.
To pass this information to
a neuron in the next layer,
the model takes a linear
combination of the input
variables and then uses
an activation function
to transform it, often
in a nonlinear way.
This process is intended
to resemble the firing
of a neuron in your brain.
After this is done
for each neuron
in the first hidden
layer, these new values
then become the inputs
for the next layer,
where this process is repeated.
Eventually, the neurons
in the last hidden layer
are combined and transformed
to generate the output.
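
To make that flow concrete, here is a minimal sketch of a forward pass in plain Python. It is not the code that the task generates. The weights, biases, layer sizes, and the choice of tanh are all hypothetical values picked for illustration.

```python
import math


def dense_layer(inputs, weights, biases, activation):
    """One fully connected layer: a linear combination per neuron, then an activation."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs


def identity(z):
    """Activation that leaves the value unchanged (used here for the output)."""
    return z


def forward(inputs, layers):
    """Pass the inputs through each (weights, biases, activation) layer in turn."""
    values = inputs
    for weights, biases, activation in layers:
        values = dense_layer(values, weights, biases, activation)
    return values


# Hypothetical network: 3 inputs -> 2 hidden neurons (tanh) -> 1 output (identity).
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1], math.tanh),
    ([[1.0, -1.0]], [0.5], identity),
]
prediction = forward([1.0, 2.0, 3.0], layers)
print(prediction)
```

Each hidden neuron's output becomes an input to the next layer, which is the repetition described above.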
As an example, suppose you
are interested in predicting
the disease progression
in diabetes patients using
several baseline factors.
We have data for 442 diabetes
patients and 10 predictor
variables, which include
demographic information
in addition to various
health measures.
I am using a small data set for
the sake of this demonstration,
but the process I
am showing you is
the same for much
larger data sets, which
you can handle with the
distributed computing
power of SAS Viya.
Regardless of how
large your data set is,
the goal here is to
build a model that
lets you predict disease
progression in diabetes
patients.
As I mentioned, I am going to
use the tasks in SAS Studio
to perform this
analysis with SAS Viya.
If you are unfamiliar
with SAS Studio,
I encourage you to visit
the SAS Studio product
page to learn more
about its array
of productivity-boosting
features.
After opening SAS Studio,
you access the tasks
by clicking the Tasks
icon on the left
and then expanding the SAS
Tasks folder to display
the tasks developed by SAS.
As you can see, we’ve developed
tasks that cover a wide range
of analytical areas,
including many for SAS Viya.
Your ability to use these tasks
will depend on your licensing.
For this example, we are
interested in the SAS Viya
Supervised Learning category,
so let’s click to expand that
and then double-click to
open the Neural Network task.
This task, as well as
the other Viya tasks,
requires a connection to a SAS
Cloud Analytic Services server,
which I have already made.
When you open a
task, on the left
you have the settings
and options for the task,
and at the bottom
there is a task console
that lists the items
you must address
before you can run the task.
On the right, you
have the work area,
which initially consists
of comments in the code
because so far the minimum
requirements for the task
have not been met.
The first thing you need
to specify is your data set,
so let’s select the DIABETES
table that was already loaded
into memory.
When you’re building a
machine learning model,
it is a good practice to
use a validation set to tune
the model and a test set to
assess how the model might
perform on new data.
To do this, you first select
Validation data and Test data
and then specify the
proportion of your data
to be allocated to
those data sets.
Let’s use 0.3 for the validation
set and 0.1 for the test set.
For the random number seed,
let’s use today’s date.
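
As a rough illustration of what those proportions mean for 442 rows, here is a small sketch in plain Python. It is not how the task itself partitions the data. The shuffle-and-slice approach and the seed value 20240101 (standing in for a date) are assumptions made for this example.

```python
import random


def partition(n_rows, validation_fraction, test_fraction, seed):
    """Randomly split row indices into training, validation, and test sets."""
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_validation = round(n_rows * validation_fraction)
    n_test = round(n_rows * test_fraction)
    validation = indices[:n_validation]
    test = indices[n_validation:n_validation + n_test]
    train = indices[n_validation + n_test:]  # everything left over
    return train, validation, test


# Hypothetical seed written as a date; 442 rows as in the diabetes data.
train, validation, test = partition(442, 0.3, 0.1, seed=20240101)
print(len(train), len(validation), len(test))
```

Whatever is not assigned to validation or test, about 60 percent here, is left for training the model.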
Now that your data
are set up, you
need to assign roles
to the variables that
will be used in your model.
In this case, you're using
an interval target variable,
so let’s select that.
The target variable is the
disease progression metric,
which is labeled as y in the
data set, so let’s choose y.
For the interval inputs
or continuous predictors,
you want to include all
10 remaining variables
in your data set, so let’s
use this shortcut at the top
to select all 10 predictors
and then click OK.
After making that selection, you
can see that the code has been
automatically generated in the
code window since the task’s
requirements are now met.
You can see the target
statement with y as the target
variable in addition to an input
statement with the variables
that you just selected.
Generally, the
tasks are designed
so that you need to make
only a few selections to be
able to run the task,
but the tasks also
provide you with
a lot more options
to customize the analysis.
For example, on the
OPTIONS tab, you
have the ability to specify
the network architecture.
You can add hidden layers
by clicking the Plus button
on the hidden layers table.
Notice that after you do that,
the code in the code window
automatically updates in real
time to reflect this change,
because there is now a second
hidden statement corresponding
to a second hidden layer.
This automatic code generation
can be a useful tool
for learning SAS syntax.
You can also specify
the number of neurons
in each hidden layer, and
you can choose between seven
different activation functions.
For now, let’s stick with
the defaults and click Submit
to run the task.
In the results, let’s scroll
down to see how the model
performed on the validation set.
Using the naive
network architecture,
the mean square error
on the validation set
is well over 5000, which is not
very good for this data set.
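
For reference, the mean square error is the average of the squared differences between the actual and predicted target values. Here is a minimal sketch in plain Python. The three actual and predicted values are made up for illustration and are not taken from the diabetes data.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up values, chosen only to show the arithmetic.
actual = [150.0, 200.0, 100.0]
predicted = [100.0, 250.0, 200.0]
mse = mean_squared_error(actual, predicted)
print(mse)
```

Because the differences are squared, a mean square error of 5000 corresponds to a typical error of roughly 70 units of the target variable, which is the square root of 5000.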
Although neural networks are
very flexible and powerful
models, one of the biggest
challenges with them
is specifying the
architecture and some
of the other
optimization options,
or the hyperparameters.
This can be painstakingly
done through a lot of trial
and error, but
fortunately SAS has
something called autotuning
that can help you with this.
Autotuning uses an intelligent
optimization-based approach
to help you find good values
for your hyperparameters.
At the top of the OPTIONS tab,
let’s select “automatically
tune selected options”
to enable autotuning.
You can tune the number of
hidden layers in the model
and the number of neurons
in each hidden layer.
One nice thing
about autotuning is
that you can use your domain
expertise and other information
specific to your problem
to impose constraints
that guide autotuning to
consider only certain values
for the hyperparameters.
For example, suppose
that, for some reason,
you would like
the final model
to have at least
2 hidden layers.
You could enforce this by
increasing the initial value
and lower bound to 2.
Let's also increase
the upper bound to 5.
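
To show how bounds like these restrict a search, here is a small sketch in plain Python. It uses a simple random search, which is a stand-in and not the optimization-based approach that autotuning uses. The neuron bounds of 1 to 20, the number of trials, and the toy_score function are all hypothetical; in practice the score would come from training a model and measuring its validation error.

```python
import random


def sample_architecture(rng, layer_bounds, neuron_bounds):
    """Draw a number of hidden layers and a neuron count for each, within bounds."""
    n_layers = rng.randint(*layer_bounds)  # randint includes both endpoints
    return [rng.randint(*neuron_bounds) for _ in range(n_layers)]


def random_search(score, n_trials, layer_bounds, neuron_bounds, seed):
    """Keep the sampled architecture with the lowest score."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("inf")
    for _ in range(n_trials):
        arch = sample_architecture(rng, layer_bounds, neuron_bounds)
        current = score(arch)
        if current < best_score:
            best_arch, best_score = arch, current
    return best_arch, best_score


def toy_score(arch):
    """Hypothetical placeholder for 'train the network, return validation error'."""
    return abs(sum(arch) - 25)


# Hidden layers constrained to 2..5, as in the example above.
best_arch, best_score = random_search(toy_score, 50, (2, 5), (1, 20), seed=1)
print(best_arch, best_score)
```

Raising the lower bound to 2 means that no single-layer architecture is ever considered, no matter how it would have scored.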
Let’s use the default values
for the number of neurons per
layer.
In the Optimization section,
you have the ability
to tune the hyperparameters
that govern the L1 and L2
regularization terms.
Let’s also stick with the
default values for those
options.
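
As a reminder of what those two terms do, here is a minimal sketch in plain Python of a loss with L1 and L2 penalties added. This is the common textbook form and is an assumption here; the exact formulation and scaling that the software uses might differ. The weights and penalty strengths are hypothetical.

```python
def penalized_loss(base_loss, weights, l1, l2):
    """Add an L1 penalty (absolute values) and an L2 penalty (squares) to a loss."""
    l1_term = l1 * sum(abs(w) for w in weights)
    l2_term = l2 * sum(w * w for w in weights)
    return base_loss + l1_term + l2_term


# Hypothetical weights and penalty strengths.
weights = [1.0, -2.0, 3.0]
loss = penalized_loss(1.0, weights, l1=0.1, l2=0.01)
print(loss)
```

Larger values of either hyperparameter push the weights toward zero, which can help reduce overfitting.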
The Autotune settings section
contains various options that
let you control how the
autotuning is performed,
but again, let’s
use the defaults.
Since autotuning
trains the model
for many different combinations
of your tuning parameters,
this process can take
significantly longer
than training a single model.
Instead of waiting
for the code to run,
let’s jump ahead and look at the
results I saved when I did this
earlier.
Let’s scroll down to see
the autotuning results.
The second autotuning
table displays
the results using the
initial values for the tuning
parameters, in addition
to the 10 combinations
of the hyperparameters
that resulted
in the best mean square
error on the validation set.
For example, a
network architecture
with 9 neurons in the
first hidden layer,
14 in the second layer, and
a small value for the L2
regularization parameter
resulted in a validation error
of about 2600.
This is about half
of the error you saw
when using the initial model.
Autotuning is a
great feature that
makes it easier to build your
model by automatically finding
good values for your
hyperparameters.
As you can see, the
tasks in SAS Studio
enable you to quickly
and easily develop
code for building a traditional
neural network model.
For more information, please
visit the SAS Visual Data
Mining and Machine Learning
product page and the SAS Studio
Task Reference Guide.
Thanks for watching.
