The GAMMOD procedure (PROC GAMMOD) and the Generalized Additive Model (GAM) node in SAS Model Studio build generalized additive models. GAMs can model responses that are not normally distributed and relationships that are not linear. This makes them quite useful! This post will show you how easy they are to use in SAS Model Studio.
Example Use Case:
GAMs versus GLMs versus Linear Regression
Recall the assumptions of ordinary least squares simple linear regression models:
Linearity: The relationship between the outcome and the features is linear in the parameters.
From Brian Gaines https://blogs.sas.com/content/subconsciousmusings/2022/03/24/accuracy-versus-interpretability-with-g...
Normality: Errors follow a normal distribution.
IID errors: Error terms are independent and identically distributed.
Homoscedasticity: Errors have equal (constant) variance.
In practice, we often do not have normality. The response may follow another distribution such as Poisson, negative binomial, gamma, or Tweedie. But not to worry. In these cases we can use generalized linear models (GLMs) or generalized additive models (GAMs). GLMs still assume that the link-transformed mean of the response is a linear function of the features, but GAMs do not. GAMs relax this assumption by replacing linear terms with spline terms. Thus GAMs are even more general than GLMs!
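For readers who like to see the comparison in equation form, here is a sketch in standard textbook notation (the symbols are my assumptions, not taken from the SAS documentation): y is the response, x_1, ..., x_p are the features, g is the link function, and each f_j is a smooth function estimated from the data.

```latex
\begin{aligned}
\text{Linear regression:}\quad & y = \beta_0 + \sum_{j=1}^{p} \beta_j x_j + \varepsilon,
  \qquad \varepsilon \sim N(0, \sigma^2) \\
\text{GLM:}\quad & g\big(E[y]\big) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j \\
\text{GAM:}\quad & g\big(E[y]\big) = \beta_0 + \sum_{j=1}^{p} f_j(x_j)
\end{aligned}
```

The only change from GLM to GAM is that each linear term \beta_j x_j may be replaced by a smooth term f_j(x_j). A GAM can also mix the two, keeping some features linear and giving others a spline.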
Another perk of GAMs is that they tend to have less bias than GLMs, because they do not force a linear form on each feature. However, GAM estimates tend to converge more slowly than GLM estimates as the number of observations increases.
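To give a feel for how a spline term can capture a non-linear shape while the model stays linear in its coefficients, here is a minimal Python sketch. It is an illustration only: it uses a simple truncated-linear ("hinge") basis that I chose for readability, not the spline construction that PROC GAMMOD uses, and the names `hinge_basis` and `smooth_term` as well as the knots and coefficients are hypothetical.

```python
def hinge_basis(x, knots):
    """Return the basis values [1, x, max(0, x - k1), max(0, x - k2), ...] for one x."""
    return [1.0, x] + [max(0.0, x - k) for k in knots]


def smooth_term(x, coefs, knots):
    """Evaluate f(x) as a weighted sum of basis functions (linear in the coefficients)."""
    return sum(c * b for c, b in zip(coefs, hinge_basis(x, knots)))


# Hypothetical knots and coefficients, picked by hand for illustration:
# slope +1 up to x = 2, slope -1 between 2 and 4, slope +1 after 4.
knots = [2.0, 4.0]
coefs = [0.0, 1.0, -2.0, 2.0]

values = [smooth_term(x, coefs, knots) for x in [0, 1, 2, 3, 4, 5]]
print(values)
```

The resulting curve bends at each knot, yet fitting it only requires estimating the coefficients, just as in a linear model. A real GAM fit also penalizes wiggliness so that the smooth term does not overfit.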
PROC GAMMOD versus PROC GAMPL
PROC GAMMOD (SAS Viya) is very similar to PROC GAMPL (SAS/STAT SAS 9 high performance) in features, options and results. PROC GAMMOD does provide additional functionality by supporting:
The link function also differs between the procedures. PROC GAMMOD uses a log link for both the Gamma and Inverse Gaussian distributions. PROC GAMPL uses the reciprocal for a Gamma distribution and the reciprocal square for Inverse Gaussian distributions, as shown in the table below.
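To make the difference in link functions concrete, here is a minimal Python sketch of the three links named above and which procedure uses which, according to the paragraph above. This is not SAS code, and the names `LINKS`, `PROC_LINKS`, `to_link_scale`, and `to_mean_scale` are hypothetical helpers introduced only for illustration.

```python
import math

# Each link is a pair: (g, g_inverse).
# g maps the mean mu to the linear-predictor scale; g_inverse maps back.
LINKS = {
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "reciprocal": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
    "reciprocal_square": (lambda mu: 1.0 / mu ** 2, lambda eta: 1.0 / math.sqrt(eta)),
}

# Link used by each procedure for each distribution, as stated in the text.
PROC_LINKS = {
    ("GAMMOD", "gamma"): "log",
    ("GAMMOD", "inverse_gaussian"): "log",
    ("GAMPL", "gamma"): "reciprocal",
    ("GAMPL", "inverse_gaussian"): "reciprocal_square",
}


def to_link_scale(proc, dist, mu):
    """Apply the link that the given procedure uses for the given distribution."""
    link, _ = LINKS[PROC_LINKS[(proc, dist)]]
    return link(mu)


def to_mean_scale(proc, dist, eta):
    """Invert the link to get back to the scale of the response mean."""
    _, inverse = LINKS[PROC_LINKS[(proc, dist)]]
    return inverse(eta)


# The same mean lands on a different linear-predictor scale in each procedure.
for (proc, dist), name in PROC_LINKS.items():
    print(proc, dist, name, to_link_scale(proc, dist, 2.0))
```

Because the two procedures model the mean on different scales, the fitted smooth terms are not directly comparable between PROC GAMMOD and PROC GAMPL for these two distributions, even on the same data.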
PROC GAMMOD versus PROC GAM
PROC GAM (SAS/STAT SAS 9) supports Gaussian (normal), binomial, Poisson, Gamma and inverse Gaussian distributions. PROC GAMMOD (SAS Viya) supports all of these PLUS negative binomial and Tweedie distributions. In addition, PROC GAMMOD and PROC GAM work very differently. Do not expect results to be similar. See some of the differences detailed in the table below.
Source: https://go.documentation.sas.com/doc/en/pgmsascdc/default/casstat/casstat_gammod_overview02.htm
The Generalized Additive Model node became available in SAS Model Studio in 2020.1.4. GAM is now included in the Advanced Template for an Interval Target in SAS Model Studio.
Defaults for the GAM node are:
as shown in the screen capture below.
Other distributions available are Gamma, inverse Gaussian, negative binomial, Poisson and Tweedie.
The default boosting options are:
By default, smoothing plots are displayed as individual plots (up to 10 plots) and grouped as one report.
If you automatically generate a pipeline in SAS Model Studio, it will consider a GAM model.
You can, in fact, force it to include a GAM model in the pipeline, if you so desire.
Here is the resulting pipeline for my data (HEART data) when I forced it to include GAM.
With my data the GAM model did better than Forest or Linear Regression. However, Gradient Boosting and the Ensemble model did better than GAM.
Summary
Generalized additive models are useful when you have neither linearity nor normality. SAS Model Studio makes it very easy to add a GAM model using the GAM node. SAS Model Studio now includes GAM in its Advanced Template with an Interval Target. SAS Model Studio also considers GAM in its automatically generated pipeline if you have an interval target.
Because it is so easy to add a GAM model in SAS Model Studio, you can always include a GAM to see if it outperforms your other models.
For More Information
Find more articles from SAS Global Enablement and Learning here.