Baltaf
Calcite | Level 5

Hi 

I have three questions:

1- I am reading the PROC GAMPL documentation (https://support.sas.com/documentation/onlinedoc/stat/141/hpgam.pdf).

On page 2918 (Chapter 42: The GAMPL Procedure), in the table Output 42.2.8 (Tests for Smoothing Components), what is the difference between the EDF for the fit and the EDF for the test?

 

2- On page 2925 (Example 42.3: Nonparametric Negative Binomial Model for Mackerel Egg Density), in the figure Output 42.3.4 (Smoothing Components Panel), how should I interpret the contour plot?

3- Can I calculate the confusion matrix for a generalized additive model fitted with PROC GAMPL?

Accepted Solutions
StatDave
SAS Super FREQ

See "Tests for Smoothing Components" and "Degrees of Freedom" in the Details section of the documentation, which explain the two degrees of freedom.

The contour plot shows how the bivariate spline term for Latitude and Longitude, which the model uses to predict the response, changes as a function of Latitude and Longitude.

When you ask about the confusion matrix, I assume you mean for a model on a binary response variable, as in the second (logistic) example. You can do that by requesting the predicted probabilities of the event in the OUTPUT statement, as shown in the second run of GAMPL in the second example. Then, using the output data set, create a categorical predicted response by comparing the predicted probability to whatever cutoff value you want to use, and cross-tabulate it against the observed response. For example, if the cutoff is to be 0.5:

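/* Fit the logistic GAM; OUT= receives the predicted event probability (Pred) */
proc gampl data=DiabetesStudy seed=12345;
   model Diabetes(event='1') = spline(Glucose)
                               spline(Pedigree) spline(Age) / dist=binary;
   output out=out;
   id Diabetes Test;   /* carry the observed response into the output data set */
run;

/* Classify an observation as an event (1) when Pred exceeds the cutoff */
data out;
   set out;
   pd = (pred > 0.5);
run;

/* Cross-tabulate predicted by observed response: the confusion matrix */
proc freq data=out;
   table pd*Diabetes;
run;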
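
If you also want summary rates from that table, the following is a rough sketch rather than anything taken from the documentation. It assumes the OUT data set created above contains Pred (the predicted event probability) and Diabetes coded 0/1; the macro variable cutoff and the column names TruePos, FalsePos, FalseNeg, TrueNeg, Sensitivity, and Specificity are made up for illustration.

```sas
/* Hypothetical cutoff; change it to whatever suits your application */
%let cutoff = 0.5;

proc sql;
   /* A logical expression evaluates to 1 or 0 in PROC SQL, so SUM counts matches */
   select sum(Pred >  &cutoff and Diabetes = 1) as TruePos,
          sum(Pred >  &cutoff and Diabetes = 0) as FalsePos,
          sum(Pred <= &cutoff and Diabetes = 1) as FalseNeg,
          sum(Pred <= &cutoff and Diabetes = 0) as TrueNeg,
          calculated TruePos / (calculated TruePos + calculated FalseNeg) as Sensitivity,
          calculated TrueNeg / (calculated TrueNeg + calculated FalsePos) as Specificity
   from out
   /* Drop rows with no prediction; a missing Pred would otherwise count as <= cutoff */
   where Pred is not missing;
quit;
```

Rerunning this with different values of cutoff shows how sensitivity and specificity trade off against each other as the cutoff moves.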

