Baltaf
Calcite | Level 5

Hi,

I have three questions.

1- I am reading the documentation for PROC GAMPL (https://support.sas.com/documentation/onlinedoc/stat/141/hpgam.pdf). On page 2918 (Chapter 42: The GAMPL Procedure), in the table "Output 42.2.8 Tests for Smoothing Components", what is the difference between the EDF for the fit and the EDF for the test?

2- On page 2925 (Example 42.3: Nonparametric Negative Binomial Model for Mackerel Egg Density), in the figure "Output 42.3.4 Smoothing Components Panel", how should I interpret the contour plot?

3- Can I calculate the confusion matrix for a generalized additive model fitted with PROC GAMPL?

Accepted Solutions
StatDave
SAS Super FREQ

See "Tests for Smoothing Components" and "Degrees of Freedom" in the Details section of the documentation, which explain the two degrees of freedom.

The contour plot shows how the bivariate spline for Latitude and Longitude, which the model uses to predict the response, changes as a function of Latitude and Longitude.

When you ask about the confusion matrix, I assume you mean for a model with a binary response variable, as in the second (logistic) example. You can get it by requesting the predicted probabilities of the event with the OUTPUT statement, as shown in the second run of GAMPL in the second example. In the output data set, create a categorical predicted response by comparing the predicted probability to whatever cutoff value you want to use, then cross-tabulate it against the observed response. For example, if the cutoff is to be 0.5:

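proc gampl data=DiabetesStudy seed=12345;
   model Diabetes(event='1') = spline(Glucose)
                               spline(Pedigree) spline(Age) / dist=binary;
   output out=out;      /* save the predicted event probability (Pred) */
   id Diabetes Test;    /* carry the observed response into OUT */
run;

/* Classify each observation using a 0.5 cutoff */
data out;
   set out;
   pd = (pred > 0.5);
run;

/* Predicted class by observed class: the confusion matrix */
proc freq data=out;
   table pd*Diabetes;
run;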
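
If you also want summary rates from that table, the following is a minimal sketch and not part of the documentation example. It assumes the data set OUT created above, with pd as the 0/1 predicted class and Diabetes as a numeric 0/1 observed response. The column names TP, TN, FP, FN, Accuracy, Sensitivity, and Specificity are just illustrative labels.

```sas
/* Sketch: confusion-matrix counts and rates from data set OUT.        */
/* Assumes pd and Diabetes are both numeric 0/1 variables.             */
proc sql;
   select sum(pd=1 and Diabetes=1)                   as TP,
          sum(pd=0 and Diabetes=0)                   as TN,
          sum(pd=1 and Diabetes=0)                   as FP,
          sum(pd=0 and Diabetes=1)                   as FN,
          mean(pd=Diabetes)                          as Accuracy,
          sum(pd=1 and Diabetes=1) / sum(Diabetes=1) as Sensitivity,
          sum(pd=0 and Diabetes=0) / sum(Diabetes=0) as Specificity
   from out;
quit;
```

Each comparison evaluates to 1 or 0 per row, so summing it counts the rows in that cell and averaging it gives a proportion. Changing the cutoff in the DATA step and rerunning this query shows how the rates trade off against each other.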

