Solved: PROC GAMPL

Baltaf · Posted 07-07-2021 01:15 PM

Hi

I have three questions.

1. I am reading the documentation for PROC GAMPL (https://support.sas.com/documentation/onlinedoc/stat/141/hpgam.pdf). On page 2918 (Chapter 42: The GAMPL Procedure), in the table "Output 42.2.8 Tests for Smoothing Components", what is the difference between EDF for the fit and EDF for the test?

2. On page 2925 (Example 42.3: Nonparametric Negative Binomial Model for Mackerel Egg Density), in the figure "Output 42.3.4 Smoothing Components Panel", how can I interpret the contour plot?

3. Can I calculate the confusion matrix for a generalized additive model using PROC GAMPL?

StatDave · Posted 07-07-2021 03:01 PM

See "Tests for Smoothing Components" and "Degrees of Freedom" in the Details section, which explain the two degrees of freedom.

The contour plot shows how the bivariate spline for Latitude and Longitude, which the model uses to predict the response, changes as a function of Latitude and Longitude.

When you ask about the confusion matrix, I assume you mean for a model on a binary response variable, as in the second (logistic) example. You can do that by requesting the predicted probabilities of the event with the OUTPUT statement, as shown in the second run of GAMPL in the second example. Then, using the output data set, create a categorical predicted response by comparing the predicted probability to whatever cutoff value you want to use. For example, if the cutoff is to be 0.5:

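proc gampl data=DiabetesStudy seed=12345;
   /* Diabetes is the 0/1 response; it is also kept by ID and tabulated below */
   model Diabetes(event='1') = spline(Glucose)
                               spline(Pedigree) spline(Age) / dist=binary;
   output out=out;      /* Pred holds the predicted probability of the event */
   id Diabetes Test;    /* carry these variables into the output data set */
run;

/* Classify each observation using a 0.5 cutoff */
data out;
   set out;
   pd = (pred > 0.5);
run;

/* Cross-tabulate the predicted class against the observed response */
proc freq data=out;
   tables pd*Diabetes;
run;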
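
As a follow-up sketch, the same output data set can give a separate confusion matrix for each level of Test, with row percentages that read as sensitivity and specificity. This assumes that Diabetes is a numeric 0/1 variable, that Pred is the predicted event probability written by the OUTPUT statement, and that Test is an indicator flagging holdout observations. The names scored and PredClass and the 0.5 cutoff are illustrative only.

```sas
/* Sketch under stated assumptions: OUT comes from the PROC GAMPL step above. */
data scored;                         /* hypothetical data set name */
   set out;
   /* A missing Pred would compare as false and be counted as class 0,
      so leave the predicted class missing instead. */
   if missing(Pred) then PredClass = .;
   else PredClass = (Pred > 0.5);    /* illustrative cutoff */
run;

proc freq data=scored;
   /* One 2x2 table of observed by predicted for each level of Test.
      Row percentages: the Diabetes=1, PredClass=1 cell is sensitivity;
      the Diabetes=0, PredClass=0 cell is specificity. */
   tables Test*Diabetes*PredClass / nopercent nocol;
run;
```

Changing the cutoff shifts the balance between sensitivity and specificity, so it may be worth rerunning the two steps with a few different values.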

