Solved: Values of x-variable used in effectplot

sas_epi · Posted 09-19-2019 12:24 PM

Hello..

I'm running multiple multivariable linear regression models (same set of covariates, changing the primary predictor) using proc glm, and then using the effectplot statement in proc plm to plot the models. I would also like to have a single plot with all models overlaid on it. After reading through posts here, I figured that the best option for me was to output the predicted data from within proc plm, merge all of the outputs, and then plot them using sgplot. I'm running my glm on an N of 90 and have 10 variables in the model: 2 are continuous, the rest are categorical. There are no missing data in this file.

Things were moving reasonably well until I looked at the data output from proc plm. The 'FitPlot' output data set has an N of 200. I recognize the first and last values of my primary predictor X variable. Based on some of the posts here, I think that the procedure is taking a set of values from the range of the X variable and using them in the model to predict the outcome. Is this the case? If so, is this what causes the increase in N from the original data?

I feel like I understand why the actual x values are not used (but would like to confirm this): the resulting predictions would be a 'jittered' scatter of points and would produce a jagged line. Is this so?

Thank you!

Rick_SAS · Posted 09-22-2019 07:11 AM

When you create an effect plot for a continuous variable, SAS procedures evaluate the regression model on an evenly spaced grid over the range of the X variable (ph1, I guess). By default, I think 201 points are used, but you say 200, so I might be wrong.

When you overlay the predicted values, each model (ph1, ph2, etc.) will have 200 (or so) points.

Why? Because you didn't provide a SCORE data set, so the procedure assumes you want to score on the range of the data.

It is not related to avoiding a "jittered scatter of points" or a "jagged line."
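
To illustrate the SCORE alternative mentioned above, here is a minimal sketch rather than a definitive recipe. It assumes a hypothetical data set `mygrid` that you build yourself, containing ph1 at the values you want to predict at plus every other predictor in the model set to the levels you choose. The names `mygrid`, `ph1_scored`, and `pred_totscore` are illustrative only; `ph1pred` is the item store created in the original poster's proc glm step.

```sas
/* Sketch: score the stored model on user-supplied X values instead of  */
/* the default evenly spaced grid. MYGRID is a hypothetical data set    */
/* that must contain every predictor used in the MODEL statement.       */
proc plm restore=ph1pred;
   score data=mygrid out=ph1_scored predicted=pred_totscore;
run;
```

With this approach the output has one row per row of `mygrid`, so the number of predicted points is whatever you supply rather than the grid size chosen by the effect plot.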

ballardw · Posted 09-19-2019 05:28 PM

You may get a better answer if you can show the code for one of the regressions.

sas_epi · Posted 09-19-2019 06:02 PM

Thank you for your comment, ballardw. The code is below. This works and I get the plot that I want. My confusion is about the difference between the N that the glm model uses (N=90) and the N of the data that plm outputs (N=200).

proc glm data=finaldata plots=(diagnostics residuals(smooth));
class year (ref="2011") gender(ref="F") site(ref="DU") bmi_cat (ref="Normal") matrace (ref="NH White") education (ref="Bachelors or higher") income (ref="100- <200 K") parity(ref="1");
model totscore= ph1 site bmi_cat matrace education income matage gender year parity/solution CLPARM;
store ph1pred;
run;
quit;

proc plm restore= ph1pred;
effectplot fit(x= ph1) / at(gender="F") at(site="DU") at(matrace="NH White") at(bmi_cat="Normal") at(education="Bachelors or higher") at(income="100- <200 K") at(parity="1") at(year="2011");
ods output FitPlot= ph1pred;
run;

data ph1pred;
set ph1pred (keep= _XCONT1 _PREDICTED);
rename _XCONT1= ph1;
rename _PREDICTED= totscore;
run;

... After running the same model 14 times with different 'ph' variables and producing the output data each time, I merged the outputs to get one data set with all of the predicted data. Then I plot as below:

proc sgplot data= finalpred;
series x=ph1 y=totscore1 ;
.
.
.
series x=ph14 y=totscore14;

yaxis grid values=(1.5 to 4.5 by .5) label="Predicted score";
xaxis label="ph";
title "Predicted plot of ph1-ph14";
run;
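
The repeat-and-merge step described above could also be sketched as a macro loop that stacks the 14 effect-plot grids in long form and overlays them with a GROUP= option. This is only one possible layout, not necessarily how the merge was actually done. The macro name `fit_all`, the item stores `phstore1`-`phstore14`, the data sets `grid1`-`grid14` and `allpred`, and the variable `model` are all hypothetical names introduced for illustration. It assumes the predictors are named ph1 through ph14 and that ODS Graphics is enabled so the FitPlot table is produced.

```sas
/* Sketch only: fit one model per predictor, save each effect-plot grid, */
/* and stack the grids in long form for a single overlaid plot.          */
%macro fit_all(n=14);
   %do i = 1 %to &n;
      proc glm data=finaldata;
         class year(ref="2011") gender(ref="F") site(ref="DU")
               bmi_cat(ref="Normal") matrace(ref="NH White")
               education(ref="Bachelors or higher")
               income(ref="100- <200 K") parity(ref="1");
         model totscore = ph&i site bmi_cat matrace education income
                          matage gender year parity / solution;
         store phstore&i;            /* hypothetical item store name */
      run;
      quit;

      proc plm restore=phstore&i;
         effectplot fit(x=ph&i) / at(gender="F") at(site="DU")
                    at(matrace="NH White") at(bmi_cat="Normal")
                    at(education="Bachelors or higher")
                    at(income="100- <200 K") at(parity="1") at(year="2011");
         ods output FitPlot=grid&i;  /* grid of X values and predictions */
      run;

      data grid&i;
         length model $8;
         set grid&i(keep=_XCONT1 _PREDICTED);
         model = "ph&i";             /* label the rows by predictor */
      run;
   %end;

   /* Stack all grids: one row per (model, grid point) */
   data allpred;
      set grid1-grid&n;
   run;
%mend fit_all;

%fit_all(n=14)

proc sgplot data=allpred;
   series x=_XCONT1 y=_PREDICTED / group=model;
   xaxis label="ph";
   yaxis grid label="Predicted score";
run;
```

Stacking in long form avoids needing 14 pairs of x and y variables and 14 SERIES statements, at the cost of putting all predictors on a shared X axis, which only makes sense if the ph variables are on comparable scales.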

sas_epi · Posted 09-22-2019 07:53 AM

Thank you Rick, that makes sense to me.
