Re: Values of x-variable used in effectplot

Posted 09-19-2019 12:24 PM
(972 views)

Hello,

I'm running multiple multivariable linear regression models (same set of covariates, changing the primary predictor) using PROC GLM and then using the EFFECTPLOT statement in PROC PLM to plot the models. I would also like a single plot with all models overlaid on it. After reading through posts here, I figured that the best option for me was to output the predicted data from PROC PLM, merge the data sets, and then plot them using SGPLOT. I'm running my GLM on an N of 90 and have 10 variables in the model: 2 are continuous and the rest are categorical. There are no missing data in this file.

Things were moving reasonably well until I looked at the data output from PROC PLM. The 'FitPlot' output data set has an N of 200. I recognize the first and last values of my primary predictor (the X variable). Based on some of the posts here, I think the procedure takes a set of values of the X variable and uses them in the model to predict the outcome. Is this the case? If so, does this explain the increase in N from the original data?

I feel like I understand why the actual x values are not used, but would like to confirm this: the resulting predictions would be a 'jittered' scatter of points and would produce a jagged line. Is this so?

Thank you!

1 ACCEPTED SOLUTION

When you create an effect plot for a continuous variable, SAS procedures evaluate the regression model on an evenly spaced grid for the range of the X variable (ph1, I guess). By default, I think 201 points are used, but you say 200, so I might be wrong.
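One way to confirm the evenly spaced grid is to look at the spacing of the x values in the FitPlot output. The sketch below assumes the data set `ph1pred` from the code posted in this thread, after the DATA step that renames `_XCONT1` to `ph1`. The names `gridcheck` and `step` are made up for illustration.

```sas
/* Assumes ph1pred holds the renamed FitPlot output (variables ph1 and totscore) */
data gridcheck;                /* hypothetical data set name */
    set ph1pred;
    step = dif(ph1);           /* difference from the previous row; missing on the first row */
run;

/* N for ph1 is the number of grid points; MIN and MAX of step show the spacing */
proc means data=gridcheck n min max;
    var ph1 step;
run;
```

If the grid is evenly spaced, the minimum and maximum of `step` should agree up to rounding, and the minimum and maximum of `ph1` should match the range of the predictor in the original data.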

When you overlay the predicted values, each model (ph1, ph2, etc.) will have 200 (or so) points.

Why? Because you didn't provide a SCORE data set, so the procedure assumes you want to score on the range of the data.

It is not related to avoiding a "jittered scatter of points" or a "jagged line."
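If you want to choose the x values yourself, a rough sketch of supplying a scoring data set through the SCORE statement in PROC PLM is shown below. It reuses the names `finaldata` and `ph1pred` from the code posted in this thread. Everything else is an assumption for illustration: the data set names `stats`, `grid`, and `ph1score`, the choice of 50 grid points, and holding `matage` at its sample mean (which I believe matches what EFFECTPLOT does by default for a continuous covariate, but please verify). The quoted values assume the stored values of the CLASS variables match the levels used in the AT options. If a variable is a formatted numeric, assign its underlying value instead.

```sas
/* Range of ph1 and mean of matage from the analysis data */
proc means data=finaldata noprint;
    var ph1 matage;
    output out=stats(drop=_type_ _freq_)
        min(ph1)=ph1_min max(ph1)=ph1_max mean(matage)=matage_mean;
run;

data grid;                     /* hypothetical scoring data set */
    /* Inherit variable types and lengths from the analysis data */
    if 0 then set finaldata(keep=ph1 matage gender site matrace bmi_cat
                                 education income parity year);
    set stats;
    /* Hold the CLASS covariates at the levels used in the AT options */
    gender    = "F";
    site      = "DU";
    matrace   = "NH White";
    bmi_cat   = "Normal";
    education = "Bachelors or higher";
    income    = "100- <200 K";
    parity    = "1";
    year      = "2011";
    matage    = matage_mean;   /* assumption: continuous covariate held at its mean */
    /* 50 evenly spaced ph1 values across the observed range (the count is arbitrary) */
    do i = 0 to 49;
        ph1 = ph1_min + i * (ph1_max - ph1_min) / 49;
        output;
    end;
    drop i ph1_min ph1_max matage_mean;
run;

/* Score the stored model on the custom grid */
proc plm restore=ph1pred;
    score data=grid out=ph1score predicted=totscore;
run;
```

The output data set `ph1score` then has one row per grid point, so its N is whatever you put in the scoring data set rather than the default grid size.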

4 REPLIES 4

You may get a better answer if you can show the code for one of the regressions.

Thank you for the comment, ballardw. The code is below. This works, and I get the plot that I want. My confusion is about the difference between the N that the GLM model runs on (N=90) and the N of the data that PLM outputs (N=200).

    /* Fit the model for ph1 and save it in an item store */
    proc glm data=finaldata plots=(diagnostics residuals(smooth));
        class year (ref="2011") gender(ref="F") site(ref="DU") bmi_cat (ref="Normal") matrace (ref="NH White") education (ref="Bachelors or higher") income (ref="100- <200 K") parity(ref="1");
        model totscore= ph1 site bmi_cat matrace education income matage gender year parity/solution CLPARM;
        store ph1pred;
    run;
    quit;

    /* Plot the fit against ph1 with the CLASS covariates fixed, and save the plotted predictions */
    proc plm restore= ph1pred;
        effectplot fit(x= ph1) / at(gender="F") at(site="DU") at(matrace="NH White") at(bmi_cat="Normal") at(education="Bachelors or higher") at(income="100- <200 K") at(parity="1") at(year="2011");
        ods output FitPlot= ph1pred;
    run;

    /* Keep the x values and predictions, renamed for the later merge */
    data ph1pred;
        set ph1pred (keep= _XCONT1 _PREDICTED);
        rename _XCONT1= ph1;
        rename _PREDICTED= totscore;
    run;

After running the same model 14 times with different 'ph' variables and producing the output data, I merged them to get a data set with all the predicted data. Then I plot as below:

    proc sgplot data= finalpred;
        series x=ph1 y=totscore1 ;
        .
        .
        .
        series x=ph14 y=totscore14;
        xaxis label="ph";
        /* One YAXIS statement carries the grid, tick values, and label */
        yaxis grid values=(1.5 to 4.5 by .5) label="Predicted score";
        title "Predicted plot of ph1-ph14";
    run;
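As an alternative to merging the predictions side by side and writing 14 SERIES statements, the per-model data sets could be stacked and plotted with a single grouped SERIES statement. This is only a sketch. It assumes each model produced a data set like `ph1pred` with a renamed x variable and `totscore`. The data set `ph2pred` and the names `allpred`, `ph`, and `model` are hypothetical, and only two data sets are shown.

```sas
/* Stack the per-model predictions into one long data set */
data allpred;                          /* hypothetical name */
    set ph1pred(rename=(ph1=ph))
        ph2pred(rename=(ph2=ph))       /* assumed to exist; add the others the same way */
        indsname=source;               /* name of the data set each row came from */
    length model $32;
    model = scan(source, 2, ".");      /* e.g. PH1PRED from WORK.PH1PRED */
run;

/* One line per model, distinguished by the grouping variable */
proc sgplot data=allpred;
    series x=ph y=totscore / group=model;
    xaxis label="ph";
    yaxis grid label="Predicted score";
run;
```

With this layout the legend is built from the `model` variable, and adding another predictor means adding one data set to the SET statement rather than another SERIES statement.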

Thank you Rick, that makes sense to me.
