Hi there!
I'm fairly new to SAS and I'm trying to run some regressions using proc glm in Enterprise Guide.
I want to run a basic OLS linear regression. The reason I'm using proc glm instead of proc reg is so that I can use class variables. I read that proc reg does not support this.
Say I have a sample with 2000 observations, and I want to estimate a series of coefficients for all the independent variables. So far I'm all good with the following lines of code:
------------------
proc glm data=WORK.INPUT PLOTS=ALL;
where group=1 AND NB=0;
class C D;
model X= A B C D / solution;
output out=WORK.TEST p=yhat r=resid;
run;
-------------
Now I have another dataset with an additional 20 000 observations. They all include the independent variables A - D, but lack the dependent variable X.
How do I predict X in this dataset, using the coefficients from the regression above?
Accepted Solutions
PG's example is known as the "missing response trick". See "The missing value trick for scoring a regression model" on The DO Loop.
For other ways to score a data set, see "Techniques for scoring a regression model in SAS" on The DO Loop.
For your example, I'd use the STORE statement followed by the PLM procedure.
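A minimal sketch of the STORE-then-PLM approach, based on the model in the question. The item-store name WORK.GlmModel and the data set names WORK.ADDITIONAL and WORK.SCORED are assumptions made for illustration; they do not appear in the original post. The code has not been run against the poster's data.

```sas
/* Fit the model on the original sample and save it in an item store. */
/* WORK.GlmModel is a hypothetical item-store name.                   */
proc glm data=WORK.INPUT;
   where group=1 and NB=0;
   class C D;
   model X = A B C D / solution;
   store WORK.GlmModel;
run;
quit;

/* Score the new observations with the stored model.            */
/* WORK.ADDITIONAL (input) and WORK.SCORED (output) are assumed */
/* names; yhat holds the predicted value of X.                  */
proc plm restore=WORK.GlmModel;
   score data=WORK.ADDITIONAL out=WORK.SCORED predicted=yhat;
run;
```

With this approach the model is fit once, and the 20 000 new rows never have to be appended to the estimation sample. The data set being scored needs to contain A, B, C and D under the same names used in the MODEL statement.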
One way is to append your additional observations to your input dataset and give them a frequency of zero. That way, even if they included values for the dependent variable, the additional observations would be excluded from the regression:
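data FULL / view=FULL;
/* The WHERE statement applies to both data sets, so ADDITIONAL must also contain group and NB */
set INPUT (in=inInput) ADDITIONAL;
where group=1 AND NB=0;
/* freq = 1 for rows from INPUT, 0 for rows from ADDITIONAL */
freq = inInput;
run;
proc glm data=FULL PLOTS=ALL;
class C D;
freq freq;
model X= A B C D / solution;
/* Keep only the rows from ADDITIONAL (freq = 0) */
output out=TEST(where=(not freq)) p=yhat r=resid;
run;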
(Untested)
PG
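For comparison, here is a rough sketch of the missing-response variant that the linked blog post describes: the new rows are appended with X set to missing, so they should drop out of the fit while still receiving predicted values in the OUTPUT data set. The names WORK.ADDITIONAL, WORK.COMBINED, WORK.PRED and ScoreRow are assumptions for illustration, and the code is untested.

```sas
/* Stack the estimation sample and the rows to be scored.              */
/* The filter from the question is applied to the estimation rows only. */
data WORK.COMBINED;
   set WORK.INPUT(where=(group=1 and NB=0))
       WORK.ADDITIONAL(in=inNew);
   if inNew then X = .;   /* response is missing for rows to be scored */
   ScoreRow = inNew;      /* IN= variables are temporary, so keep a copy */
run;

/* Rows with a missing response are not used to estimate the model. */
proc glm data=WORK.COMBINED;
   class C D;
   model X = A B C D / solution;
   /* Keep only the appended rows, with their predicted values. */
   output out=WORK.PRED(where=(ScoreRow=1)) p=yhat;
run;
quit;
```

Here the filter on group and NB is applied as a data set option to WORK.INPUT alone, so the rows to be scored do not need those two variables.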
Thank you both!
The STORE statement and the PLM procedure are exactly what I was looking for. I found your blog post very useful, Rick. Thanks again!