Help using Base SAS procedures

Out of sample predictions with PROC GLM

New Contributor
Posts: 2

Hi there!

I'm fairly new to SAS and I'm trying to run some regressions using proc glm in Enterprise Guide.

I want to run a basic OLS linear regression. The reason I'm using proc glm instead of proc reg is so that I can use class variables. I read that proc reg does not support this.

Say I have a sample with 2000 observations, and I want to estimate a set of coefficients for all the independent variables. So far I'm all good with the following lines of code:

------------------

proc glm data=WORK.INPUT PLOTS=ALL;
    where group=1 AND NB=0;
    class C D;
    model X = A B C D / solution;
    output out=WORK.TEST p=yhat r=resid;
run;

------------------

Now I have another dataset with an additional 20,000 observations. They all include the independent variables A - D, but lack the dependent variable X.

How do I predict X in this dataset, using the coefficients from the above stated regression?


All Replies
Respected Advisor
Posts: 4,926

Re: Out of sample predictions with PROC GLM

One way is to append your additional observations to your input dataset and give them a frequency of zero. That way, even if they included dependent values, the additional observations would be excluded from the regression.

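data FULL / view=FULL;
    set INPUT (in=inInput) ADDITIONAL;
    where group=1 AND NB=0;
    freq = inInput;  /* 1 for rows from INPUT, 0 for rows from ADDITIONAL */
run;

proc glm data=FULL PLOTS=ALL;
    class C D;
    freq freq;
    model X = A B C D / solution;
    /* keep only the additional (freq = 0) rows in the output */
    output out=TEST(where=(not freq)) p=yhat r=resid;
run;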

(Untested)

PG
Solution
‎02-19-2014 10:08 AM
SAS Super FREQ
Posts: 3,753

Re: Out of sample predictions with PROC GLM

PG's example is known as the "missing response trick"; see "The missing value trick for scoring a regression model" on The DO Loop.
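A minimal sketch of the missing response idea, for illustration only: the new observations are stacked under the estimation sample with X set to missing, so they should not contribute to the fit but can still receive predicted values. WORK.ADDITIONAL (borrowed from PG's reply), WORK.COMBINED, WORK.PRED and the flag scoreOnly are assumed names, and applying the group/NB filter only to the estimation sample is also an assumption.

```sas
/* Stack the estimation sample and the rows to be scored.           */
/* All data set names other than WORK.INPUT are assumptions.        */
data WORK.COMBINED;
    set WORK.INPUT(where=(group=1 and NB=0))
        WORK.ADDITIONAL(in=isNew);
    scoreOnly = isNew;        /* 1 for rows to score, 0 for estimation rows */
    if isNew then X = .;      /* a missing response keeps the row out of the fit */
run;

proc glm data=WORK.COMBINED;
    class C D;
    model X = A B C D / solution;
    /* keep predictions for the new rows only */
    output out=WORK.PRED(where=(scoreOnly=1)) p=yhat;
run;
quit;
```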

For other ways to score a data set, see "Techniques for scoring a regression model in SAS" on The DO Loop.

For your example, I'd use the STORE statement followed by the PLM procedure.
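A rough sketch of that approach, under stated assumptions: the item store name WORK.GLMMODEL and the output data set WORK.SCORED are made up for illustration, and WORK.ADDITIONAL is assumed to be the data set holding the new observations with A, B, C and D.

```sas
/* Fit the model on the estimation sample and save it to an item store. */
/* WORK.GLMMODEL is a hypothetical item store name.                     */
proc glm data=WORK.INPUT;
    where group=1 and NB=0;
    class C D;
    model X = A B C D / solution;
    store WORK.GLMMODEL;
run;
quit;

/* Score the new observations with the stored model.                    */
/* WORK.ADDITIONAL and WORK.SCORED are assumed data set names.          */
proc plm restore=WORK.GLMMODEL;
    score data=WORK.ADDITIONAL out=WORK.SCORED predicted=yhat;
run;
```

With this layout the model is fitted once, and the same item store can be reused to score other data sets later without refitting.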

New Contributor
Posts: 2

Re: Out of sample predictions with PROC GLM

Thank you both!

The STORE statement and the PLM procedure are exactly what I was looking for. I found your blog post very useful, Rick. Thanks again!
