va003
Fluorite | Level 6

Hello,

 

For running several different regressions in SAS, I know I can do this:

 

proc reg data=have outest=want noprint edf tableout;		
	model a = x y z;	
	model b = x y z;	
run;

However, I have now run into a situation where my dependent variables (e.g. "a" and "b") contain many missing values. Whenever a is missing in an observation, SAS omits that observation from all models, even if b is not missing there. As a result, SAS gives different coefficients than it would if I ran each model separately, like this:

proc reg data=have outest=want noprint edf tableout;		
	model a = x y z;		
run;
proc reg data=have outest=want noprint edf tableout;		
	model b = x y z;	
run;

Is there a way to avoid that? That is, is there some syntax I can add to this PROC REG so that SAS treats each of my models separately?

 

Thank you very much.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

The doc says:

 

Missing Values

 

PROC REG constructs only one crossproducts matrix for the variables in all regressions. If any variable needed for any regression is missing, the observation is excluded from all estimates. If you include variables with missing values in the VAR statement, the corresponding observations are excluded from all analyses, even if you never include the variables in a model. PROC REG assumes that you might want to include these variables after the first RUN statement and deletes observations with missing values.
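
To see how much this rule costs on a given data set, one option is to count the rows each model could use on its own and compare that with the rows that survive when both models share one PROC REG. This is only a sketch and is not part of the documentation quoted above; it assumes the data set and variable names used in this thread (have, a, b, x, y, z) and relies on the CMISS function to count missing arguments in each row.

```sas
/* Sketch: rows usable by each model alone vs. by both models together. */
/* Assumes data set HAVE with variables a, b, x, y, z as in this thread. */
proc sql;
  select sum(cmiss(a, x, y, z) = 0)    as n_model_a,     /* complete for model a */
         sum(cmiss(b, x, y, z) = 0)    as n_model_b,     /* complete for model b */
         sum(cmiss(a, b, x, y, z) = 0) as n_both_models  /* complete for both    */
  from have;
quit;
```

If n_both_models is smaller than n_model_a or n_model_b, the single PROC REG with two MODEL statements is fitting each model on fewer rows than a separate run would.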

 

So...

 

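data have2;
set have;
/* write each row twice: once with a as the response, once with b */
v = a;
var = "a";
output;
v = b;
var = "b";
output;
run;

proc sort data=have2; by var; run;

/* one regression per BY group, so missing values are handled per response */
proc reg data=have2 outest=want noprint edf tableout; 
by var;
model v = x y z;
run;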

(untested)
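
One way to check the result is to print only the coefficient rows of the output data set, grouped by the BY variable. This is a sketch under assumptions: it relies on the usual OUTEST= layout, in which TABLEOUT adds several rows per model identified by _TYPE_ and the EDF option adds an _EDF_ column, and it assumes the data set is named want as above.

```sas
/* Sketch: show the parameter estimates for each response ("a" and "b"). */
/* _TYPE_ = 'PARMS' selects the coefficient row; other _TYPE_ values     */
/* (standard errors, t values, and so on) come from the TABLEOUT option. */
proc print data=want noobs;
  where _TYPE_ = 'PARMS';
  by var;
  var Intercept x y z _EDF_;
run;
```

If the missing values in a and b fall in different rows, the error degrees of freedom in _EDF_ will typically differ between the two BY groups, which confirms that each regression used its own set of observations.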

PG

va003
Fluorite | Level 6

Hello PG,

 

Thank you for your response. I don't quite understand what you are doing there. In my example there is only one a and one b, and I'm afraid I have confused you. Are you combining the a and b values into one variable, calling it v, and then running PROC REG against it? That's not quite what I'm trying to do. The variables a and b are different. In some observations a is missing, while in others b is missing (there are certainly overlaps but that should be a factor to consider).

PaigeMiller
Diamond | Level 26

The code by @PGStats is what you want. It produces a regression for A and a regression for B; if A is missing, the observation is still used for regression B, and vice versa. He is not combining A and B into a single variable mathematically. He is using a "trick" that gives you separate regressions with separate handling of missing values, which is exactly what you asked for when you said "Is there some syntax I can add to this proc reg so SAS would treat each of my models separately?"

 

But you have also written code, with the two separate PROC REG steps, that should do the same thing.

 

You state:

 

This causes SAS to provide different coefficients than it would if I were to run each model separately.

 

Different coefficients are what you should get. When missing values are present, the single PROC REG with two MODEL statements and the two separate PROC REG steps are fitted on different sets of observations, so they cannot be expected to give the same coefficients.

 

--
Paige Miller
va003
Fluorite | Level 6

Thank you very much for your explanation! It makes sense now! Sorry I'm still learning.

arthurcavila
Obsidian | Level 7

He assigns the values of both a and b to v and stores the label "a" or "b" in a variable var, then runs the regression on v separately for each value of var. This multiplies the number of rows in the data by the number of regressions you want.
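
With more than two dependent variables, the same stacking can be written with an array instead of repeating the assignments by hand. This is a sketch, not the code posted above: the names have_long and want_long are hypothetical, and the array list (a b here) would need to be extended with whatever other responses you have.

```sas
/* Sketch: stack every response in the array into one column v. */
/* HAVE_LONG and WANT_LONG are hypothetical names.              */
data have_long;
  set have;
  array deps{*} a b;            /* list all dependent variables here   */
  length var $32;               /* room for any SAS variable name      */
  do i = 1 to dim(deps);
    v   = deps{i};              /* value of the current response       */
    var = vname(deps{i});       /* its name, used as the BY value      */
    output;                     /* one output row per response         */
  end;
  drop i;
run;

proc sort data=have_long;
  by var;
run;

/* One regression per response; rows are dropped per BY group only. */
proc reg data=have_long outest=want_long noprint edf tableout;
  by var;
  model v = x y z;
run;
quit;
```

The trade-off is the one described above: the stacked data set has as many copies of each row as there are responses, which may matter for very large data.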

 

Is there a particular reason why you need everything in a single PROC REG? You can write a macro to avoid repeating the text.

 

%macro myreg(var);
	/* separate OUTEST= data set per call, so a later call does not overwrite an earlier one */
	proc reg data=have outest=want_&var noprint edf tableout;		
		model &var = x y z;	
	run;
%mend;

%myreg(a)
%myreg(b)
...

 

 

PaigeMiller
Diamond | Level 26

@arthurcavila wrote:

Is there a particular reason why you need everything in a single PROC REG?

A very good question.

--
Paige Miller
va003
Fluorite | Level 6

No, I was just trying to avoid repeating identical code for each regression, that's all. Thank you.

PGStats
Opal | Level 21

You will best understand my suggestion by trying out the code and checking the printed output and the output dataset. Of course, you can also call PROC REG for each variable, but you will get two output datasets that you will then have to combine.
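
If you do go the one-PROC-REG-per-variable route, combining the results can be as simple as concatenating the output data sets. A minimal sketch, assuming hypothetical data set names est_a and est_b and the same variables as in the rest of this thread:

```sas
/* Sketch: fit each model separately, then stack the OUTEST= data sets. */
/* EST_A and EST_B are hypothetical names.                              */
proc reg data=have outest=est_a noprint edf tableout;
  model a = x y z;
run;
quit;

proc reg data=have outest=est_b noprint edf tableout;
  model b = x y z;
run;
quit;

/* _DEPVAR_ in each OUTEST= data set identifies the response,   */
/* so the rows remain distinguishable after concatenation.      */
data want;
  set est_a est_b;
run;
```

The BY-group approach gives the same kind of stacked result in one step, which is the main reason to prefer it when there are many responses.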

PG
va003
Fluorite | Level 6

Thank you. This is great help. You're the hero I need not the hero I deserve.

