Posted 09-10-2020 05:40 PM

I saw the thread "Proc Reg several regressions with missing values", which discusses multiple `model`s with some missing `y`s. In short, `proc reg` with multiple `model`s excludes observations with either `y1` or `y2` missing from all of its `model`s. My case is the opposite: my `proc reg` with multiple `model`s has some `x` observations missing, as follows.

```
data have;
do i=1 to 5000;
x1=rannor(1);
x2=rannor(1);
x3=rannor(1);
y=x1+x2+x3+rannor(1);
if ranbin(1,1,0.01) then x1=.;
if ranbin(1,1,0.01) then x2=.;
if ranbin(1,1,0.01) then x3=.;
output;
end;
run;
```

And I want each `model` in the following `proc reg` to use all available observations.

```
proc reg noprint outest=want;
model y=x1;
model y=x2;
model y=x3;
model y=x1 x2;
model y=x1 x3;
model y=x2 x3;
model y=x1 x2 x3/edf;
quit;
```

I cannot apply the method from that thread here because my `model`s have different `x`s rather than different `y`s. Is separating the `proc reg`s the only solution?

If by "separating" you mean a separate PROC REG call for each model, each with a single MODEL statement, then pretty much yes.
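One way to avoid typing seven PROC REG steps by hand is to wrap the call in a macro loop. The following is only a sketch under stated assumptions: the macro name `%reg_each`, its parameters, and the `_est1`, `_est2`, ... work data sets are hypothetical names made up for illustration, and the regressor lists are passed as a `|`-delimited string.

```sas
/* Hypothetical helper: one PROC REG call per regressor list, so each   */
/* model only drops rows that are missing in its own variables.         */
%macro reg_each(data=, y=, xlists=, out=);
  %local i n xs;
  %let n = %sysfunc(countw(&xlists, %str(|)));
  %do i = 1 %to &n;
    %let xs = %scan(&xlists, &i, %str(|));
    /* EDF on the PROC statement adds _IN_, _P_, _EDF_, _RSQ_ to OUTEST= */
    proc reg data=&data noprint outest=_est&i edf;
      m&i: model &y = &xs;   /* label so _MODEL_ identifies each fit */
    run;
    quit;
  %end;

  /* Stack the per-model estimates; coefficients of regressors that a   */
  /* model did not use are simply left missing in that row.             */
  data &out;
    set _est1-_est&n;
  run;

  /* Clean up the intermediate (hypothetical) work data sets */
  proc datasets lib=work nolist;
    delete _est1-_est&n;
  quit;
%mend reg_each;

%reg_each(data=have, y=y,
          xlists=x1|x2|x3|x1 x2|x1 x3|x2 x3|x1 x2 x3,
          out=want);
```

The stacking uses a DATA step `set` rather than PROC APPEND because the OUTEST= data sets do not all contain the same regressor columns. With EDF requested for every model, `_EDF_` in `want` should differ across rows, which is a quick check that each fit used its own set of observations.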

From the documentation for Proc Reg:

PROC REG constructs only one crossproducts matrix for the variables in all regressions. If any variable needed for any regression is missing, the observation is excluded from all estimates. If you include variables with missing values in the VAR statement, the corresponding observations are excluded from all analyses, even if you never include the variables in a model. PROC REG assumes that you might want to include these variables after the first RUN statement and deletes observations with missing values.
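To see how much the shared crossproducts matrix costs for the `have` data set from the question, you can count the rows that are complete across every variable and compare that with the rows available to a single model. This is a minimal sketch; the counter names `n_complete` and `n_x1` are made up for illustration.

```sas
/* Compare listwise-complete rows with rows usable by one model alone */
data _null_;
  set have end=last;
  /* rows the combined PROC REG keeps: nothing missing in any variable */
  if cmiss(of y x1-x3) = 0 then n_complete + 1;
  /* rows available to "model y=x1" when it is run in its own PROC REG */
  if cmiss(y, x1) = 0 then n_x1 + 1;
  if last then put n_complete= n_x1=;
run;
```

The gap between the two counts written to the log is the number of observations that the `y=x1` model loses only because `x2` or `x3` is missing.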

