Proc Reg several regressions with missing values

🔒 This topic is **solved** and **locked**.
Posted 07-05-2019 07:40 PM
(4098 views)

Hello,

For running several different regressions in SAS, I know I can do this:

```
proc reg data=have outest=want noprint edf tableout;
model a = x y z;
model b = x y z;
run;
```

However, I have now run into a situation where my dependent variables (e.g. "a" and "b") contain many missing values. Whenever a is missing, SAS omits that observation from all models, even if b is not missing in that observation. As a result, SAS reports different coefficients than it would if I ran each model separately, like this:

```
proc reg data=have outest=want noprint edf tableout;
model a = x y z;
run;
proc reg data=have outest=want noprint edf tableout;
model b = x y z;
run;
```

Is there a way to avoid having to do that? That is, is there some syntax I can add to this PROC REG so that SAS treats each of my models separately?

Thank you very much.

1 ACCEPTED SOLUTION

The doc says:

**Missing Values**

PROC REG constructs only one crossproducts matrix for the variables in all regressions. If any variable needed for any regression is missing, the observation is excluded from all estimates. If you include variables with missing values in the VAR statement, the corresponding observations are excluded from all analyses, even if you never include the variables in a model. PROC REG assumes that you might want to include these variables after the first RUN statement and deletes observations with missing values.

So...

```
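/* Stack a and b into one response column v, tagged by var */
data have2;
  set have;
  v = a;
  var = "a";
  output;
  v = b;
  var = "b";
  output;
run;
proc sort data=have2; by var; run;
/* One regression per BY group, so missing values are handled per response */
proc reg data=have2 outest=want noprint edf tableout;
  by var;
  model v = x y z;
run;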
```

(untested)

PG

9 REPLIES 9


Hello PG,

Thank you for your response. I don't quite understand what you are doing there. In my example, there is only one a (the variable "a") and one b (the variable "b"), and I'm afraid I may have confused you. Are you combining the a and b values into one variable, calling it v, and then running PROC REG against it? That's not quite what I'm trying to do. The variables a and b are different. In some observations a is missing, while in other observations b is missing (there are certainly overlaps, but that should be a factor to consider).


The code by @PGStats is what you want, it produces a regression for A and a regression for B, and if A is missing, the observation is still used for regression B, and *vice versa*. He is not combining A and B into a single variable mathematically, he is performing a "trick" to allow you to achieve separate regressions with separate handling of missings, which is exactly what you asked for when you said "Is there some syntax I can add to this proc reg so SAS would treat each of my models separately?"

But, you have also created code, with the two different PROC REGs, which should do the same thing.

You state:

This causes SAS to provide different coefficients than it would if I were to run each model separately.

Different coefficients is what you get. The two versions of the code you posted use different sets of observations when missing values are present, so they cannot be expected to produce the same coefficients.
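To see how far apart the two approaches are on a given dataset, it can help to count the observations each one actually uses. This is only a rough sketch: it assumes a, b, x, y and z are all numeric, and the column aliases (n_model_a, n_model_b, n_joint) are made up for illustration.

```sas
/* Count complete cases for each model separately, and for both    */
/* models together (what a single PROC REG with two MODELs uses).  */
proc sql;
  select sum(nmiss(a, x, y, z) = 0)    as n_model_a,
         sum(nmiss(b, x, y, z) = 0)    as n_model_b,
         sum(nmiss(a, b, x, y, z) = 0) as n_joint
  from have;
quit;
```

If n_joint is smaller than n_model_a or n_model_b, the combined PROC REG is fitting on fewer rows than the separate runs would.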

--

Paige Miller



Thank you very much for your explanation! It makes sense now! Sorry I'm still learning.


He is assigning the values of both a and b to v and storing the label "a" or "b" in a variable var, then running the regression on v with the data subset by the value of var. This multiplies the number of rows in your data by the number of regressions you want.
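With more than two dependent variables, the same stacking can be written with an array instead of repeating the assignments by hand. The following is an untested sketch of that idea, not PG's code: the array name deps and the counter _i are hypothetical, and it assumes every listed response is numeric.

```sas
/* Stack each dependent variable into v, one output row per response */
data have2;
  set have;
  array deps {*} a b;          /* list further responses here */
  length var $32;
  do _i = 1 to dim(deps);
    v   = deps{_i};
    var = vname(deps{_i});     /* name of the current response */
    output;
  end;
  drop _i;
run;

proc sort data=have2;
  by var;
run;

/* One regression per response; rows are dropped only within a BY group */
proc reg data=have2 outest=want noprint edf tableout;
  by var;
  model v = x y z;
run;
quit;
```

The output dataset then carries var as the BY variable, so each set of estimates is labelled with the response it belongs to.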

Is there a particular reason why you need everything in a single PROC REG? You can write a macro to avoid repeating the text.

%macro myreg(var); proc reg data=have outest=want noprint edf tableout; model &var = x y z; run; %mend; %myreg(a) %myreg(b) ...
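Note that as written above, every call writes to the same OUTEST= dataset, so a later call would replace the results of an earlier one. A possible variation, laid out on separate lines, gives each call its own output dataset and then stacks them. The est_ prefix is an assumption made for this sketch.

```sas
%macro myreg(var);
  /* Separate estimates dataset per dependent variable */
  proc reg data=have outest=est_&var noprint edf tableout;
    model &var = x y z;
  run;
  quit;
%mend myreg;

%myreg(a)
%myreg(b)

/* Combine the per-variable results into one dataset */
data want;
  set est_a est_b;
run;
```

The _DEPVAR_ column in the OUTEST= output should identify which rows belong to which regression after stacking.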


@arthurcavila wrote:

Is there a particular reason why you need everything in a single PROC REG?

A very good question.

--

Paige Miller



No, I was just trying to avoid running identical code for each regression, that's all. Thank you.


You will best understand my suggestion by trying out the code and checking the printed output and the output dataset. Of course, you can also call PROC REG for each variable, but you will get two output datasets that you will then have to combine.

PG


Thank you. This is great help. You're the hero I need not the hero I deserve.
