How should I do the regression with partially missing data?

Posted 09-05-2013 05:02 PM
(1845 views)

Suppose I have three variables: Y, X1 and X2. Y and X1 each have 100 observations, but X2 has only, say, 30.

I want to estimate the equation Y=X1*b1+X2*b2 using all the information I have, i.e., without discarding the 70 observations where X2 is missing. How should I write the code?

Can I write it in this way:

proc model data=yx1x2;

parameters b1 b2;

if x2=. then

eq1=y-x1*b1;

else

eq1=y-x1*b1-x2*b2;

fit eq1;

run;

Behind the scenes, how does SAS process this equation? That is, what algorithm does it use?

Thank you very much.

1. Can you impute your missing data? SAS has procedures for that, such as PROC MI and PROC MIANALYZE.

2. Are your data missing at random or systematically, and is X2 continuous or categorical? If categorical, can you include "Missing" as a category?
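As a rough sketch of the imputation route in point 1, the steps below impute X2, fit the no-intercept regression from the question on each completed data set, and then pool the estimates. This assumes the X2 values are missing at random and that Y, X1 and X2 are all continuous. The data set names `mi_out` and `reg_est`, the seed, and the number of imputations are placeholders chosen for illustration, not values from this thread.

```sas
/* Step 1: create several completed copies of the data.            */
/* nimpute= and seed= are arbitrary illustrative choices.          */
proc mi data=yx1x2 nimpute=20 seed=12345 out=mi_out;
   var y x1 x2;
run;

/* Step 2: fit Y = X1*b1 + X2*b2 (no intercept) on each copy.      */
/* covout adds the covariance matrix that PROC MIANALYZE needs.    */
proc reg data=mi_out outest=reg_est covout noprint;
   model y = x1 x2 / noint;
   by _Imputation_;
run;

/* Step 3: combine the per-imputation estimates into one result.   */
proc mianalyze data=reg_est;
   modeleffects x1 x2;
run;
```

With 70 of 100 values of X2 missing, the pooled standard errors can be wide, so it is worth checking how much the results depend on the imputation model.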

Thanks, kessler. You are right. That is how SAS does it behind the scenes.
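For readers who want to see what the conditional equation in the question amounts to, here is a minimal sketch, assuming the model is fit by ordinary least squares. A row with missing X2 contributes the residual y - x1*b1, which is the same residual you get by setting X2 to 0 on that row. The data set name `yx1x2_zero` and the variable `x2z` are hypothetical names introduced for this example.

```sas
/* Replace missing x2 with 0 so that every row is kept. */
data yx1x2_zero;
   set yx1x2;
   x2z = coalesce(x2, 0);   /* first nonmissing value: x2 if present, else 0 */
run;

/* Least squares on the zero-filled data, no intercept,   */
/* minimizes the same sum of squared residuals as the     */
/* if/else equation in the question.                      */
proc reg data=yx1x2_zero;
   model y = x1 x2z / noint;
run;
quit;
```

Note that this treats every missing X2 as if it were exactly zero, which is a strong assumption unless X2 is centered near zero.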

If you want to capture the fact that the average of the missing x2 values might not be zero, you could try fitting your model this way:

**proc model data=yx1x2;**

**parameters b0 b1 b2 bz;**

**z = missing(x2);**

**if z then x2=0;**

**y = b0 + x1*b1 + x2*b2 + z*bz;**

**fit y;**

**run;**

Parameter **b0** will account for the overall intercept (you may remove it later if it is not significant) and **bz** will account for the average effect of missing x2 values.
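The same missing-indicator model can also be sketched with a DATA step and PROC REG, which may be convenient if you do not need PROC MODEL's other features. The data set name `yx1x2_flag` and the variable `x2f` are hypothetical names used only in this example; PROC REG's default intercept plays the role of b0.

```sas
/* Build the indicator and the zero-filled copy of x2. */
data yx1x2_flag;
   set yx1x2;
   z   = missing(x2);        /* 1 when x2 is missing, 0 otherwise */
   x2f = coalesce(x2, 0);    /* zero-fill so the row is not dropped */
run;

/* Intercept (b0), x1 (b1), x2f (b2) and z (bz). */
proc reg data=yx1x2_flag;
   model y = x1 x2f z;
run;
quit;
```

The coefficient on `z` then estimates the average shift in Y for rows where x2 is missing, relative to the intercept.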

PG

Thanks, PG. That is a good way to work around the issue.
