Hi guys,
I'm trying to impute outcome2 using the regression method for a single imputation. I know that I have to run a regression model, as shown in the code below, to get the coefficients to build my model. Do you know how to convert the current regression model to a loop, so that the regressed values are imputed into the missing slots?
Thanks a lot for your help in advance.
data support;
input id treatment gender age duration baseline outcome2 outcome4;
cards;
1 1 1 32.59 7.589041 4 -3 -3
2 1 2 37.51 13.50959 5 -1 -1
3 1 1 52.87 27.86849 6 . .
4 1 1 34.35 6.347945 3 -3 -3
5 1 2 30.13 5.131507 5 -4 -4
6 1 2 30.12 7.115068 5 . .
7 1 2 32.75 7.753425 9 -3 -3
8 1 1 30.9 7.89863 3 . .
9 1 1 31.09 6.087671 6 -6 -6
10 1 2 30.61 3.605479 4 -3.5 -3.5
11 1 1 28.93 3.926027 5 -3 -3
12 1 2 34.1 5.10137 3 -1 -1
13 0 1 30.33 2.334247 2 -1 -1
14 0 2 32.5 5.504109 5 -1 -1
15 0 2 32.27 8.273973 5 . .
16 0 2 36.73 11.73151 1 1 1
17 0 2 61.06 41.06301 8 . .
;
proc reg data=support;
model outcome2 outcome4=treatment age gender duration baseline;
run;
Accepted Solutions
As PaigeMiller pointed out, I think you are using the wrong terminology. Outcome2 is a response variable, so you do not "impute" the values, you "predict" them by scoring the model. For your example, the output data set contains predicted values for the response variables:
proc reg data=support plots=none;
model outcome2 outcome4=treatment age gender duration baseline;
output out=RegOut P=Pred2 Pred4;
quit;
proc print data=RegOut;
var ID OutCome2 Pred2 Outcome4 Pred4;
run;
If this is not what you want, please explain further.
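If the goal really is to fill the missing slots with the predicted values, no loop should be needed. A minimal sketch, assuming the RegOut data set created by the OUTPUT statement above; the data set name Filled and the variables ending in _filled and _flag are made up for illustration:

```sas
/* Assumes RegOut (with Pred2 and Pred4) already exists from PROC REG above. */
data Filled;
   set RegOut;
   /* Hypothetical flag: 1 if the original response was missing, else 0 */
   outcome2_flag = missing(outcome2);
   /* COALESCE returns the first nonmissing argument, so observed values
      are kept and only the missing slots take the predicted value */
   outcome2_filled = coalesce(outcome2, Pred2);
   outcome4_filled = coalesce(outcome4, Pred4);
run;

proc print data=Filled;
   var id outcome2 Pred2 outcome2_filled outcome2_flag;
run;
```

Keeping the original variables and a flag next to the filled-in copies makes it easy to tell later which values were observed and which came from the model.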
How to handle missing values depends largely on the subject matter and the goals of the analysis, neither of which we know, so the best way to handle them is usually up to you.
In your case, the missing values are in the Y variables of the regression. Those are generally not imputed (normally you would impute only missing x-variables), so these observations would not be used in the regression. Even so, if you want values for the Y variables, see the first paragraph.
Paige Miller
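For reference, if a regression-based single imputation of the response is still wanted, PROC MI can do it without hand-built coefficients. This is only a sketch, assuming the support data set from the question; the seed value and the output name MiOut are arbitrary choices. Note that the REG method draws imputed values from a predictive distribution, so they include random variation and will not equal the fitted values from PROC REG:

```sas
/* NIMPUTE=1 requests a single imputed data set.
   The VAR order puts the complete covariates first and outcome2 last,
   which gives the monotone missing pattern that MONOTONE requires. */
proc mi data=support nimpute=1 seed=12345 out=MiOut;
   var treatment gender age duration baseline outcome2;
   monotone reg(outcome2 = treatment gender age duration baseline);
run;

proc print data=MiOut;
   var id outcome2;
run;
```

Whether this is appropriate depends on the analysis; with a single imputation, standard errors from later models will not reflect the uncertainty in the imputed values.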
Can I use outcome2 to predict outcome 4?
No. Just look at the data: outcome4 is missing in exactly the same observations as outcome2, so there is no way that outcome2 can be used to impute values to replace the missing values of outcome4.
Paige Miller
Actually, it did not occur to me that perhaps the question is how to predict outcome2 and outcome4 in this situation.
@Cruise, is that what you want, predictions of outcome2 and outcome4 based upon the fitted model (which would only use observations with no missing values)?
Paige Miller