11-04-2014 12:20 AM
Wish you all well.
I want to find out whether the proportion answering yes to problem (problem=1) is the same at time 0 and time 1, among those who had used (used=1).
The data structure is as follows:
id time used problem
1 0 0 1
1 1 1 0
2 0 . 0
2 1 1 0
My initial thought was
proc genmod data=tmp descending;
  class id time used;
  model problem = time used time*used / dist=bin link=identity;
  repeated subject=id / type=ind;
  lsmeans time*used / ilink cl diff;
run;
Is this the right approach? Does the LSMEANS statement give me the right proportions and the difference between them?
I ask because I was reading SAS note 46997, "Estimating the risk (proportion) difference for matched pairs data with binary response,"
and it says "Unlike the default logit link function, the identity link does not ensure that the model produces valid probability estimates. Errors may result when fitting such models depending on the model and the data."
Thank you for your help
11-04-2014 08:45 AM
I would certainly change from the identity to the logit as the link, especially if the overall incidence is near the extremes (<0.2 or >0.8).
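For illustration, a minimal sketch of that change, assuming the same data set (tmp) and variables as in the question. Only the link is swapped; the ODDSRATIO option is my addition, not something from the original post, and is there so that the differences are also reported as odds ratios.

```sas
/* Sketch only: same model as in the question, but with the logit link. */
proc genmod data=tmp descending;
  class id time used;
  model problem = time used time*used / dist=bin link=logit;
  repeated subject=id / type=ind;
  /* ILINK back-transforms each LS-mean to the probability scale;    */
  /* DIFF gives differences on the logit scale, and ODDSRATIO        */
  /* (added here as an assumption) reports them as odds ratios.      */
  lsmeans time*used / ilink cl diff oddsratio;
run;
```

With this link the ILINK column holds the estimated proportions for each time*used cell, but the DIFF results are no longer differences in proportions.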
11-05-2014 01:55 PM
Assuming that you want an estimate (point and/or confidence interval) of the difference in those two probabilities, then yes, that code is reasonable (though you don't need the ILINK option when the identity link is used). If you use the default logit link then the differences from the LSMEANS statement will be differences in log odds (equivalently, log odds ratios). See the following note for more, in general, on estimating differences in probabilities:
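To make the contrast between the two links concrete, here is a short sketch in my own notation (not from the note): p_t is the probability that problem=1 at time t among those with used=1, and eta_t is the corresponding LS-mean on the link scale.

```latex
% p_t = P(problem = 1 | time = t, used = 1), for t = 0, 1
% \eta_t = LS-mean of the time = t, used = 1 cell on the link scale
\begin{align*}
\text{identity link:}\quad & \eta_t = p_t, & \eta_1 - \eta_0 &= p_1 - p_0 \\
\text{logit link:}\quad & \eta_t = \log\frac{p_t}{1 - p_t}, & \eta_1 - \eta_0 &= \log\frac{p_1/(1 - p_1)}{p_0/(1 - p_0)}
\end{align*}
```

So the DIFF results estimate the difference in proportions only under the identity link; under the logit link the same contrast is a log odds ratio.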