Is it the random residuals or random intercepts that are supposed to be used to calculate the predicted probabilities? Are you certain you have repeated measures?
- https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2018/2179-2018.pdf
/* "The following SAS DATA step inputs the respiratory data and creates an observation for each response. The baseline and follow-up responses are actually measured on a five-point scale, from terrible to excellent, and this ordinal response is analyzed later in the chapter. For this analysis, the dichotomous outcome of whether the patient experienced good or excellent response
is analyzed with a logistic regression. The second DATA step creates the SAS data set RESP2 and computes response variable DICHOT and dichotomous baseline variable DI_BASE. Note that the baseline variable, which was recorded on a five-point scale, could be managed as either ordinal or
dichotomous." - pg 515. Stokes, Davis, and Koch. Categorical Data Analysis Using SAS, Third Edition.
https://tinyurl.com/3rj67wad
data source: https://support.sas.com/rnd/app/stat/cat/edition3/samples/chapter15.html
*/
data resp;
input center id treatment $ sex $ age baseline
visit1-visit4 @@;
visit=1; outcome=visit1; output;
visit=2; outcome=visit2; output;
visit=3; outcome=visit3; output;
visit=4; outcome=visit4; output;
datalines;
1 53 A F 32 1 2 2 4 2 2 30 A F 37 1 3 4 4 4
1 18 A F 47 2 2 3 4 4 2 52 A F 39 2 3 4 4 4
1 54 A M 11 4 4 4 4 2 2 23 A F 60 4 4 3 3 4
1 12 A M 14 2 3 3 3 2 2 54 A F 63 4 4 4 4 4
1 51 A M 15 0 2 3 3 3 2 12 A M 13 4 4 4 4 4
1 20 A M 20 3 3 2 3 1 2 10 A M 14 1 4 4 4 4
1 16 A M 22 1 2 2 2 3 2 27 A M 19 3 3 2 3 3
1 50 A M 22 2 1 3 4 4 2 16 A M 20 2 4 4 4 3
1 3 A M 23 3 3 4 4 3 2 47 A M 20 2 1 1 0 0
1 32 A M 23 2 3 4 4 4 2 29 A M 21 3 3 4 4 4
1 56 A M 25 2 3 3 2 3 2 20 A M 24 4 4 4 4 4
1 35 A M 26 1 2 2 3 2 2 2 A M 25 3 4 3 3 1
1 26 A M 26 2 2 2 2 2 2 15 A M 25 3 4 4 3 3
1 21 A M 26 2 4 1 4 2 2 25 A M 25 2 2 4 4 4
1 8 A M 28 1 2 2 1 2 2 9 A M 26 2 3 4 4 4
1 30 A M 28 0 0 1 2 1 2 49 A M 28 2 3 2 2 1
1 33 A M 30 3 3 4 4 2 2 55 A M 31 4 4 4 4 4
1 11 A M 30 3 4 4 4 3 2 43 A M 34 2 4 4 2 4
1 42 A M 31 1 2 3 1 1 2 26 A M 35 4 4 4 4 4
1 9 A M 31 3 3 4 4 4 2 14 A M 37 4 3 2 2 4
1 37 A M 31 0 2 3 2 1 2 36 A M 41 3 4 4 3 4
1 23 A M 32 3 4 4 3 3 2 51 A M 43 3 3 4 4 2
1 6 A M 34 1 1 2 1 1 2 37 A M 52 1 2 1 2 2
1 22 A M 46 4 3 4 3 4 2 19 A M 55 4 4 4 4 4
1 24 A M 48 2 3 2 0 2 2 32 A M 55 2 2 3 3 1
1 38 A M 50 2 2 2 2 2 2 3 A M 58 4 4 4 4 4
1 48 A M 57 3 3 4 3 4 2 53 A M 68 2 3 3 3 4
1 5 P F 13 4 4 4 4 4 2 28 P F 31 3 4 4 4 4
1 19 P F 31 2 1 0 2 2 2 5 P F 32 3 2 2 3 4
1 25 P F 35 1 0 0 0 0 2 21 P F 36 3 3 2 1 3
1 28 P F 36 2 3 3 2 2 2 50 P F 38 1 2 0 0 0
1 36 P F 45 2 2 2 2 1 2 1 P F 39 1 2 1 1 2
1 43 P M 13 3 4 4 4 4 2 48 P F 39 3 2 3 0 0
1 41 P M 14 2 2 1 2 3 2 7 P F 44 3 4 4 4 4
1 34 P M 15 2 2 3 3 2 2 38 P F 47 2 3 3 2 3
1 29 P M 19 2 3 3 0 0 2 8 P F 48 2 2 1 0 0
1 15 P M 20 4 4 4 4 4 2 11 P F 48 2 2 2 2 2
1 13 P M 23 3 3 1 1 1 2 4 P F 51 3 4 2 4 4
1 27 P M 23 4 4 2 4 4 2 17 P F 58 1 4 2 2 0
1 55 P M 24 3 4 4 4 3 2 39 P M 11 3 4 4 4 4
1 17 P M 25 1 1 2 2 2 2 40 P M 14 2 1 2 3 2
1 45 P M 26 2 4 2 4 3 2 24 P M 15 3 2 2 3 3
1 40 P M 26 1 2 1 2 2 2 41 P M 15 4 3 3 3 4
1 44 P M 27 1 2 2 1 2 2 33 P M 19 4 2 2 3 3
1 49 P M 27 3 3 4 3 3 2 13 P M 20 1 4 4 4 4
1 39 P M 28 2 1 1 1 1 2 34 P M 20 3 2 4 4 4
1 2 P M 28 2 0 0 0 0 2 45 P M 33 3 3 3 2 3
1 14 P M 30 1 0 0 0 0 2 22 P M 36 2 4 3 3 4
1 10 P M 37 3 2 3 3 2 2 18 P M 38 4 3 0 0 0
1 31 P M 37 1 0 0 0 0 2 35 P M 42 3 2 2 2 2
1 7 P M 43 2 3 2 4 4 2 44 P M 43 2 1 0 0 0
1 52 P M 43 1 1 1 3 2 2 6 P M 45 3 4 2 1 2
1 4 P M 44 3 4 3 4 2 2 46 P M 48 4 4 0 0 0
1 1 P M 46 2 2 2 2 2 2 31 P M 52 2 3 4 3 4
1 46 P M 49 2 2 2 2 2 2 42 P M 66 3 3 3 4 4
1 47 P M 63 2 2 2 2 2
;
data resp2; set resp;
dichot=(outcome=3 or outcome=4);
di_base = (baseline=3 or baseline=4);
run;
/* Kathleen Kiernan. https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2018/2179-2018.pdf
"The data for this example is from a clinical trial (Stokes, Davis, and Koch 2012) that was conducted to compare two treatments for a respiratory illness. Patients in each of two centers were randomly assigned to two groups: one group received the active treatment and one group received a placebo.
During treatment, respiratory status was determined for each of four visits and is represented by the variable OUTCOME (coded here as 0=poor, 1=good). The variables CENTER, TREATMENT, SEX, and BASELINE (baseline respiratory status) are classification variables that have two levels. The variable AGE (age at time of entry into the study) is a continuous variable. The variable ID is the patient
identification number.
The following statements fit the model:"
*/
/* Marginal GEE type of model */
proc glimmix data=resp2 empirical method=rspl;
class id sex treatment visit;
model dichot (event='1')=sex treatment visit age baseline / dist=binary link=logit;
random _residual_ / subject=id type=cs;
output out=gmxout_id pred=xbeta_id pred(ilink)=predprob_id pred(ilink
noblup)=fix_predprob_id;
run;
/* with random residual, clustering by ID */
proc logistic data=gmxout_id plots(only)=roc;
model dichot = predprob_id;
ods select roccurve;
run;
/* C-statistic = 0.79 */
proc glimmix data=resp2 empirical method=rspl;
class center sex treatment visit;
model dichot (event='1')=sex treatment visit age baseline / dist=binary link=logit;
random _residual_ / subject=center type=cs;
output out=gmxout_center pred=xbeta_center pred(ilink)=predprob_center pred(ilink
noblup)=fix_predprob_center;
run;
/* with random residual, clustering by center */
proc logistic data=gmxout_center plots(only)=roc;
model dichot = predprob_center;
ods select roccurve;
run;
/* C-statistic = 0.79 */
/* nofit option still produces the same output as above */
proc logistic data=gmxout_center plots(only)=roc;
model dichot(event='1') = / nofit;
roc pred=predprob;
ods select roccurve;
run;
/*
Kathleen Kiernan. https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2018/2179-2018.pdf
"The choice of the marginal (population-averaged) model or conditional (subject-specific) model often depends on the goal of your analysis: whether you are interested in population-averaged effects or subject-specific effects.
The GEE model is a marginal, or population-averaged model. If you are interested in making predictions about individuals, then you would use GLIMMIX to the fit the conditional model using G-side random effects and obtain the subject specific estimates. For example:"
*/
/* conditional model subject specific estimates */
proc glimmix data=resp2 ;
class id sex treatment visit;
model dichot (event='1')=sex treatment visit age baseline/s dist=binary link=logit;
random intercept /s subject=id;
output out=gmxout_subj_id pred=xbeta_subj_id pred(ilink)=predprob_subj_id pred(ilink
noblup)=fix_predprob_subj_id;
run;
/* with random intercept, clustering by ID */
proc logistic data=gmxout_subj_id plots(only)=roc;
model dichot = predprob_subj_id;
ods select roccurve;
run;
/* C-statistic = 0.88 */
proc glimmix data=resp2 ;
class center sex treatment visit;
model dichot (event='1')=sex treatment visit age baseline/s dist=binary link=logit;
random intercept /s subject=center;
output out=gmxout_subj_cntr pred=xbeta_subj_cntr pred(ilink)=predprob_subj_cntr pred(ilink
noblup)=fix_predprob_subj_cntr;
run;
/* with random intercept, clustering by center */
proc logistic data=gmxout_subj_cntr plots(only)=roc;
model dichot = predprob_subj_cntr;
ods select roccurve;
run;
/* C-statistic = 0.79 */
... View more