Xueping
Calcite | Level 5

I am doing a logistic regression: Y = Treatment group + Covariate1 + Covariate2 + ... + Covariate4.

      Y is a binary outcome variable: patient response vs. patient non-response.

      Treatment group is the factor I am interested in. It has two levels: Group A and control.

      The other 4 covariates are all categorical; some have two levels, some have more. For covariates with more than 2 levels, dummy covariates are used.

       Each patient has only one observation, that is, no repeated measures.

 

I want to get the following results from the model: the odds ratio for treatment group with its 95% CI and p-value, and the response rate in each treatment group adjusted for the covariates.

 

Regarding OR, CI and p-value, I get the same results from PROC LOGISTIC and PROC GENMOD. 

 

The problem is the response rate in each treatment group adjusted for the covariates.

 

My questions are:

1. I can't find an option in either PROC that provides such an adjusted response rate for the two treatment groups. Does anybody know of such an option?

2. I calculated the adjusted response rate in this way:

   first, I get the predicted probability (p) from the model for each patient;

   second, I calculate the logit for each patient, which is log(p/(1-p));

   third, I calculate the mean logit in each treatment group;

   fourth, I derive the adjusted response rate of each treatment group = exp(MeanLogit) / (1 + exp(MeanLogit)).

 

I am not sure if my calculation steps are correct. Can anyone give me comments?
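
For reference, the four steps above can be written out as formulas. The symbols are introduced here for illustration only and do not appear in the original post: \hat p_i is the model's predicted probability for patient i, g is a treatment group, and n_g is the number of patients in that group.

```latex
\begin{align*}
\eta_i     &= \log\frac{\hat p_i}{1-\hat p_i}
             && \text{(logit for patient } i\text{)} \\
\bar\eta_g &= \frac{1}{n_g}\sum_{i \in g}\eta_i
             && \text{(mean logit in group } g\text{)} \\
\hat r_g   &= \frac{\exp(\bar\eta_g)}{1+\exp(\bar\eta_g)}
             && \text{(back-transformed rate for group } g\text{)}
\end{align*}
```

Because the inverse logit is nonlinear, \hat r_g is in general not equal to the plain average of the \hat p_i in group g, so the two summaries should not be expected to agree.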

 

3. The predicted probabilities from PROC LOGISTIC and PROC GENMOD are totally different, not even of the same order of magnitude. I don't understand why. Can anyone explain?

As a result, the adjusted response rates I calculated from the two models are very different, and I don't know which one to trust.

 

Many thanks in advance for any help!!!

 

6 REPLIES
Reeza
Super User

1. Why are you calculating the predicted probability by hand? Are the predicted values output by the procs different?

2. Are you sure that's the correct method to calculate the response rate? Have you looked at the EFFECTPLOT and/or ESTIMATE statements?
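
As a rough illustration of the EFFECTPLOT suggestion, here is a minimal sketch. It assumes the dataset and variable names that appear in the code posted later in this thread (DATAIN, AVAL, ARM, COV1 to COV3), and it is not a confirmed solution to the question.

```sas
ods graphics on;

/* Sketch only: plot the predicted probability of response for each ARM level */
proc logistic data = DATAIN descending;
    class ARM COV1 COV2 COV3;
    model AVAL = ARM COV1 COV2 COV3;
    /* INTERACTION puts the CLASS effect ARM on the X axis;      */
    /* CLM requests confidence limits for the predicted values.  */
    effectplot interaction(x = ARM) / clm;
run;
```

As far as I recall, covariates that are not plotted are held at fixed values (reference levels for CLASS variables) unless an AT option is given, so the plot shows predictions for one covariate profile rather than a rate averaged over the covariates.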

Xueping
Calcite | Level 5

Hi Reeza, thanks for your reply!

 

1. I don't calculate the predicted probability by hand; I get it from the output. The values from PROC LOGISTIC and PROC GENMOD are different, although the estimated odds ratios, CIs, and p-values from both procs are the same.

 

2. I am not sure if my method is correct, which is why I am asking here. What do you mean by effectplot? Could you explain in more detail?

 

 

PGStats
Opal | Level 21

It would be helpful to show some code. 

 

I suspect that the problem stems from your coding of categorical covariates as dummy variables. You should use CLASS variables instead. CLASS effects and dummy (continuous) variables are not treated the same way in LSMEANS calculations.

PG
Xueping
Calcite | Level 5

Hi PG, thank you very much for your reply. I am not coding the covariates as dummy variables myself; I think the PROCs treat the covariates as dummy variables, since I put them in the CLASS statement.

 

Here is my code:

proc logistic data = DATAIN descending ;
       class ARM COV1 COV2 COV3 ;
       model AVAL = ARM COV1 COV2 COV3 ;
       oddsratio ARM ;
       lsmeans ARM / e diff oddsratio cl ;
       ods output ParameterEstimates = ESTIMSTE_
                  Type3 = TYPE3_
                  OddsRatiosWald = OR_
                  ;
       output out = PRED predicted = phat ;
run ;

 

proc genmod data = DATAIN descending ;
       class ARM COV1 COV2 COV3 ;
       model AVAL = ARM COV1 COV2 COV3 / dist = bin link = logit type3 lrci ;
       lsmeans ARM / cl oddsratio diff ;
       ods output
                  Type3 = TYPE3_
                  DIFFS = DIFF_
                  ;
       output out = PRED1 predicted = phat1 ;
run ;

 

 

The estimates (OR, p-value, confidence interval of the OR) are the same from the two PROCs. But the predicted probabilities, phat and phat1 in the output datasets, are very different.

 

Do you know how I can get the adjusted response rate for both treatment groups (ARM)?
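
One commonly used route to this kind of adjusted rate is the LSMEANS statement with the ILINK option. The sketch below reuses the model from the code above. PARAM=GLM is added because PROC LOGISTIC needs GLM parameterization for LSMEANS, and LSM_ is a hypothetical output dataset name. It is not a verified answer for this dataset.

```sas
/* Sketch only: same model, with GLM coding so that LSMEANS is honored */
proc logistic data = DATAIN descending;
    class ARM COV1 COV2 COV3 / param = glm;
    model AVAL = ARM COV1 COV2 COV3;
    /* ILINK back-transforms each ARM LS-mean from the logit scale */
    /* to the probability scale; CL adds confidence limits.        */
    lsmeans ARM / ilink cl diff oddsratio;
    ods output LSMeans = LSM_;   /* LSM_ is a hypothetical name */
run;
```

By default the LS-means weight the levels of the other CLASS covariates equally; the OM (observed margins) option weights them by their observed frequencies instead. Whichever weighting is chosen, the result is only as reliable as the underlying model fit.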

 

Xueping

Reeza
Super User

You've verified that the design matrix is the same for both procedures?

And the log is clean for both procs?

Xueping
Calcite | Level 5

Hi Reeza, many thanks!

 

The design matrix is the same in both PROCs. But the log is not clean, since my dataset is not large enough. I think that with more data the convergence would be OK. Do you think that is the reason? Does it mean that with a larger dataset the predicted probabilities from both PROCs will be the same? It is strange, though, that the OR estimates are the same in both PROCs even though both give warnings.

 

log from PROC LOGISTIC:

NOTE: PROC LOGISTIC is modeling the probability that AVAL='Response'.
WARNING: There is possibly a quasi-complete separation of data points. The maximum likelihood estimate may not exist.
WARNING: The LOGISTIC procedure continues in spite of the above warning. Results shown are based on the last maximum likelihood iteration. Validity of the model fit is questionable.
WARNING: The model does not have a GLM parameterization. This parameterization is required for the LSMEANS, LSMESTIMATE, and SLICE statement. These statements are ignored.

 

 

log from PROC GENMOD: 

NOTE: PROC GENMOD is modeling the probability that AVAL='Response'.
WARNING: The negative of the Hessian is not positive definite. The convergence is questionable.
WARNING: The procedure is continuing but the validity of the model fit is questionable.
WARNING: The specified model did not converge.
NOTE: The Pearson chi-square and deviance are not computed since the AGGREGATE option is not specified.
WARNING: Negative of Hessian not positive definite.
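
A quasi-complete separation warning usually points to a covariate level in which all (or nearly all) patients have the same outcome. A simple way to look for such a level, sketched here with the variable names from the code above, is to cross-tabulate each CLASS variable against the response:

```sas
/* Sketch only: look for covariate levels with zero counts in one response category */
proc freq data = DATAIN;
    tables (ARM COV1 COV2 COV3) * AVAL / norow nocol nopercent;
run;
```

A cell count of zero in any of these tables marks a candidate for the source of the separation. While the warnings persist, neither procedure has reached a valid maximum likelihood fit, so predicted probabilities taken from the last iteration of each one need not agree.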
