avak
Calcite | Level 5

Hi all. I'm running a logistic regression for odds of receiving a skeletal survey in children less than 1 year of age admitted to the hospital for an accidental fall. My database is the National Trauma Data Bank, and there are over 5000 facilities included in my cohort. My model is the following:

 

proc logistic data = logreg.alldata;
model sksurvey (event = '1')= mnatam_pi masian mblack moth_mix 
mhisp maid mother mloc mISS/rsquare;
run;

The first 5 variables are for race; maid = primary payment by Medicaid, mother = other primary payment method, mloc = injury occurred outside the home, mISS = injury severity greater than 10.

 

My Rsquare is very low (0.041), and I think this might be due to variation of the outcome (skeletal survey) by facility, for which I have a variable. I've never created a fixed effects model; would someone who is more familiar with this coding help point me in the right direction? 

4 REPLIES 4
PaigeMiller
Diamond | Level 26

It's not clear why you say R-squared is low, as PROC LOGISTIC doesn't produce an R-squared statistic. So your reasoning for adding a term into the model seems suspect.

 

Nevertheless, you can add facility into the model by putting it in a CLASS statement and then adding facility to the model.
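
A minimal sketch of that suggestion, assuming the facility identifier is a variable named facility (the original post does not give its name). The second step is an alternative rather than part of the suggestion itself: with thousands of facilities, a CLASS variable produces one parameter per facility, so conditioning on facility with a STRATA statement may be more practical.

```sas
/* Fixed effect for facility via the CLASS statement.          */
/* "facility" is an assumed variable name, not from the post.  */
proc logistic data=logreg.alldata;
  class facility / param=ref;
  model sksurvey(event='1') = mnatam_pi masian mblack moth_mix
                              mhisp maid mother mloc mISS
                              facility / rsquare;
run;

/* Alternative sketch: conditional logistic regression that    */
/* conditions on facility instead of estimating an effect for  */
/* each one.                                                   */
proc logistic data=logreg.alldata;
  strata facility;
  model sksurvey(event='1') = mnatam_pi masian mblack moth_mix
                              mhisp maid mother mloc mISS;
run;
```

Facilities where every child has the same outcome contribute nothing to the conditional likelihood in the second version.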

--
Paige Miller
Ksharp
Super User

Can you post the output of the model?

Use SELECTION= to shrink your model, use CORRB to check for multicollinearity among the variables, and drop the outlying observations.

 

proc logistic data = logreg.alldata;
model sksurvey (event = '1')= mnatam_pi masian mblack moth_mix 
mhisp maid mother mloc mISS/rsquare selection=stepwise corrb;
run;

 

avak
Calcite | Level 5

Here's the output of the original model

Ksharp
Super User

You only have 9 variables in the model.

And you didn't post the parameter estimates table:

      Analysis of Maximum Likelihood Estimates

                                                Standard          Wald
                 Parameter    DF    Estimate       Error    Chi-Square    Pr > ChiSq

                 Intercept     1     -0.6941     10.1967        0.0046        0.9457
                 Age           1      1.1785      0.7807        2.2785        0.1312
                 Weight        1     -0.0829      0.0637        1.6907        0.1935
                 Height        1     -0.1111      0.2534        0.1921        0.6611

And the correlation coefficient table:

    
                 Estimated Correlation Matrix

Parameter    Intercept        Age     Weight     Height

Intercept       1.0000    -0.1298     0.6428    -0.8131
Age            -0.1298     1.0000    -0.4392    -0.4047
Weight          0.6428    -0.4392     1.0000    -0.5208
Height         -0.8131    -0.4047    -0.5208     1.0000


Try the FIRTH option.

proc logistic data=sashelp.class;
model sex=age weight height/firth corrb;
run;

 

And if you want to improve the AUC (a.k.a. the C statistic), drop some outlying observations with this code:


proc logistic data=want outest=est(keep=intercept &varlist);
model good_bad(event='good')= &varlist 
/outroc=x.roc lackfit scale=none aggregate rsquare firth;
output out=output h=h c=c cbar=cbar;
run;

proc sort data=output out=check_c ;
 by descending c;
run;
proc sort data=output out=check_h ;
 by descending h;
run;

In the tables CHECK_C and CHECK_H, you will find the outliers (the top n observations).

Make an ID variable to drop these observations, then fit PROC LOGISTIC again on the new data, and you will get a better AUC.

 

data want;
 set have;
 id+1;
 if id in (237 764 334 93 305 178 918) then delete;
run;

But I prefer goodness-of-fit statistics, like:

model good_bad(event='good')= &varlist / lackfit scale=none aggregate rsquare firth;   /*  GOF   - if you have SAS9.4m6*/
