Solved
Contributor
Posts: 65

How to choose strata variable(s) in PROC PHREG?

Hi everyone;

I am working with a data set of patients containing many demographic covariates and some health-related explanatory variables. I want to include some of them (those of direct interest) in the MODEL statement and some in STRATA, but I am unsure how to choose among them. Any helpful comments would be appreciated.

Thanks!

Accepted Solutions
Solution
‎08-01-2012 03:41 PM
Posts: 2,116

Re: How to choose strata variable(s) in PROC PHREG?

Isaac,

Harrell's book "Regression Modeling Strategies" has good advice on model building.  It needs a new edition for examples (both the S+ and SAS code are old), but the thought processes are good.

I generally only put a variable in as a STRATA variable if the proportional hazards assumption is not met; otherwise I use a CLASS statement. One way to check proportionality is to plot the unadjusted survival curves by the class variable, ideally on the log(-log) scale.  If they are approximately parallel, then the assumption holds reasonably well.  A more formal test is to use the ASSESS statement in PHREG.
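
A minimal sketch of both checks, under stated assumptions: the data set name `patients`, the follow-up time `futime`, the event indicator `event` (0 = censored), and the numeric covariates `age` and `female` are all hypothetical placeholders, and `admissionsource` stands in for whichever class variable is being examined.

```sas
/* Hypothetical names: patients, futime, event (0 = censored), age, female */
ods graphics on;

/* Graphical check: survival and log(-log(survival)) curves by group.   */
/* Roughly parallel log(-log) curves are consistent with proportional   */
/* hazards; curves that cross or diverge suggest a violation.           */
proc lifetest data=patients plots=(survival loglogs);
   time futime*event(0);
   strata admissionsource;
run;

/* More formal check: ASSESS with the PH option compares the observed   */
/* score process with paths simulated under the PH model; RESAMPLE      */
/* adds a supremum-type test p-value for each covariate.                */
proc phreg data=patients;
   model futime*event(0) = age female;
   assess ph / resample seed=12345;
run;
```

The seed value is arbitrary and only makes the simulated paths reproducible.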

Doc Muhlbaier

Duke

All Replies
Contributor
Posts: 65

Re: How to choose strata variable(s) in PROC PHREG?

Doc Muhlbaier;

I have tested the PH assumption on some covariates and seen violations. When I put them in the CLASS statement, I ended up with another problem. Most of them have a large number of levels, e.g., Primary Care Team with 54 levels, Diagnosis Related Group (DRG) with more than 85 levels, and Principal Diagnosis with nearly 60 levels. This leads to a huge design matrix and ambiguous Global Test results (P-Value {LR} = 0.67; P-Value {Score} < 0.001; P-Value {Wald} = 1). I also tried different effect selection methods, but saw no improvement. What would you recommend to deal with this? Thanks so much!

Posts: 2,116

Re: How to choose strata variable(s) in PROC PHREG?

Your model is probably overspecified.  Another reason to look at Harrell's book is for his sage advice on the sample size needed relative to the number of outcomes (not total sample size, but the number of failures).  I don't have the book at home, but I think that it is 10-15 outcomes per degree of freedom in the fully specified model.  You have about 200 d.f., which would require 2000-3000 failures.  Likely you don't have that.  This requires some hard choices and likely needs some clinical input to collapse the categories in a meaningful way.
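
To make the rule of thumb concrete, here is a small sketch that counts the failures and turns them into a rough d.f. budget. The data set name `patients` and the indicator `event` (1 = failure, 0 = censored) are hypothetical, and the divisors 10 and 15 are simply the guideline quoted above, not a hard limit.

```sas
/* Hypothetical names: patients, event (1 = failure, 0 = censored) */
proc sql;
   select sum(event)               as n_events,
          calculated n_events / 15 as df_conservative,  /* 15 events per d.f. */
          calculated n_events / 10 as df_liberal        /* 10 events per d.f. */
   from patients;
quit;
```

For example, 372 events would give a budget of roughly 25 to 37 d.f., far below the roughly 200 d.f. implied by the three large class variables.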

Contributor
Posts: 65

Re: How to choose strata variable(s) in PROC PHREG?

The data set has 3108 records, with 372 event times (nearly 88% of the records are right-censored). So, short of performing clinical trials, is there any way to overcome this problem? Perhaps by grouping levels with some technique? Thanks!

Posts: 2,116

Re: How to choose strata variable(s) in PROC PHREG?

88% right censoring is not unusual; after all, most patients survive (we certainly hope so!).  Your target is about 35 d.f.  I would start by dropping either DRG or Principal Dx.  They are highly correlated (if you do a PROC FREQ on DX*DRG you will see a sparse matrix).  Then you are going to need to combine the levels of the remaining ones; the literature already describes ways to combine either Dx or DRG into groups with some cohesiveness.  Lastly, you've got to get the Physician Care Teams pared down, maybe by combining them by specialty or location.

I say this, fully expecting that your management would like to compare the Physician Care teams.  You just don't have enough data to do that.

One possibility to explore is to totally shift gears out of survival analysis.  Maybe some sort of cost measure would be appropriate.  Then you have a continuous outcome and can reasonably have more d.f.  (Check Tsiatis and Angstrom for some papers on analyzing cost data; there are some important nuances to be aware of.  They will have some references or be referred to by others in the field.)
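
The PROC FREQ check on DX*DRG mentioned above might look like the sketch below. The data set name `patients` and the variable names `dx` (principal diagnosis) and `drg` are hypothetical. The LIST option prints only the combinations that actually occur, which makes the sparsity easy to see when both variables have dozens of levels.

```sas
/* Hypothetical names: patients, dx (principal diagnosis), drg */
proc freq data=patients;
   /* One row per observed dx/drg combination, with its count */
   tables dx*drg / list nocum nopercent;
run;
```

If the number of listed combinations is small relative to the product of the two level counts, the two variables carry largely the same information and one of them can be dropped.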

Contributor
Posts: 65

Re: How to choose strata variable(s) in PROC PHREG?

Dr Muhlbaier

Thanks so much for your helpful comments. I have found Tsiatis's papers on the topic but couldn't find anything from Angstrom. Is there a specific keyword I should search for?

Contributor
Posts: 65

Re: How to choose strata variable(s) in PROC PHREG?

Dr. Muhlbaier

Should the 35 d.f. be reserved for the MODEL variables only, or for the MODEL and STRATA variables together?

Posts: 2,116

Re: How to choose strata variable(s) in PROC PHREG?

Isaac,

I don't really know here.  Remember these are guidelines, not mathematical proofs, so there is some wiggle room.  You might be able to not count the strata in the total d.f., but what you risk are some false positives.  If you include the strata in an interaction term, then the d.f. there definitely count.

You might want to do some bootstrap resampling to get some handle on the variability of the estimates.
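
One way the bootstrap idea could be set up is sketched below. All names are hypothetical placeholders (`patients`, `futime`, `event` with 0 = censored, and the covariates `age` and `female`), and the 500 replicates and the seed are arbitrary choices. If the real model has a STRATA statement, it would be added to the PHREG step.

```sas
/* Hypothetical names: patients, futime, event (0 = censored), age, female */

/* 1. Draw 500 bootstrap samples with replacement, each the size of the data */
proc surveyselect data=patients out=boot seed=12345
                  method=urs samprate=1 outhits reps=500;
run;

/* 2. Refit the same Cox model in every replicate and keep the estimates */
ods select none;                 /* suppress printed output for 500 fits */
proc phreg data=boot;
   by replicate;
   model futime*event(0) = age female;
   ods output ParameterEstimates=boot_est;
run;
ods select all;

/* 3. Summarize the spread of each coefficient across the replicates */
proc means data=boot_est n mean std p5 p95;
   class parameter;
   var estimate;
run;
```

A wide spread or unstable sign for a coefficient across replicates is a hint that the model is asking too much of the available events.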

Doc

Contributor
Posts: 65

Re: How to choose strata variable(s) in PROC PHREG?

Dr Muhlbaier;

I have got some answers for DRG and Principal Diagnosis. The VA systems assign the Principal Diagnosis based on the International Classification of Diseases (ICD-9), and by looking at the codes at this link, http://icd9cm.chrisendres.com/index.php?action=contents , I can group them into broader categories and reduce the number of levels. The same applies to DRG, since I found that "DRGs may be further grouped into Major Diagnostic Categories (MDCs)", so I plan to do the same thing for DRG using this data set: http://www.cms.hhs.gov/AcuteInpatientPPS/downloads/FY_2010_FR_Table_5.zip
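
A rough sketch of collapsing DRG into MDC with a lookup table follows. It assumes the CMS table has already been imported into a data set with one row per DRG; the names `patients`, `drg_to_mdc`, `drg`, `mdc`, and `event` are hypothetical, and the mapping itself is not reproduced here.

```sas
/* Hypothetical names: patients (with drg, event), drg_to_mdc (drg, mdc) */
proc sql;
   create table patients_mdc as
   select p.*, m.mdc
   from patients as p
        left join drg_to_mdc as m
        on p.drg = m.drg;        /* unmatched DRGs get a missing mdc */
quit;

/* Check how many levels remain and how many events fall in each one */
proc freq data=patients_mdc;
   tables mdc*event / nocol nopercent;
run;
```

Levels with very few events after grouping may still need to be merged with clinical input, as suggested earlier in the thread.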

Contributor
Posts: 65

Re: How to choose strata variable(s) in PROC PHREG?

Dr Muhlbaier;

For a CLASS variable, I find that the PH assumption is satisfied for one level but not for another. In other words, PH holds for (admissionsource = NHCU) but not for (admissionsource = domiciliary). So what should I do? Should I put "admissionsource" in STRATA or not?

Thanks!
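
For reference, the rule of thumb from the accepted solution (stratify when PH is not met) would translate into something like the sketch below. It is only an illustration: the names `patients`, `futime`, `event` (0 = censored), `age`, and `female` are hypothetical, and the thread does not say which choice was finally made.

```sas
/* Hypothetical names: patients, futime, event (0 = censored), age, female */
proc phreg data=patients;
   model futime*event(0) = age female;
   strata admissionsource;   /* separate baseline hazard for each level */
run;
```

With this specification no hazard ratio is estimated for admissionsource itself; its effect is absorbed into the stratum-specific baseline hazards, so it suits a variable that needs adjusting for but is not of direct interest.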

🔒 This topic is solved and locked.