Chung-Li
Quartz | Level 8

Hi all,

 

I know how to estimate a hazard ratio in the ordinary case, but one question has been stuck in my mind:

How do I estimate the hazard ratio for different groups?

That may be too ambiguous, so allow me to explain.

 

# Scenario 1 -- simple estimation of the hazard ratio (without adjustment for any confounder)

This is straightforward: the following code estimates the hazard ratio for the independent variable MorningBPSurge.

PROC PHREG DATA=one;
	MODEL followtime*allcause(0)=MorningBPSurge / RISKLIMITS;
RUN;

# Scenario 2 -- multivariable hazard ratio (with adjustment for confounding)

This is also easy to do and to understand: we just add the confounders we want to adjust for to the model.

For example, to estimate the hazard ratio for the independent variable (MorningBPSurge) while adjusting for two confounders (Age and Sex):

PROC PHREG DATA=one;
	MODEL followtime*allcause(0)=MorningBPSurge Age Sex / RISKLIMITS;
RUN;

 

However, the next case really confuses me.

 

# Scenario 3 -- hazard ratio for different groups while adjusting for confounders

Let's say we want to know whether the hazard ratio for the independent variable varies between two groups, while also taking confounding into account.

For example, we want to know whether the hazard ratio of MorningBPSurge differs between group A and group B.

Thus, we have

A) Independent variable: MorningBPSurge

B) Group variable: H_BP_Night, 0 if the patient has no hypertension during nighttime; 1 if he/she has.

C) Confounders: Age, Sex

What I usually do is use the BY statement to get the HR within each group:

PROC PHREG DATA=one;
	BY H_BP_Night;
	MODEL followtime*allcause(0)=MorningBPSurge Age Sex / RISKLIMITS;
RUN;

Then, for the P-value, I add the interaction term MorningBPSurge*H_BP_Night and check whether it is significant:

PROC PHREG DATA=one;
	MODEL followtime*allcause(0)=MorningBPSurge Age Sex 
	      H_BP_Night MorningBPSurge*H_BP_Night/ RISKLIMITS;
RUN;

Now that the situation is clear, my questions are:

1) Is using BY together with an interaction term correct for what I want?

2) Do I really need to run these separately, or is there an existing statement that can do this in one step?

 

Any ideas are very welcome.

Thanks in advance!

 

 

Accepted Solutions
JacobSimonsen
Barite | Level 11

I would also encourage you to use the HAZARDRATIO statement. It can easily calculate the estimated hazard ratio for each level of an effect modifier (effect modification and interaction are just two words for the same thing).
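
To make the link between an interaction term and group-specific hazard ratios concrete, here is a sketch in generic notation. The symbols are illustrative and not taken from the thread: $x$ is the exposure, $g \in \{0,1\}$ the group indicator, and $z$ the vector of confounders.

```latex
h(t \mid x, g, z) = h_0(t)\,\exp\!\left(\beta_1 x + \beta_2 g + \beta_3\, x g + \gamma^{\top} z\right)

\mathrm{HR}_{x}(g) = \frac{h(t \mid x+1, g, z)}{h(t \mid x, g, z)} = \exp\!\left(\beta_1 + \beta_3\, g\right)
\quad\Longrightarrow\quad
\mathrm{HR}_{x}(0) = e^{\beta_1}, \qquad \mathrm{HR}_{x}(1) = e^{\beta_1 + \beta_3}
```

Under this parameterization, the test of $\beta_3 = 0$ is the test of whether the hazard ratio for $x$ differs between the two groups, and the group-specific hazard ratios are the kind of quantity the HAZARDRATIO statement reports for each level of the interacting variable.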

 

One small problem I have with it is that it doesn't always select the reference level correctly when using the GLM parameterization. If the default parameterization is used, then both the main effects and the interaction effect should be in the MODEL statement.

 

Here is a simple example.

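data mydata;
  do group=1,2;
    do exposure='yes','no';
      do i=1 to 10000;
        /* baseline rate 0.1, tripled in group 2. Exposure multiplies the rate by 1.5 in group 1 and by 2 in group 2 */
        rate=0.1*(3**(group=2))*(1.5**((exposure='yes')*(group=1)))*(2**((exposure='yes')*(group=2)));
        /* exponential event time with mean 1/rate, no censoring */
        t=rand('exponential',1/rate);
        output;
      end;
    end;
  end;
run;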

*estimate the exposure effect for each of the two groups;
proc phreg data=mydata;
  class group exposure(ref='no');
  model t=exposure*group exposure group;
  hazardratio exposure/at(group=all) diff=ref;
run;

*Same again, but with the GLM parameterization - then it is enough to specify only the interaction, since it includes the main effects here;
proc phreg data=mydata;
  class group exposure(ref='no')/param=glm;
  model t=exposure*group;
  hazardratio exposure/at(group=all) diff=ref;
run;
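
Applied to the variables in the original question, the same pattern might look roughly like the sketch below. It assumes that H_BP_Night is coded 0/1 and that MorningBPSurge, Age and Sex are numeric, as the MODEL statements in the question suggest. The label text is made up for illustration.

```sas
/* Sketch only: assumes H_BP_Night is coded 0/1 and that MorningBPSurge, */
/* Age and Sex are numeric, as in the MODEL statements of the question.  */
proc phreg data=one;
  class H_BP_Night(ref='0') / param=ref;
  model followtime*allcause(0) = MorningBPSurge H_BP_Night
        MorningBPSurge*H_BP_Night Age Sex;
  /* HR of MorningBPSurge (per one-unit increase) at each level of H_BP_Night */
  hazardratio 'MorningBPSurge by night BP group' MorningBPSurge
        / at(H_BP_Night=all) cl=wald;
run;
```

In a single run this should give one adjusted hazard ratio for MorningBPSurge per level of H_BP_Night, while the row for MorningBPSurge*H_BP_Night in the parameter estimates table carries the P-value for whether those two hazard ratios differ.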

 

 

If you use the BY statement, a separate model is fitted for each value of the BY variable, so each group gets its own baseline hazard function (and its own coefficients for the other covariates). The estimates will therefore not be exactly the same as those from the interaction model, and there is also a small loss of statistical power.
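
The difference between the two approaches can be written out as follows. This is a sketch in generic notation that is not from the thread: $x$ is the exposure, $g \in \{0,1\}$ the BY/group variable, and $z$ the confounders.

```latex
% BY-group analysis: everything is estimated separately within each group g
h_g(t \mid x, z) = h_{0g}(t)\,\exp\!\left(\beta_{1g}\, x + \gamma_g^{\top} z\right), \qquad g = 0, 1

% Single model with interaction: one baseline hazard and one set of confounder effects
h(t \mid x, g, z) = h_0(t)\,\exp\!\left(\beta_1 x + \beta_2 g + \beta_3\, x g + \gamma^{\top} z\right)
```

The BY analysis spends extra parameters on $h_{0g}(t)$ and $\gamma_g$ in each group, which is where the slightly different estimates and the small loss of power come from.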

 

Good luck;-)

Reeza
Super User

Have you looked at using the HAZARDRATIO statement specifically?

It allows you to specify exactly what you're looking for as far as I see.

 

Docs:

http://documentation.sas.com/?docsetId=statug&docsetVersion=14.2&docsetTarget=statug_phreg_syntax13....

 

A whitepaper on several options:

https://support.sas.com/resources/papers/proceedings10/253-2010.pdf

Reeza
Super User

Using BY to get an estimate for each group is incorrect: that fits a separate model for each level of your BY variable. It doesn't sound like that's what you're actually interested in, or at least in my experience that wouldn't be correct.

Chung-Li
Quartz | Level 8
Reeza,

Thank you for this!
Finally, I understand what's wrong with using "by".
Also, thank you for providing the reference for the HAZARDRATIO statement.
Chung-Li
Quartz | Level 8

Jacob,

 

Thank you for providing not only the concept but also SAS code!

After checking the reference that Reeza gave me, I wrote SAS code nearly the same as yours.

HAZARDRATIO is such a powerful statement; it makes getting these estimates much more convenient.

Thank you again for introducing this powerful tool to me!

 
