littlestone
Fluorite | Level 6

Hello, all,

Suppose I have the following dataset (the response is continuous; group is ordinal with 4 levels).

data test;
input response group;
cards;
75 1
45 1
89 1
47 1
59 2
100 2
76 2
123 2
65 3
43 3
45 3
150 3
56 4
67 4
89 4
100 4
;

The response relates to the probability (risk) of getting a disease; the bigger the response, the higher the risk.

The goal is to get an estimate of the risk ratio; how should I do that?

9 REPLIES
Ksharp
Super User

proc logistic?

littlestone
Fluorite | Level 6

Thank you for your help.

I will explain the data in more detail:

  • The response relates to the probability (risk) of getting the disease; the bigger the response, the higher the risk
  • group 1: healthy people
  • group 2: minor illness
  • group 3: intermediate illness
  • group 4: severe illness

The purpose: using response as an indicator, obtain the relative risk between groups, e.g. with group 1 as the control group.

sivaji
Fluorite | Level 6

Here is the methodology:

http://bioterrorism.slu.edu/bt/products/bio_epi/scripts/mod12.pdf

Do let me know if you need any more help.

HTH

Sivaji

art297
Opal | Level 21

I've never done anything in the area of health, but I definitely worked with the concept of risk when I was in the insurance industry.  I've noticed that the terms relative risk and risk ratio are used synonymously in a number of areas.  However, their definitions have always implied likelihood and being able to compare groups.

In insurance, claim frequency would be such a measure, as it is simply the likelihood of an event occurring.  The definitions I've seen for relative risk set 1 to mean no difference between the risks of two groups, with numbers greater or less than 1 indicating more or less risk; such a definition loses the benefit of the basic measure.

When 0 means no risk and 1.0 means certainty of an event occurring, any number in between has the properties needed to meet most statistical assumptions.  I.e., a risk of .5 is twice as great as a risk of .25, etc.

And, according to most of the literature I've read, the frequency of an event occurring follows a Poisson distribution, so the transformation necessary to normalize the distribution is known.

In short, before trying to give you an answer, my suggestion would be for you to first ask the researchers you are doing this for, exactly what they are expecting to achieve and how the metric should be calculated.

littlestone
Fluorite | Level 6

Honestly, I am also confused by "relative risk" and "risk ratio".

I think what the data provider wants to know is:

  1. If the response increases by 1 unit, what is the probability that people will get the disease? Or
  2. For a defined response, what is the probability that people have the disease?
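
The two questions above can be sketched with PROC LOGISTIC, as suggested earlier in the thread. This is only a minimal sketch under stated assumptions: `disease` is a hypothetical 0/1 indicator (not in the original data) that treats group 1 as healthy and groups 2 to 4 as diseased, and `test2`, `pred`, and `p_disease` are illustrative names. Note that the model reports an odds ratio per one-unit increase in response, which is not the same quantity as a risk ratio.

```sas
/* Assumption: derive a hypothetical binary outcome from group.        */
/* group 1 = healthy (0); groups 2-4 = some degree of illness (1).     */
data test2;
   set test;
   disease = (group > 1);
run;

proc logistic data=test2;
   /* Model the probability that disease = 1 as a function of response. */
   /* The odds ratio shown for response is per one-unit increase.       */
   model disease(event='1') = response;
   /* Save the predicted probability of disease for each observation.  */
   output out=pred p=p_disease;
run;

/* Inspect the predicted probability at each observed response value.  */
proc print data=pred;
   var response group disease p_disease;
run;
```

With only 16 artificial observations the estimates are for learning the syntax, not for inference.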
art297
Opal | Level 21

Again, out of my area, but isn't that what the hazard ratio attempts to approximate?  Take a look at: http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_phreg_sect03...

littlestone
Fluorite | Level 6

Thank you very much.  I looked over the article and googled hazard ratio; it seems the methodology relates to survival analysis, which I have never done before.

Assuming the hazard ratio is what I want, how should I write the SAS code (PROC PHREG?) using the data I provided in this post?

Reeza
Super User

You might want to be careful with your usage of proc phreg. In survival analysis, longer times (or responses) are good, whereas in your example a larger response increases the risk, so it is not good.

You might want to look at the failure probabilities rather than the survival probabilities in this case.

littlestone
Fluorite | Level 6

Thank you for the reminder. I will look deeper into survival analysis.

The data I provided are artificial; right now I just want to use them to learn how to write SAS code for a hazard ratio. But again, thank you very much for the reminder.
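
A minimal sketch of the PROC PHREG syntax asked about above, using the artificial `test` dataset from the first post. It assumes, purely for illustration, that response plays the role of the time variable and that no observation is censored; neither assumption comes from the thread. As cautioned in the reply above, the direction needs care: under this setup a hazard ratio above 1 for a group means its responses tend to be smaller than those of group 1, not larger.

```sas
proc phreg data=test;
   /* Treat group as categorical, with group 1 as the reference level. */
   class group (ref='1') / param=ref;
   /* Assumption: response stands in for "time"; no censoring variable */
   /* is given, so every observation is treated as an event.           */
   model response = group / ties=efron;
   /* Hazard ratios of groups 2, 3 and 4, each against group 1.        */
   hazardratio 'Group vs. group 1' group / diff=ref;
run;
```

If real follow-up data were available, the MODEL statement would instead take the form `time*status(0)`, where `time` and `status` are hypothetical variable names for the follow-up time and the censoring indicator.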
