Fluorite | Level 6

Hello, All

Suppose I have the following dataset (the response is continuous; group is ordinal with 4 levels).

data test;
  input response group;
  datalines;
75 1
45 1
89 1
47 1
59 2
100 2
76 2
123 2
65 3
43 3
45 3
150 3
56 4
67 4
89 4
100 4
;
run;

The response relates to the probability (risk) of getting a disease; the bigger the response, the higher the risk.

The goal is to get an estimate of the risk ratio; how should I do that?

Super User

proc logistic?
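
For reference, a minimal sketch of how PROC LOGISTIC could be applied to the posted data. It assumes the four groups are collapsed into a hypothetical binary indicator `diseased` (groups 2 to 4 versus group 1); the data set name `test2` and the 10-unit step in the UNITS statement are illustrative choices, not something from the original post. Note that this yields odds ratios, which approximate risk ratios only when the outcome is rare.

```sas
/* Hypothetical indicator: 1 = any illness (groups 2-4), 0 = healthy (group 1) */
data test2;
  set test;
  diseased = (group > 1);
run;

proc logistic data=test2;
  /* Model the probability that diseased = 1 as a function of response */
  model diseased(event='1') = response;
  /* Also report the odds ratio for a 10-unit increase (10 is an arbitrary choice) */
  units response = 10;
run;
```

With only 16 observations the estimates will be very imprecise, so this is useful for learning the syntax rather than for drawing conclusions.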

Fluorite | Level 6

Thank you for the help.

I will explain the data in more detail:

  • The response relates to the probability (risk) of getting the disease; the bigger the response, the higher the risk
  • group 1: healthy people
  • group 2: minor illness
  • group 3: intermediate illness
  • group 4: severe illness

The purpose: using response as an indicator, obtain the relative risk between groups, e.g. with group=1 as the control group.

Fluorite | Level 6

Here is the methodology.

Do let me know if you need any more help.



Opal | Level 21

I've never done anything in the area of health, but I definitely worked with the concept of risk when I was in the insurance industry.  I've noticed that the terms relative risk and risk ratio are used synonymously in a number of areas.  However, their definitions have always implied likelihood and the ability to compare groups.

In insurance, claim frequency would be such a measure, as it is simply the likelihood of an event occurring.  The definitions I've seen for relative risk set 1 to mean no difference between the risks of two groups, with numbers greater or less than 1 indicating more or less risk; such a definition loses the benefit of the basic measure.

When 0 means no risk and 1.0 means certainty of an event occurring, any number in between has the properties needed to meet most statistical assumptions.  I.e., a risk of .5 is twice as great as a risk of .25, etc.

And, according to most of the literature I've read, the frequency of an event occurring follows a Poisson distribution, so the transformation necessary to normalize the distribution is known.

In short, before trying to give you an answer, my suggestion would be for you to first ask the researchers you are doing this for, exactly what they are expecting to achieve and how the metric should be calculated.

Fluorite | Level 6

Honestly, I am also confused by "relative risk" and "risk ratio".

I think what the data provider wants to know is:

  1. If the response increases by 1 unit, what is the probability that people will get the disease? Or
  2. For a defined response, what is the probability that people have the disease?

Opal | Level 21

Again, out of my area, but isn't that what the hazard ratio attempts to approximate?  Take a look at:

Fluorite | Level 6

Thank you very much.  I looked over the article and googled hazard ratio; it seems the methodology relates to survival analysis, which I have never done before.

Assuming the hazard ratio is what I want, how should I write the SAS code (PROC PHREG?) using the data I have provided in this post?

Super User

You might want to be careful with your usage of PROC PHREG. In survival analysis, longer times (responses) are good, whereas in your example a larger response increases the risk, so it is not good.

You might want to look at the failure probabilities rather than the survival probabilities in this case.
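
As a syntax illustration only, here is a minimal sketch of a PROC PHREG call on the posted data. It assumes `response` is placed in the role of the time variable and that every observation is an event (no censoring variable is given), which is exactly the usage the caution above applies to. Under that setup a hazard ratio above 1 for a group corresponds to smaller response values, which by the original description means lower risk, so the direction has to be read with care.

```sas
proc phreg data=test;
  /* Treat group as categorical, with group 1 (healthy) as the reference level */
  class group (ref='1') / param=ref;
  /* response stands in for the "time" variable; with no censoring
     variable specified, every row is treated as an event */
  model response = group / ties=efron;
  /* Hazard ratios of groups 2, 3 and 4, each compared with group 1 */
  hazardratio group / diff=ref;
run;
```

The `ties=efron` option is an illustrative choice for handling the tied response values in the data (for example, 45 appears in groups 1 and 3); the default Breslow method would also run.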

Fluorite | Level 6

Thank you for the reminder. I will look deeper into survival analysis.

The data I provided are artificial; right now I just want to use them to learn how to write SAS code for a hazard ratio. But again, thank you very much for the reminder.
