Bal23
Lapis Lazuli | Level 10

I am required to calculate relative risks. Ideally, I am interested in code that can do both jobs: calculating RR and OR.

Thanks.

Reeza
Super User

I'm too lazy to make sample data. 

If you post sample data and sample calculations someone can help with the code.

Bal23
Lapis Lazuli | Level 10

Relative risk or risk ratio (RR) is the ratio of the probability of an event occurring (for example, developing a disease, being injured) in an exposed group to the probability of the event occurring in a comparison, non-exposed group. Relative risk includes two important features: (i) a comparison of risk between two "exposures" puts risks in context, and (ii) "exposure" is ensured by having proper denominators for each group representing the exposure [1][2].

\[ RR = \frac{p_{\text{event when exposed}}}{p_{\text{event when non-exposed}}} \]

Risk          Disease status
              Present    Absent
Smoker        a          b
Non-smoker    c          d

Consider an example where the probability of developing lung cancer among smokers was 20% and among non-smokers 1%. This situation is expressed in the 2 × 2 table above.

Here, a = 20, b = 80, c = 1, and d = 99. Then the relative risk of cancer associated with smoking would be

\[ RR = \frac{a/(a+b)}{c/(c+d)} = \frac{20/100}{1/100} = 20. \]

Smokers would be twenty times as likely as non-smokers to develop lung cancer.
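For a single 2 × 2 table like this one, PROC FREQ can report both RR and OR in one step. The sketch below is a minimal example, assuming the counts from the smoking example are entered as one row per cell. The data set name `lung` and the variable names `exposure`, `outcome`, and `count` are made up for illustration.

```sas
* Hypothetical data set with one row per cell of the 2 x 2 table;
data lung;
   length exposure $ 10 outcome $ 10;
   input exposure $ outcome $ count;
   datalines;
Smoker    Cancer   20
Smoker    NoCancer 80
NonSmoker Cancer    1
NonSmoker NoCancer 99
;
run;

* ORDER=DATA keeps Smoker as row 1 and Cancer as column 1;
* RELRISK requests the odds ratio and the relative risks for a 2 x 2 table;
proc freq data=lung order=data;
   weight count;
   tables exposure*outcome / relrisk;
run;
```

With this layout, the column 1 relative risk is the risk of cancer among smokers relative to non-smokers, which should match the hand calculation of 20. The odds ratio for the same table is (20 × 99)/(80 × 1) = 24.75.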

 

Please note that the first group's RR = 1.

ballardw
Super User

Your example data only shows the number of surgery events, 2126 (which implies 217486 - 2126 non-surgery cases). You do not have an exposure. The rate you calculated is a prevalence rate of surgery within the age group.

 

agegroup1    217486    2126    2126/21746*100

 

Note that the definition you posted shows that the exposure variable, which I have to assume is agegroup in your case, can only have two levels. You can calculate RR for agegroup1 vs agegroup2, agegroup1 vs agegroup3, and agegroup2 vs agegroup3, but not for all three at once. What you are attempting is really much closer to odds ratios, where you compare each group's ratio to a base group, likely the one with the lowest prevalence.

 

The value you show is 2126/21746 * 100 = 9.776. I could not tell from your data how you get a relative risk of 1; there is no other exposure group. Please show the values of a, b, c, and d that you used to get 1.

Bal23
Lapis Lazuli | Level 10

Here it is.

For the first one, RR is set to be 1 as the reference. All other calculations are based on this one.

Please see the attached file.

Reeza
Super User

 

data want;
set have;
*keep the reference rate across rows;
retain denom;
*the first row supplies the reference rate;
if _n_=1 then denom=c/d;
relrisk=(c/d)/denom;
run;
Bal23
Lapis Lazuli | Level 10

This is why I was asking at first about a SAS macro or array: there are so many rows and columns that it is time-consuming to type c, d, e, f, or to replace them one by one.

Reeza
Super User

@Bal23 wrote:

So it really does not matter what my data looks like; I am looking for sample code that deals with data with more than 2 categories.


 

The code answers the above.

 


@Bal23 wrote:

This is why I was asking at first about a SAS macro or array: there are so many rows and columns that it is time-consuming to type c, d, e, f, or to replace them one by one.


 

 

Your sample data shows none of this. 

Show sample data that reflects your problem. 

As for dealing with rows, SAS automatically loops through rows so you don't need to deal with rows. 
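To illustrate the point about rows: a DATA step runs its statements once for every observation it reads, and the automatic variable `_N_` counts those iterations. A minimal sketch follows. The data set names `demo` and `demo_rates`, the variable names, and the counts are all made up for illustration.

```sas
* Made-up counts, for illustration only;
data demo;
   input total events;
   datalines;
100 5
100 10
200 30
;
run;

* No explicit loop is written: these statements run once per row of DEMO;
data demo_rates;
   set demo;
   row  = _n_;
   rate = events / total;
run;

proc print data=demo_rates;
run;
```

The same statements would apply unchanged to three rows or three thousand, which is why no macro or array is needed just to cover more rows.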

Bal23
Lapis Lazuli | Level 10

I am sorry, I cannot understand much of the code. How do I define c and d? They come from my table, not from a dataset.

Do all the Cs have the same variable name? Can you explain a little more about it, or give more detailed code? Thanks.

Reeza
Super User

Here's a full worked example, with sample data created from your file.

*Create sample data;
data have;
input row agegroup $ b d;
cards;
1 age1 371130 11243
2 age2 214144 2214
3 age3 181841 5820
4 age4 168065 3288
;
run;

data want;
set have;

*tell sas to keep the denom across the rows;
retain denom;

*create the denominator value;
if _n_=1 then do; 
	rr=1;
	denom=d/b;
end;

*calculate relative risk;
else rr=(d/b)/denom;

*calculate percent;
pct=d/b;

*format variables for appearance;
format pct percent8.1 rr 8.2;

run;

*print results;
proc print data=want;
run;
Bal23
Lapis Lazuli | Level 10

Thank you. I think this code works, but it handles only one group at a time, and I have many groups to calculate. Please see my groups below. Some groups have four categories, some have two, and some have six. If I use your code, I need to run it for each group one by one. That is why I am asking for a SAS macro or SAS array to do it. Thank you again.

*Create sample data;
data have;
input group $ b d;
cards;
A 371130 11243
N 214144 2214
M 181841 5820
Fw 168065 3288

a  779199 17987
b  155981 4578

S17 12     .
S20 603224 14766
S25 263337 5918
O25 68607 1881
W   689531 17304

	
W	689531	17304
B 159304	3714
Ot 86345	1547
	
B 1766	76
S 806844	19491
 Co	74217	2014
Bhigh	52171	979
Unbp	182	5	
	
99su	66736	1373
92or	376550	8726
64p	259887	6356
;
run;
Reeza
Super User

Then I would say you need BY group processing.

Create a variable that identifies your group, say called MYGROUP.

 

http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000761931.htm

 

 

data have;
input mygroup $ agegroup $ b d;
cards;
A age1 371130 11243
A age2 214144 2214
A age3 181841 5820
A age4 168065 3288
B age1 371130 11243
B age2 214144 2214
B age3 181841 5820
B age4 168065 3288
;
run;

data want;
set have;
by mygroup;

*tell sas to keep the denom across the rows;
retain denom;

*create the denominator value;
if first.mygroup then do; 
	rr=1;
	denom=d/b;
end;

*calculate relative risk;
else rr=(d/b)/denom;

*calculate percent;
pct=d/b;

*format variables for appearance;
format pct percent8.1 rr 8.2;

run;
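One assumption worth stating: BY-group processing expects the input to be sorted (or at least grouped) by the BY variable, otherwise the DATA step stops with an error. The sketch below is a minimal illustration that reuses counts from the sample data but deliberately enters them out of group order. The output data set name `have_sorted` is made up.

```sas
* Counts taken from the sample data, deliberately entered out of group order;
data have;
   input mygroup $ agegroup $ b d;
   datalines;
A age1 371130 11243
B age1 371130 11243
A age2 214144 2214
B age2 214144 2214
;
run;

* EQUALS (the default) keeps the original row order within each group;
* so the first row entered for a group remains its reference row;
proc sort data=have out=have_sorted equals;
   by mygroup;
run;

data want;
   set have_sorted;
   by mygroup;
   retain denom;
   * the first row of each group sets the reference risk;
   if first.mygroup then denom = d / b;
   rr = (d / b) / denom;
run;
```

Because the reference row divides its own risk by itself, its `rr` comes out as 1 without a separate branch.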

 

 

 

 

Bal23
Lapis Lazuli | Level 10

Thanks.

 

please review

data want;
set have;

*tell sas to keep the denom across the rows;
retain denom;

*create the denominator value;
if _n_=1 then do; 
	rr=1;
	denom=d/b;
end;

*calculate relative risk;
else rr=(d/b)/denom;

*calculate percent and 95%CI;
pct=d/b;
J=SQRT((((B-D)/D)/B)+(((B-D)/D)/B));
Lcl=EXP(LOG(pct)-(1.96*J));
Ucl=EXP(LOG(pct)+(1.96*J));


*format variables for appearance;
format pct percent8.1 rr 8.2;

run;

my sample code

Reeza
Super User

You missed a line (line 3, the BY statement). How would SAS know where each group starts and ends?

 

Take some time to read the documentation on how this is handled.

 

https://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a001283274.htm
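For reference, here is one way the BY-group version could carry a 95% confidence interval for each relative risk. It is a sketch only, assuming the usual large-sample interval on the log scale, where for a group with d events out of b and a reference group with d0 events out of b0 the standard error of log(RR) is sqrt(1/d - 1/b + 1/d0 - 1/b0). The variable names `ref_b`, `ref_d`, `se_log_rr`, `lcl`, and `ucl` are made up for illustration. Note that the natural log function in the DATA step is LOG.

```sas
data have;
   input mygroup $ agegroup $ b d;
   datalines;
A age1 371130 11243
A age2 214144 2214
A age3 181841 5820
A age4 168065 3288
;
run;

data want;
   set have;
   by mygroup;

   * keep the reference counts across the rows of a group;
   retain ref_b ref_d;
   if first.mygroup then do;
      ref_b = b;
      ref_d = d;
   end;

   * relative risk against the first row of the group;
   rr = (d / b) / (ref_d / ref_b);

   * interval on the log scale, left missing for the reference row;
   if not first.mygroup then do;
      se_log_rr = sqrt(1/d - 1/b + 1/ref_d - 1/ref_b);
      lcl = exp(log(rr) - 1.96 * se_log_rr);
      ucl = exp(log(rr) + 1.96 * se_log_rr);
   end;

   format rr lcl ucl 8.2;
run;
```

Rows with a missing count would produce missing results and a note in the log, so it is worth checking the log before trusting the output.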
