Hello everyone,
I am calculating risk differences with the lag function. I have an exposure category variable with 3 levels, and I want to calculate each difference relative to the reference category, but I have not succeeded. Any suggestions?
Help, please.
I am using the code below. The difference between category 2 and the reference is correct, but the third category is compared with the previous category rather than with the reference (which I understand is how the lag function works). Is there any way to set a reference category?
data test;
set test;
by time exposure;
rd = risk - lag(risk);
rr = risk / lag(risk);
if first.time then rd=.;
if first.time then rr=.;
run;
Please amend the HAVE sample data below if it is not suitable for your problem.
Once you have representative sample data, what should the desired result look like?
data have;
infile datalines dsd truncover;
input time exposure risk;
datalines;
1,1,1
1,1,2
1,2,3
2,1,1
2,3,3
;
data want;
set have;
by time exposure;
rd = risk - lag(risk);
rr = risk / lag(risk);
if first.time then call missing(rd,rr);
run;
proc print data=want;
run;
I will try, thank you
What is the reference category?
Please also supply usable example data (in a working data step with datalines), and the expected result.
The exposure category has 3 values (0, 1 and 2) and the reference category is 0.
Example data:
data test;
input exposure time risk;
datalines;
0 0 0.01413
1 0 0.0125
2 0 0.1077
0 1 0.0491
1 1 0.0838
2 1 0.0678
;
I want to estimate the risk difference based on the reference category using the lag function.
Using your sample data, what would the desired result look like? Providing such WANT data often helps a lot to clarify the requirements.
Without that WANT data I'm not sure I've got the formulas below right, but the code should give you the idea anyway.
data test;
input exposure time risk;
datalines;
0 0 0.01413
1 0 0.0125
2 0 0.1077
0 1 0.0491
1 1 0.0838
2 1 0.0678
;
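data want;
  set test;
  by time exposure;
  /* one slot per exposure level (0, 1, 2); not written to the output */
  array risks {0:2} 8 _temporary_;
  risks[exposure] = risk;
  if last.time then
    do;
      /* formulas as posted; adjust them to the comparison you need */
      rd = risks[0] - risks[1];
      rr = risks[0] / risks[2];
      output;
      /* clear the array before the next time group starts */
      call missing(of risks[*]);
    end;
  keep time rd rr;
run;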
proc print data=want;
run;
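Note: The values in a _temporary_ array are retained across data step iterations, which is why the array is cleared with call missing at the end of each time group.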
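If you would rather keep one row per exposure level, with each non-reference level compared against level 0, a RETAIN variable can hold the reference risk instead of LAG. The following is only a sketch under some assumptions: the data are sorted by time and exposure, the reference level is 0 and comes first within each time, and rd and rr are meant as "level minus reference" and "level over reference". The names want_by_level and ref_risk are made up for illustration.

```sas
data test;
  input exposure time risk;
  datalines;
0 0 0.01413
1 0 0.0125
2 0 0.1077
0 1 0.0491
1 1 0.0838
2 1 0.0678
;

data want_by_level;                /* hypothetical output name */
  set test;
  by time exposure;
  retain ref_risk;                 /* hypothetical name; holds the level-0 risk */
  /* forget the previous time group's reference */
  if first.time then ref_risk = .;
  if exposure = 0 then ref_risk = risk;
  else if not missing(ref_risk) then
    do;
      rd = risk - ref_risk;        /* assumed: level minus reference */
      if ref_risk ne 0 then rr = risk / ref_risk;
    end;
run;

proc print data=want_by_level;
run;
```

On the reference rows rd and rr stay missing, because they are not retained and are only assigned for the other levels. If a time group has no level-0 row, they stay missing for that whole group.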
Thank you very much!