Hello, I am learning SAS and I have this example data:
| race | ID | Fatality | Surface |
| xxx001 | 1 | Y | Dirt |
| xxx001 | 2 | Y | Synth |
| xxx001 | 3 | N | Dirt |
| xxx001 | 4 | N | Dirt |
| xxx001 | 5 | N | Synth |
| xxx001 | 6 | N | Dirt |
| xxx001 | 7 | Y | Synth |
| xxx001 | 8 | Y | Synth |
| xxx002 | 9 | Y | Synth |
| xxx002 | 10 | Y | Dirt |
| xxx002 | 11 | N | Turf |
| xxx002 | 12 | N | Dirt |
| xxx002 | 13 | N | Dirt |
| xxx002 | 14 | N | Dirt |
| xxx002 | 15 | Y | Turf |
| xxx002 | 16 | Y | Turf |
| xxx003 | 17 | N | Synth |
| xxx003 | 18 | N | Dirt |
| xxx003 | 19 | N | Dirt |
| xxx003 | 20 | N | Dirt |
| xxx003 | 21 | Y | Dirt |
| xxx003 | 22 | Y | Dirt |
| xxx003 | 23 | N | Dirt |
What is the best way of calculating fatality and non fatality rates by surface type in SAS?
Thank you
If you like a little Data Step programming then here is the way. Count the Ys and Ns by Surface. Then find Ys/(Ys +Ns) *100 as the Fatality ratio. Then Non Fatality Ratio is 100 - Fatality Ratio.
data have;
input race $6. ID @13 Fatality $1. Surface $;
datalines;
xxx001 1 Y Dirt
xxx001 2 Y Synth
xxx001 3 N Dirt
xxx001 4 N Dirt
xxx001 5 N Synth
xxx001 6 N Dirt
xxx001 7 Y Synth
xxx001 8 Y Synth
xxx002 9 Y Synth
xxx002 10 Y Dirt
xxx002 11 N Turf
xxx002 12 N Dirt
xxx002 13 N Dirt
xxx002 14 N Dirt
xxx002 15 Y Turf
xxx002 16 Y Turf
xxx003 17 N Synth
xxx003 18 N Dirt
xxx003 19 N Dirt
xxx003 20 N Dirt
xxx003 21 Y Dirt
xxx003 22 Y Dirt
xxx003 23 N Dirt
;
run;
data want;
set have end = eof;
surf = 'Dirt Synth Turf';
array Y[3] _temporary_;
array N[3] _temporary_;
i = findw(surf, trim(Surface), ' ', 'E');
if Fatality = 'Y' then Y[i] + 1;
else N[i] + 1;
if eof then do i = 1 to 3;
Surface = scan(surf, i);
FatalityRatio = Y[i] * 100 / (Y[i] + N[i]);
output;
end;
keep Surface FatalityRatio;
run;
The output is:
Obs Surface FatalityRatio 1 Dirt 28.5714 2 Synth 66.6667 3 Turf 66.6667
If you like a little Data Step programming then here is the way. Count the Ys and Ns by Surface. Then find Ys/(Ys +Ns) *100 as the Fatality ratio. Then Non Fatality Ratio is 100 - Fatality Ratio.
data have;
input race $6. ID @13 Fatality $1. Surface $;
datalines;
xxx001 1 Y Dirt
xxx001 2 Y Synth
xxx001 3 N Dirt
xxx001 4 N Dirt
xxx001 5 N Synth
xxx001 6 N Dirt
xxx001 7 Y Synth
xxx001 8 Y Synth
xxx002 9 Y Synth
xxx002 10 Y Dirt
xxx002 11 N Turf
xxx002 12 N Dirt
xxx002 13 N Dirt
xxx002 14 N Dirt
xxx002 15 Y Turf
xxx002 16 Y Turf
xxx003 17 N Synth
xxx003 18 N Dirt
xxx003 19 N Dirt
xxx003 20 N Dirt
xxx003 21 Y Dirt
xxx003 22 Y Dirt
xxx003 23 N Dirt
;
run;
data want;
set have end = eof;
surf = 'Dirt Synth Turf';
array Y[3] _temporary_;
array N[3] _temporary_;
i = findw(surf, trim(Surface), ' ', 'E');
if Fatality = 'Y' then Y[i] + 1;
else N[i] + 1;
if eof then do i = 1 to 3;
Surface = scan(surf, i);
FatalityRatio = Y[i] * 100 / (Y[i] + N[i]);
output;
end;
keep Surface FatalityRatio;
run;
The output is:
Obs Surface FatalityRatio 1 Dirt 28.5714 2 Synth 66.6667 3 Turf 66.6667
Thank you that worked perfectly. How do I do it by year if also have time variable?
data have;
input race $6. ID @13 Fatality $1. Surface $;
datalines;
xxx001 1 Y Dirt
xxx001 2 Y Synth
xxx001 3 N Dirt
xxx001 4 N Dirt
xxx001 5 N Synth
xxx001 6 N Dirt
xxx001 7 Y Synth
xxx001 8 Y Synth
xxx002 9 Y Synth
xxx002 10 Y Dirt
xxx002 11 N Turf
xxx002 12 N Dirt
xxx002 13 N Dirt
xxx002 14 N Dirt
xxx002 15 Y Turf
xxx002 16 Y Turf
xxx003 17 N Synth
xxx003 18 N Dirt
xxx003 19 N Dirt
xxx003 20 N Dirt
xxx003 21 Y Dirt
xxx003 22 Y Dirt
xxx003 23 N Dirt
;
run;
proc sql;
create table want as
select surface, sum(fatality='Y')/n(surface) as Fatality_ratio /*if you want->format=percent5.2*/
from have
group by surface;
quit;
@Reeza 's words have demonstrated
Datastep isn't convenient for a column operation that SQL/MEANS/SUMMARY offers in parallel that can be coded at ease, nonetheless fun stuff
data have;
input race $6. ID @13 Fatality $1. Surface $;
datalines;
xxx001 1 Y Dirt
xxx001 2 Y Synth
xxx001 3 N Dirt
xxx001 4 N Dirt
xxx001 5 N Synth
xxx001 6 N Dirt
xxx001 7 Y Synth
xxx001 8 Y Synth
xxx002 9 Y Synth
xxx002 10 Y Dirt
xxx002 11 N Turf
xxx002 12 N Dirt
xxx002 13 N Dirt
xxx002 14 N Dirt
xxx002 15 Y Turf
xxx002 16 Y Turf
xxx003 17 N Synth
xxx003 18 N Dirt
xxx003 19 N Dirt
xxx003 20 N Dirt
xxx003 21 Y Dirt
xxx003 22 Y Dirt
xxx003 23 N Dirt
;
run;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("surface") ;
h.definedata ("surface","_N","_count") ;
h.definedone () ;
dcl hiter hi('h');
end;
do until(z);
set have end=z;
if h.find() ne 0 then do;
_N=1;_count=0;
end;
else _n=sum(_n,1);
_count=sum(_count,Fatality='Y');
h.replace();
end;
do while(hi.next()=0);
Fatality_ratio=_count/_n;
output;
end;
keep Surface Fatality_ratio;
run;
SORT and DOW
data have;
input race $6. ID @13 Fatality $1. Surface $;
datalines;
xxx001 1 Y Dirt
xxx001 2 Y Synth
xxx001 3 N Dirt
xxx001 4 N Dirt
xxx001 5 N Synth
xxx001 6 N Dirt
xxx001 7 Y Synth
xxx001 8 Y Synth
xxx002 9 Y Synth
xxx002 10 Y Dirt
xxx002 11 N Turf
xxx002 12 N Dirt
xxx002 13 N Dirt
xxx002 14 N Dirt
xxx002 15 Y Turf
xxx002 16 Y Turf
xxx003 17 N Synth
xxx003 18 N Dirt
xxx003 19 N Dirt
xxx003 20 N Dirt
xxx003 21 Y Dirt
xxx003 22 Y Dirt
xxx003 23 N Dirt
;
run;
proc sort data=have;
by Surface ;
run;
data want;
do _n_=1 by 1 until(last.surface);
set have;
by surface;
_c=sum(_c,fatality='Y');
end;
Fatality_ratio=_c/_n_;
keep surface Fatality_ratio;
run;
I was sleeping when you inquired on 'year and time'.
I don't understand the input data set for this situation. Please place your question with a new subject line so that a new thread is created which will be current.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.