I have the following raw data.
data have;
do group="a","b";
do date=1 to 10;
x=ranpoi(1,5);
output;
end;
end;
run;group date x a 1 3 a 2 10 a 3 4 a 4 3 a 5 8 a 6 10 a 7 5 a 8 5 a 9 2 a 10 2 b 1 7 b 2 5 b 3 7 b 4 2 b 5 9 b 6 4 b 7 4 b 8 6 b 9 10 b 10 3
How can I calculate the cumulative percents of the number of dates of X=0,1,...,10 by group? For example,
group x count cuper a 0 0 0.0 a 1 0 0.0 a 2 2 0.2 a 3 2 0.4 a 4 1 0.5 a 5 2 0.7 a 6 0 0.7 a 7 0 0.7 a 8 1 0.8 a 9 0 0.8 a 10 2 1.0 b 0 0 0.0 b 1 0 0.0 b 2 1 0.1 b 3 1 0.2 b 4 2 0.4 b 5 1 0.5 b 6 1 0.6 b 7 2 0.8 b 8 0 0.0 b 9 1 1.0 b 10 1 1.0
I am already doing the following, but it seems inefficient spending a lot of lines.
proc sql;
create table _1 as
select unique group,x,n(date) as count
from have
group by group,x
order by group,x;
quit;
data _2;
do group="a","b";
do x=0 to 10;
output;
end;
end;
run;
proc sql;
create table _3 as
select _2.group,_2.x,ifn(count>.,count,0) as count,calculated count/sum(calculated count) as per
from _2 full join _1 on _2.group=_1.group and _2.x=_1.x
group by _2.group
order by group,x;
quit;
data _3;
set _3;
by group x;
if first.group then cuper=0;
cuper+per;
run;Is there a more efficient solution?
If the data is sorted by group:
proc freq data= have noprint; by group; tables date /outcum out=want; run;
The default output here will also have a cumulative count of date and the percent. You can drop those.
IF you have other variables involved then you could merge this data back on based on group and date.
If the data is sorted by group:
proc freq data= have noprint; by group; tables date /outcum out=want; run;
The default output here will also have a cumulative count of date and the percent. You can drop those.
IF you have other variables involved then you could merge this data back on based on group and date.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.