BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Junyong
Pyrite | Level 9

I have the following raw data.

data have;
	do group="a","b";
		do date=1 to 10;
			x=ranpoi(1,5);
			output;
		end;
	end;
run;
group date x
a      1    3
a      2   10
a      3    4
a      4    3
a      5    8
a      6   10
a      7    5
a      8    5
a      9    2
a     10    2
b      1    7
b      2    5
b      3    7
b      4    2
b      5    9
b      6    4
b      7    4
b      8    6
b      9   10
b     10    3

How can I calculate the cumulative percents of the number of dates of X=0,1,...,10 by group? For example,

group x  count  cuper
a      0     0    0.0
a      1     0    0.0
a      2     2    0.2
a      3     2    0.4
a      4     1    0.5
a      5     2    0.7
a      6     0    0.7
a      7     0    0.7
a      8     1    0.8
a      9     0    0.8
a     10     2    1.0
b      0     0    0.0
b      1     0    0.0
b      2     1    0.1
b      3     1    0.2
b      4     2    0.4
b      5     1    0.5
b      6     1    0.6
b      7     2    0.8
b      8     0    0.0
b      9     1    1.0
b     10     1    1.0

I am already doing the following, but it seems inefficient spending a lot of lines.

proc sql;
	create table _1 as
		select unique group,x,n(date) as count
		from have
		group by group,x
		order by group,x;
quit;

data _2;
	do group="a","b";
		do x=0 to 10;
			output;
		end;
	end;
run;

proc sql;
	create table _3 as
		select _2.group,_2.x,ifn(count>.,count,0) as count,calculated count/sum(calculated count) as per
		from _2 full join _1 on _2.group=_1.group and _2.x=_1.x
		group by _2.group
		order by group,x;
quit;

data _3;
	set _3;
	by group x;
	if first.group then cuper=0;
	cuper+per;
run;

Is there a more efficient solution?

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

If the data is sorted by group:

proc freq data=  have noprint;
   by group;
   tables date /outcum out=want;
run;

The default output here will also have a cumulative count of date and the percent. You can drop those.

 

IF you have other variables involved then you could merge this data back on based on group and date.

View solution in original post

2 REPLIES 2
smantha
Lapis Lazuli | Level 10
Proc freq is an alternative to all this code. You get cumulative count and percentage as well
https://support.sas.com/resources/papers/proceedings/proceedings/sugi25/25/btu/25p069.pdf
ballardw
Super User

If the data is sorted by group:

proc freq data=  have noprint;
   by group;
   tables date /outcum out=want;
run;

The default output here will also have a cumulative count of date and the percent. You can drop those.

 

IF you have other variables involved then you could merge this data back on based on group and date.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1487 views
  • 0 likes
  • 3 in conversation