BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ronein
Onyx | Level 15

Hello

I want to accumulate by group .

The group is  more than 1 grouping variable.

I want to ask 2 questions please:

1-Why my code is giving wrong results?

2-How can create the calculation by using  multiple FIRST. statements 

 

Data raw_tbl;
input ID group $  group2 $ Y;
cards;
1 a One 5
2 a One 10
3 a Two 15
4 a Two 20
5 b One 25
6 b Two 30
7 b Two 35
8 c Two 40
9 c Two 45
10 c One 50
;
run;


data raw_tbl2;
set raw_tbl;
group_CAT=CATX(',',group,group2);
Run;

proc sort data=raw_tbl2;by group_CAT;Run;
data required;
set raw_tbl2;
by group_CAT;
retain Accum_Y;
if first.group_CAT  then  Accum_Y=Y;
Accum_Y+y;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @Ronein 

 

If you use the variable group_cat, you just need to reinitialize the counter to 0 each time there is a new Group_cat value:

 

data required;
set raw_tbl2;
by group_CAT;
if first.group_CAT  then  Accum_Y=0;
Accum_Y+y;
run;

You can simplify your code as follow: 

 

 

Data raw_tbl;
	input ID group $  group2 $ Y;
	cards;
1 a One 5
2 a One 10
3 a Two 15
4 a Two 20
5 b One 25
6 b Two 30
7 b Two 35
8 c Two 40
9 c Two 45
10 c One 50
;
run;

proc sort data=raw_tbl;
	by group group2;
Run;

data required;
	set raw_tbl;
	by group group2;
	if first.group2 then Accum_Y=0;
	Accum_Y+y;
run;

 

 

The rational is that when you specify "BY" variables, SAS creates the following internal corresponding variables : Last. and First.

First.<var> = 1 for the first record and 0 otherwise.

In the above example, the first and last values would be as follow. So you just need to reinitialize the counter when First.group2 = 0.

 

ID	group   group2  Y	First.group First.group2
1	a		One 	 5		1		1
2	a		One 	10		0		0
3	a 		Two 	15		0		1
4	a		Two 	20		0		0
5	b		One 	25		1		1
6	b		Two		30		0		1
7	b		Two 	35		0		0
8 	c		Two 	40		1		1
9	c		Two		45		0		0
10	c		One 	50		0		1

 

View solution in original post

3 REPLIES 3
PaigeMiller
Diamond | Level 26

What wrong results is it giving?

 

Yes you can have two different FIRST. in the same data step

 

proc sort data = raw_tbl;
    by group group2;
run;

data want;
    set raw_tbl;
    by group group2;
    if first.group2 and first.group; 
    /* Although I don't think the above is what you want, I think you want */
    /* if first.group2; */
    /* but that's up to you */
...
run;
--
Paige Miller
ed_sas_member
Meteorite | Level 14

Hi @Ronein 

 

If you use the variable group_cat, you just need to reinitialize the counter to 0 each time there is a new Group_cat value:

 

data required;
set raw_tbl2;
by group_CAT;
if first.group_CAT  then  Accum_Y=0;
Accum_Y+y;
run;

You can simplify your code as follow: 

 

 

Data raw_tbl;
	input ID group $  group2 $ Y;
	cards;
1 a One 5
2 a One 10
3 a Two 15
4 a Two 20
5 b One 25
6 b Two 30
7 b Two 35
8 c Two 40
9 c Two 45
10 c One 50
;
run;

proc sort data=raw_tbl;
	by group group2;
Run;

data required;
	set raw_tbl;
	by group group2;
	if first.group2 then Accum_Y=0;
	Accum_Y+y;
run;

 

 

The rational is that when you specify "BY" variables, SAS creates the following internal corresponding variables : Last. and First.

First.<var> = 1 for the first record and 0 otherwise.

In the above example, the first and last values would be as follow. So you just need to reinitialize the counter when First.group2 = 0.

 

ID	group   group2  Y	First.group First.group2
1	a		One 	 5		1		1
2	a		One 	10		0		0
3	a 		Two 	15		0		1
4	a		Two 	20		0		0
5	b		One 	25		1		1
6	b		Two		30		0		1
7	b		Two 	35		0		0
8 	c		Two 	40		1		1
9	c		Two		45		0		0
10	c		One 	50		0		1

 

s_lassen
Meteorite | Level 14

As others have already shown, you can have more than one variable in a BY statement.

 

And I think most of the suggestions you got solved the calculation problem you had. But to answer you question:

1-Why my code is giving wrong results?

I think you are missing an ELSE. Your summation code is

if first.group_CAT  then  Accum_Y=Y;
Accum_Y+y;

Which means that in the first OBS in the group, the accumulated value will be Y+Y, and that error will persist throughout the result data.

Just change it to

if first.group_CAT  then  Accum_Y=Y;
else Accum_Y+y;

and I think you will get the right result.

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1628 views
  • 2 likes
  • 4 in conversation