BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ronein
Meteorite | Level 14

Hello

I want to accumulate by group .

The group is  more than 1 grouping variable.

I want to ask 2 questions please:

1-Why my code is giving wrong results?

2-How can create the calculation by using  multiple FIRST. statements 

 

Data raw_tbl;
input ID group $  group2 $ Y;
cards;
1 a One 5
2 a One 10
3 a Two 15
4 a Two 20
5 b One 25
6 b Two 30
7 b Two 35
8 c Two 40
9 c Two 45
10 c One 50
;
run;


data raw_tbl2;
set raw_tbl;
group_CAT=CATX(',',group,group2);
Run;

proc sort data=raw_tbl2;by group_CAT;Run;
data required;
set raw_tbl2;
by group_CAT;
retain Accum_Y;
if first.group_CAT  then  Accum_Y=Y;
Accum_Y+y;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @Ronein 

 

If you use the variable group_cat, you just need to reinitialize the counter to 0 each time there is a new Group_cat value:

 

data required;
set raw_tbl2;
by group_CAT;
if first.group_CAT  then  Accum_Y=0;
Accum_Y+y;
run;

You can simplify your code as follow: 

 

 

Data raw_tbl;
	input ID group $  group2 $ Y;
	cards;
1 a One 5
2 a One 10
3 a Two 15
4 a Two 20
5 b One 25
6 b Two 30
7 b Two 35
8 c Two 40
9 c Two 45
10 c One 50
;
run;

proc sort data=raw_tbl;
	by group group2;
Run;

data required;
	set raw_tbl;
	by group group2;
	if first.group2 then Accum_Y=0;
	Accum_Y+y;
run;

 

 

The rational is that when you specify "BY" variables, SAS creates the following internal corresponding variables : Last. and First.

First.<var> = 1 for the first record and 0 otherwise.

In the above example, the first and last values would be as follow. So you just need to reinitialize the counter when First.group2 = 0.

 

ID	group   group2  Y	First.group First.group2
1	a		One 	 5		1		1
2	a		One 	10		0		0
3	a 		Two 	15		0		1
4	a		Two 	20		0		0
5	b		One 	25		1		1
6	b		Two		30		0		1
7	b		Two 	35		0		0
8 	c		Two 	40		1		1
9	c		Two		45		0		0
10	c		One 	50		0		1

 

View solution in original post

3 REPLIES 3
PaigeMiller
Diamond | Level 26

What wrong results is it giving?

 

Yes you can have two different FIRST. in the same data step

 

proc sort data = raw_tbl;
    by group group2;
run;

data want;
    set raw_tbl;
    by group group2;
    if first.group2 and first.group; 
    /* Although I don't think the above is what you want, I think you want */
    /* if first.group2; */
    /* but that's up to you */
...
run;
--
Paige Miller
ed_sas_member
Meteorite | Level 14

Hi @Ronein 

 

If you use the variable group_cat, you just need to reinitialize the counter to 0 each time there is a new Group_cat value:

 

data required;
set raw_tbl2;
by group_CAT;
if first.group_CAT  then  Accum_Y=0;
Accum_Y+y;
run;

You can simplify your code as follow: 

 

 

Data raw_tbl;
	input ID group $  group2 $ Y;
	cards;
1 a One 5
2 a One 10
3 a Two 15
4 a Two 20
5 b One 25
6 b Two 30
7 b Two 35
8 c Two 40
9 c Two 45
10 c One 50
;
run;

proc sort data=raw_tbl;
	by group group2;
Run;

data required;
	set raw_tbl;
	by group group2;
	if first.group2 then Accum_Y=0;
	Accum_Y+y;
run;

 

 

The rational is that when you specify "BY" variables, SAS creates the following internal corresponding variables : Last. and First.

First.<var> = 1 for the first record and 0 otherwise.

In the above example, the first and last values would be as follow. So you just need to reinitialize the counter when First.group2 = 0.

 

ID	group   group2  Y	First.group First.group2
1	a		One 	 5		1		1
2	a		One 	10		0		0
3	a 		Two 	15		0		1
4	a		Two 	20		0		0
5	b		One 	25		1		1
6	b		Two		30		0		1
7	b		Two 	35		0		0
8 	c		Two 	40		1		1
9	c		Two		45		0		0
10	c		One 	50		0		1

 

s_lassen
Meteorite | Level 14

As others have already shown, you can have more than one variable in a BY statement.

 

And I think most of the suggestions you got solved the calculation problem you had. But to answer you question:

1-Why my code is giving wrong results?

I think you are missing an ELSE. Your summation code is

if first.group_CAT  then  Accum_Y=Y;
Accum_Y+y;

Which means that in the first OBS in the group, the accumulated value will be Y+Y, and that error will persist throughout the result data.

Just change it to

if first.group_CAT  then  Accum_Y=Y;
else Accum_Y+y;

and I think you will get the right result.

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1543 views
  • 2 likes
  • 4 in conversation