BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ronein
Meteorite | Level 14

Hello

I want to accumulate by group .

The group is  more than 1 grouping variable.

I want to ask 2 questions please:

1-Why my code is giving wrong results?

2-How can create the calculation by using  multiple FIRST. statements 

 

Data raw_tbl;
input ID group $  group2 $ Y;
cards;
1 a One 5
2 a One 10
3 a Two 15
4 a Two 20
5 b One 25
6 b Two 30
7 b Two 35
8 c Two 40
9 c Two 45
10 c One 50
;
run;


data raw_tbl2;
set raw_tbl;
group_CAT=CATX(',',group,group2);
Run;

proc sort data=raw_tbl2;by group_CAT;Run;
data required;
set raw_tbl2;
by group_CAT;
retain Accum_Y;
if first.group_CAT  then  Accum_Y=Y;
Accum_Y+y;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @Ronein 

 

If you use the variable group_cat, you just need to reinitialize the counter to 0 each time there is a new Group_cat value:

 

data required;
set raw_tbl2;
by group_CAT;
if first.group_CAT  then  Accum_Y=0;
Accum_Y+y;
run;

You can simplify your code as follow: 

 

 

Data raw_tbl;
	input ID group $  group2 $ Y;
	cards;
1 a One 5
2 a One 10
3 a Two 15
4 a Two 20
5 b One 25
6 b Two 30
7 b Two 35
8 c Two 40
9 c Two 45
10 c One 50
;
run;

proc sort data=raw_tbl;
	by group group2;
Run;

data required;
	set raw_tbl;
	by group group2;
	if first.group2 then Accum_Y=0;
	Accum_Y+y;
run;

 

 

The rational is that when you specify "BY" variables, SAS creates the following internal corresponding variables : Last. and First.

First.<var> = 1 for the first record and 0 otherwise.

In the above example, the first and last values would be as follow. So you just need to reinitialize the counter when First.group2 = 0.

 

ID	group   group2  Y	First.group First.group2
1	a		One 	 5		1		1
2	a		One 	10		0		0
3	a 		Two 	15		0		1
4	a		Two 	20		0		0
5	b		One 	25		1		1
6	b		Two		30		0		1
7	b		Two 	35		0		0
8 	c		Two 	40		1		1
9	c		Two		45		0		0
10	c		One 	50		0		1

 

View solution in original post

3 REPLIES 3
PaigeMiller
Diamond | Level 26

What wrong results is it giving?

 

Yes you can have two different FIRST. in the same data step

 

proc sort data = raw_tbl;
    by group group2;
run;

data want;
    set raw_tbl;
    by group group2;
    if first.group2 and first.group; 
    /* Although I don't think the above is what you want, I think you want */
    /* if first.group2; */
    /* but that's up to you */
...
run;
--
Paige Miller
ed_sas_member
Meteorite | Level 14

Hi @Ronein 

 

If you use the variable group_cat, you just need to reinitialize the counter to 0 each time there is a new Group_cat value:

 

data required;
set raw_tbl2;
by group_CAT;
if first.group_CAT  then  Accum_Y=0;
Accum_Y+y;
run;

You can simplify your code as follow: 

 

 

Data raw_tbl;
	input ID group $  group2 $ Y;
	cards;
1 a One 5
2 a One 10
3 a Two 15
4 a Two 20
5 b One 25
6 b Two 30
7 b Two 35
8 c Two 40
9 c Two 45
10 c One 50
;
run;

proc sort data=raw_tbl;
	by group group2;
Run;

data required;
	set raw_tbl;
	by group group2;
	if first.group2 then Accum_Y=0;
	Accum_Y+y;
run;

 

 

The rational is that when you specify "BY" variables, SAS creates the following internal corresponding variables : Last. and First.

First.<var> = 1 for the first record and 0 otherwise.

In the above example, the first and last values would be as follow. So you just need to reinitialize the counter when First.group2 = 0.

 

ID	group   group2  Y	First.group First.group2
1	a		One 	 5		1		1
2	a		One 	10		0		0
3	a 		Two 	15		0		1
4	a		Two 	20		0		0
5	b		One 	25		1		1
6	b		Two		30		0		1
7	b		Two 	35		0		0
8 	c		Two 	40		1		1
9	c		Two		45		0		0
10	c		One 	50		0		1

 

s_lassen
Meteorite | Level 14

As others have already shown, you can have more than one variable in a BY statement.

 

And I think most of the suggestions you got solved the calculation problem you had. But to answer you question:

1-Why my code is giving wrong results?

I think you are missing an ELSE. Your summation code is

if first.group_CAT  then  Accum_Y=Y;
Accum_Y+y;

Which means that in the first OBS in the group, the accumulated value will be Y+Y, and that error will persist throughout the result data.

Just change it to

if first.group_CAT  then  Accum_Y=Y;
else Accum_Y+y;

and I think you will get the right result.

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1151 views
  • 2 likes
  • 4 in conversation