BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
twenty7
Obsidian | Level 7

Hi all,

 

I have a dataset that currently looks like this

 

acc_idcurrever
101
211
211
301
311

 

and I need to add a couple of additional columns to calculate a running total for each id - sum_curr and sum_ever

 

acc_idcurreversum_currsum_ever
11111
21111
21122
30101
31112

 

I'm struggling to work out how this would be done... any help would be greatly appreciated

 

thanks 🙂

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Assuming that your "want" of id=1 and the value of Curr shown is a typo (changed from 0 to 1) this should work:

data have;
   input acc_id	curr	ever;
datalines;
1	0	1
2	1	1
2	1	1
3	0	1
3	1	1
;

data want;
   set have;
   by acc_id;
   retain sum_curr sum_ever;
   if first.acc_id then do;
      sum_curr=curr;
      sum_ever=ever;
   end;
   else do;
      sum_curr=sum(sum_curr, curr);
      sum_ever=Sum(sum_ever, ever);

   end;
run;

Please note the DATA step code to provide example data.

The Want step works with two major bits. The BY statement creates automatic variables Last. and First. that indicate whether the current record is the first or last of a by group. These automatic variables are numeric 1/0 for True/False and can be used to conditionally execute code. In this case to conditionally reset the accumulator variables.

The RETAIN statement creates variables that will hold the value across iterations of the data step.

I am using the SUM function in case your Curr or Ever variables ever have missing values but you want to accumulate across them. The + operator will return missing if any of the values are missing and would result in missing values for the total.

 

If your data is not actually sorted by ID but is grouped by the Id variable add NOTSORTED to the BY statement so the expected sort order of a BY statement is not violated causing a data error.

View solution in original post

2 REPLIES 2
ballardw
Super User

Assuming that your "want" of id=1 and the value of Curr shown is a typo (changed from 0 to 1) this should work:

data have;
   input acc_id	curr	ever;
datalines;
1	0	1
2	1	1
2	1	1
3	0	1
3	1	1
;

data want;
   set have;
   by acc_id;
   retain sum_curr sum_ever;
   if first.acc_id then do;
      sum_curr=curr;
      sum_ever=ever;
   end;
   else do;
      sum_curr=sum(sum_curr, curr);
      sum_ever=Sum(sum_ever, ever);

   end;
run;

Please note the DATA step code to provide example data.

The Want step works with two major bits. The BY statement creates automatic variables Last. and First. that indicate whether the current record is the first or last of a by group. These automatic variables are numeric 1/0 for True/False and can be used to conditionally execute code. In this case to conditionally reset the accumulator variables.

The RETAIN statement creates variables that will hold the value across iterations of the data step.

I am using the SUM function in case your Curr or Ever variables ever have missing values but you want to accumulate across them. The + operator will return missing if any of the values are missing and would result in missing values for the total.

 

If your data is not actually sorted by ID but is grouped by the Id variable add NOTSORTED to the BY statement so the expected sort order of a BY statement is not violated causing a data error.

twenty7
Obsidian | Level 7

Thank you!!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 466 views
  • 0 likes
  • 2 in conversation