Obsidian | Level 7

## Running Totals

Hi all,

I have a dataset that currently looks like this

 acc_id curr ever 1 0 1 2 1 1 2 1 1 3 0 1 3 1 1

and I need to add a couple of additional columns to calculate a running total for each id - sum_curr and sum_ever

 acc_id curr ever sum_curr sum_ever 1 1 1 1 1 2 1 1 1 1 2 1 1 2 2 3 0 1 0 1 3 1 1 1 2

I'm struggling to work out how this would be done... any help would be greatly appreciated

thanks 🙂

1 ACCEPTED SOLUTION

Accepted Solutions
Super User

## Re: Running Totals

Assuming that your "want" of id=1 and the value of Curr shown is a typo (changed from 0 to 1) this should work:

```data have;
input acc_id	curr	ever;
datalines;
1	0	1
2	1	1
2	1	1
3	0	1
3	1	1
;

data want;
set have;
by acc_id;
retain sum_curr sum_ever;
if first.acc_id then do;
sum_curr=curr;
sum_ever=ever;
end;
else do;
sum_curr=sum(sum_curr, curr);
sum_ever=Sum(sum_ever, ever);

end;
run;```

Please note the DATA step code to provide example data.

The Want step works with two major bits. The BY statement creates automatic variables Last. and First. that indicate whether the current record is the first or last of a by group. These automatic variables are numeric 1/0 for True/False and can be used to conditionally execute code. In this case to conditionally reset the accumulator variables.

The RETAIN statement creates variables that will hold the value across iterations of the data step.

I am using the SUM function in case your Curr or Ever variables ever have missing values but you want to accumulate across them. The + operator will return missing if any of the values are missing and would result in missing values for the total.

If your data is not actually sorted by ID but is grouped by the Id variable add NOTSORTED to the BY statement so the expected sort order of a BY statement is not violated causing a data error.

2 REPLIES 2
Super User

## Re: Running Totals

Assuming that your "want" of id=1 and the value of Curr shown is a typo (changed from 0 to 1) this should work:

```data have;
input acc_id	curr	ever;
datalines;
1	0	1
2	1	1
2	1	1
3	0	1
3	1	1
;

data want;
set have;
by acc_id;
retain sum_curr sum_ever;
if first.acc_id then do;
sum_curr=curr;
sum_ever=ever;
end;
else do;
sum_curr=sum(sum_curr, curr);
sum_ever=Sum(sum_ever, ever);

end;
run;```

Please note the DATA step code to provide example data.

The Want step works with two major bits. The BY statement creates automatic variables Last. and First. that indicate whether the current record is the first or last of a by group. These automatic variables are numeric 1/0 for True/False and can be used to conditionally execute code. In this case to conditionally reset the accumulator variables.

The RETAIN statement creates variables that will hold the value across iterations of the data step.

I am using the SUM function in case your Curr or Ever variables ever have missing values but you want to accumulate across them. The + operator will return missing if any of the values are missing and would result in missing values for the total.

If your data is not actually sorted by ID but is grouped by the Id variable add NOTSORTED to the BY statement so the expected sort order of a BY statement is not violated causing a data error.

Obsidian | Level 7

## Re: Running Totals

Thank you!!

Discussion stats
• 2 replies
• 361 views
• 0 likes
• 2 in conversation