Solved
Contributor
Posts: 21

# How do I create a cumulative total column subset by another variable?

Dear Forum,

Although conceptually easy to grasp, I have yet been able to devise a working piece of code that will create a cumulative total column that resets at each break in another variable. For example, given the following sample data set.

ID  AMT

10 150

10 100

25 150

25 150

25 150

30 600

30 300

The result I would like to achieve is a cumulative total column for each distinct ID value (as follows):

ID  AMT SumbyID

10 150 150

10 100 250

25 150 150

25 150 300

25 150 450

30 600 600

30 300 900

Thus far I have attempted various sum statement and do loop combinations using the last.<variable> as a marker,  however I would be interest to know if their is a better method using PROC SQL.

Sincerely,

PQK

Accepted Solutions
Solution
‎06-24-2012 12:33 AM
Super Contributor
Posts: 1,636

## Re: How do I create a cumulative total column subset by another variable?

data have;

input ID  AMT;

cards;

10 150

10 100

25 150

25 150

25 150

30 600

30 300

;

data want;

set have;

by id notsorted;

if first.id then sumbyid=0;

sumbyid+amt;

run;

proc print;run;

Obs    ID    AMT    sumbyid

1     10    150      150

2     10    100      250

3     25    150      150

4     25    150      300

5     25    150      450

6     30    600      600

7     30    300      900

Linlin

All Replies
Solution
‎06-24-2012 12:33 AM
Super Contributor
Posts: 1,636

## Re: How do I create a cumulative total column subset by another variable?

data have;

input ID  AMT;

cards;

10 150

10 100

25 150

25 150

25 150

30 600

30 300

;

data want;

set have;

by id notsorted;

if first.id then sumbyid=0;

sumbyid+amt;

run;

proc print;run;

Obs    ID    AMT    sumbyid

1     10    150      150

2     10    100      250

3     25    150      150

4     25    150      300

5     25    150      450

6     30    600      600

7     30    300      900

Linlin

Contributor
Posts: 21

## Re: How do I create a cumulative total column subset by another variable?

Linlin,

As I suspected, there was an easier way that I completely overlooked. Thank you for the tip!

I forgot to mention that I would need to create a cumulative total column subset by more than one variable. It slipped my mind at the time, but fortunately I resolved the problem by taking the two subsetting variables and concatenating them together to create a new temporary variable. After that, is was just a matter of using the same code example you suggested above.

Thanks again.

PQK

Valued Guide
Posts: 765

## Re: How do I create a cumulative total column subset by another variable?

hi ... if I understand what you mean by a "cumulative total column subset by more than one variable" you can try this too  ...

data x;

input a b c;

datalines;

1 1 10

1 1 10

1 1 10

1 2 20

1 3 30

1 3 30

2 1 100

2 1 100

2 2 50

2 2 60

;

run;

data y;

set x;

by a b;

cumab + (-first.b * cumab) + c;

run;

a    b     c     cumab

1    1     10      10

1    1     10      20

1    1     10      30

1    2     20      20

1    3     30      30

1    3     30      60

2    1    100     100

2    1    100     200

2    2     50      50

2    2     60     110

Occasional Contributor
Posts: 7

## Re: How do I create a cumulative total column subset by another variable?

Mike awesome code can you explain how this is interpreted by SAS:

cumab + (-first.b * cumab) + c;

Valued Guide
Posts: 765

## Re: How do I create a cumulative total column subset by another variable?

hi ... I added the value of FIRST.B to the output to help the explanation ...

Obs    a    b     c     cumab    first_b

1    1    1     10      10        1

2    1    1     10      20        0

3    1    1     10      30        0

4    1    2     20      20        1

5    1    3     30      30        1

6    1    3     30      60        0

7    2    1    100     100        1

8    2    1    100     200        0

9    2    2     50      50        1

10    2    2     60     110        0

values of the variable C are accumulated within values of the two variables A and B

when then first observation is read, the value of CUMAB is zero since that variable is an accumulator variable in a SUM statement ... so for the first observation, in this statement

cumab + (-first.b * cumab) + c

the value within the parentheses is zero and it just adds the value of variable C to CUMAB ... the value of FIRST.B is zero for all the remaining calculations within a group, so the value of the term within parentheses is once again zero ... you are just adding C to CUMAB

when you arrive at the first observation for the next group, the value of FIRST.B is 1 ... you add C to CUMAB, but you also subtract the LAST value of CUMAB from the total, so you are starting at zero again for the next group

that make sense ?

Occasional Contributor
Posts: 7

## Re: How do I create a cumulative total column subset by another variable?

Thanks Mike for explanation

Occasional Contributor
Posts: 5

## Re: How do I create a cumulative total column subset by another variable?

Hi linlin  I think without this syntex it's working, so we don't need this syntax.

if first.id then sumbyid=0;

Occasional Learner
Posts: 1

## Re: How do I create a cumulative total column subset by another variable?

Hi,

I have a similar scenario

data x;

input a b c;

datalines;

1 1 10

2 1 10

3 1 10

4 2 20

5 3 30

6 3 30

7 1 100

8 1 100

9 2 50

;

run;

I want the result as follows

a    b     c     cumab

1    1     10      10

2    1     10      20

3    1     10      30

4    2     20      20

5    3     30      30

6    3     30      60

7    1    100     100

8    1     20     120

9    2     50      50

The column "a" keeps the same order and I want the sum of the column "c" based on the values in the column "b".

Could you help my to calculate the column"cumab" as in the above table?

Thanks and Regards,

Retheesh Ravi.

Super User
Posts: 10,788

## Re: How do I create a cumulative total column subset by another variable?

```
data x;
input a b c;
datalines;
1 1 10
2 1 10
3 1 10
4 2 20
5 3 30
6 3 30
7 1 100
8 1 100
9 2 50
;
run;
data want;
set x;
by b notsorted;
if first.b then cumab=0;
cumab+c;
run;
```

Xia Keshan

Valued Guide
Posts: 765

## Re: How do I create a cumulative total column subset by another variable?

hi ... another way (same result) ...

data y;

set x;

by b notsorted;

cumab + (-first.b * cumab) + c;

run;

New Contributor
Posts: 3

## Re: How do I create a cumulative total column subset by another variable?

I am very new to SAS. Base SAS we can do with datastep. But how can we do it with Proc SQL??

Contributor
Posts: 30

## Re: How do I create a cumulative total column subset by another variable?

Thank you for the examples.  I am a fairly new SAS user and this was very timely.  Is it possible to take this one step further, where the cumulative total is a running total of the previous 5 observations, all under the same subsets?  Thank you.

New Contributor
Posts: 3

## Re: How do I create a cumulative total column subset by another variable?

Good Day every one,

I am new with SAS and i am having a problem in calculating the cumulative sum of values, my table looks like this one

I used this code to calculate the cumulative sum in column C, by respect to id

```data Ratings1;
Set Ratings;
retain c;
if first.id then c=0;
by id notsorted;
c+j1;
run;```

the problem i have is that it gives me as result the same values as in column j1 as if the addition is not working.

Can you help me please , where is the problem in this code

New Contributor
Posts: 3

## Re: How do I create a cumulative total column subset by another variable?

Hi Lamia,

This is quite interestting. To me it looks like as C variable is in orignal data set, so code that you have written is not working. Everytime row is getting read of rating dataset C is getting assigned 0. Here is how we can get around it.

data ratings1;
set ratings(drop=c);
retain c;
if first.id then c=0;
put first.id last.id;
by id notsorted;
c+j1;
run;

☑ This topic is solved.