BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PQK
Calcite | Level 5 PQK
Calcite | Level 5

Dear Forum,

Although conceptually easy to grasp, I have yet been able to devise a working piece of code that will create a cumulative total column that resets at each break in another variable. For example, given the following sample data set.

ID  AMT

10 150

10 100

25 150

25 150

25 150

30 600

30 300

The result I would like to achieve is a cumulative total column for each distinct ID value (as follows):

ID  AMT SumbyID

10 150 150

10 100 250

25 150 150

25 150 300

25 150 450

30 600 600

30 300 900

Thus far I have attempted various sum statement and do loop combinations using the last.<variable> as a marker,  however I would be interest to know if their is a better method using PROC SQL.

Thank you in advance for your help.

Sincerely,

PQK

1 ACCEPTED SOLUTION

Accepted Solutions
Linlin
Lapis Lazuli | Level 10

how about:

data have;

input ID  AMT;

cards;

10 150

10 100

25 150

25 150

25 150

30 600

30 300

;

data want;

  set have;

  by id notsorted;

  if first.id then sumbyid=0;

    sumbyid+amt;

run;

proc print;run;

                            Obs    ID    AMT    sumbyid

                             1     10    150      150

                             2     10    100      250

                             3     25    150      150

                             4     25    150      300

                             5     25    150      450

                             6     30    600      600

                             7     30    300      900

Linlin

View solution in original post

17 REPLIES 17
Linlin
Lapis Lazuli | Level 10

how about:

data have;

input ID  AMT;

cards;

10 150

10 100

25 150

25 150

25 150

30 600

30 300

;

data want;

  set have;

  by id notsorted;

  if first.id then sumbyid=0;

    sumbyid+amt;

run;

proc print;run;

                            Obs    ID    AMT    sumbyid

                             1     10    150      150

                             2     10    100      250

                             3     25    150      150

                             4     25    150      300

                             5     25    150      450

                             6     30    600      600

                             7     30    300      900

Linlin

PQK
Calcite | Level 5 PQK
Calcite | Level 5

Linlin,

As I suspected, there was an easier way that I completely overlooked. Thank you for the tip!Smiley Happy

I forgot to mention that I would need to create a cumulative total column subset by more than one variable. It slipped my mind at the time, but fortunately I resolved the problem by taking the two subsetting variables and concatenating them together to create a new temporary variable. After that, is was just a matter of using the same code example you suggested above.

Thanks again.

PQK

MikeZdeb
Rhodochrosite | Level 12

hi ... if I understand what you mean by a "cumulative total column subset by more than one variable" you can try this too  ...

data x;

input a b c;

datalines;

1 1 10

1 1 10

1 1 10

1 2 20

1 3 30

1 3 30

2 1 100

2 1 100

2 2 50

2 2 60

;

run;

data y;

set x;

by a b;

cumab + (-first.b * cumab) + c;

run;


a    b     c     cumab

1    1     10      10

1    1     10      20

1    1     10      30

1    2     20      20

1    3     30      30

1    3     30      60

2    1    100     100

2    1    100     200

2    2     50      50

2    2     60     110

jimbobob
Quartz | Level 8

Mike awesome code can you explain how this is interpreted by SAS:

cumab + (-first.b * cumab) + c;

MikeZdeb
Rhodochrosite | Level 12

hi ... I added the value of FIRST.B to the output to help the explanation ...

Obs    a    b     c     cumab    first_b

  1    1    1     10      10        1

  2    1    1     10      20        0

  3    1    1     10      30        0

  4    1    2     20      20        1

  5    1    3     30      30        1

  6    1    3     30      60        0

  7    2    1    100     100        1

  8    2    1    100     200        0

  9    2    2     50      50        1

10    2    2     60     110        0

values of the variable C are accumulated within values of the two variables A and B

when then first observation is read, the value of CUMAB is zero since that variable is an accumulator variable in a SUM statement ... so for the first observation, in this statement

cumab + (-first.b * cumab) + c

the value within the parentheses is zero and it just adds the value of variable C to CUMAB ... the value of FIRST.B is zero for all the remaining calculations within a group, so the value of the term within parentheses is once again zero ... you are just adding C to CUMAB

when you arrive at the first observation for the next group, the value of FIRST.B is 1 ... you add C to CUMAB, but you also subtract the LAST value of CUMAB from the total, so you are starting at zero again for the next group

that make sense ?

jimbobob
Quartz | Level 8

Thanks Mike for explanation Smiley Happy

sin_golbarg
Fluorite | Level 6

Hi linlin  I think without this syntex it's working, so we don't need this syntax.

if first.id then sumbyid=0;

retheesh_analytic_gmail_com
Calcite | Level 5

Hi,


I have a similar scenario


data x;

input a b c;

datalines;

1 1 10

2 1 10

3 1 10

4 2 20

5 3 30

6 3 30

7 1 100

8 1 100

9 2 50

;

run;

I want the result as follows

a    b     c     cumab

1    1     10      10

2    1     10      20

3    1     10      30

4    2     20      20

5    3     30      30

6    3     30      60

7    1    100     100

8    1     20     120

9    2     50      50

The column "a" keeps the same order and I want the sum of the column "c" based on the values in the column "b".

Could you help my to calculate the column"cumab" as in the above table?

Thanks and Regards,

Retheesh Ravi.

Ksharp
Super User

data x;
input a b c;
datalines;
1 1 10
2 1 10
3 1 10
4 2 20
5 3 30
6 3 30
7 1 100
8 1 100
9 2 50
;
run;
data want;
 set x;
 by b notsorted;
 if first.b then cumab=0;
 cumab+c;
run;

Xia Keshan

MikeZdeb
Rhodochrosite | Level 12

hi ... another way (same result) ...

data y;

set x;

by b notsorted;

cumab + (-first.b * cumab) + c;

run;

vijaylaxmi
Fluorite | Level 6

I am very new to SAS. Base SAS we can do with datastep. But how can we do it with Proc SQL??

BLarson
Obsidian | Level 7

Thank you for the examples.  I am a fairly new SAS user and this was very timely.  Is it possible to take this one step further, where the cumulative total is a running total of the previous 5 observations, all under the same subsets?  Thank you.

Lamia
Fluorite | Level 6

Good Day every one, 

I am new with SAS and i am having a problem in calculating the cumulative sum of values, my table looks like this one

 Untitled.png

 

I used this code to calculate the cumulative sum in column C, by respect to id

data Ratings1;
Set Ratings;
retain c;
if first.id then c=0;
by id notsorted;
c+j1;
run;

the problem i have is that it gives me as result the same values as in column j1 as if the addition is not working.

Can you help me please , where is the problem in this code

 

thank you in advance

 

vijaylaxmi
Fluorite | Level 6

Hi Lamia,

 

This is quite interestting. To me it looks like as C variable is in orignal data set, so code that you have written is not working. Everytime row is getting read of rating dataset C is getting assigned 0. Here is how we can get around it.

 

data ratings1;
set ratings(drop=c);
retain c;
if first.id then c=0;
put first.id last.id;
by id notsorted;
c+j1;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 17 replies
  • 95613 views
  • 6 likes
  • 10 in conversation