DATA Step, Macro, Functions and more

Total Sum vertically

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 86
Accepted Solution

Total Sum vertically

Hi

for one of my problem, I would need to have total sum of the account

for example, I have a dataset like this:

ID      Bill

1        25

1         35

1        25

2      11

2        23

4        2

4        1

4         3

Can we have BIll_sum, a third variable totalling by ID like this:

ID      Bill      Bill_sum

1        25         85

1         35         85

1        25          85

2      11             34

2        23           34

4        2             6

4        1             6

4         3           6

Thanks  a lot


Accepted Solutions
Solution
‎01-28-2013 05:47 AM
Trusted Advisor
Posts: 1,128

Re: Total Sum vertically

Hi,

try this

data have;

    input iD  Bill;

cards;

1        25

1        35

1        25

2        11

2        23

4        2

4        1

4        3

;

run;

proc sort data=have;

    by id;

run;

data want;

    set have;

    by id;

    retain sum 0 sum_;

    if first.id then sum=bill;

    else sum+bill;

    if last.id then output;

run;

proc sort data=want(keep=id sum);

    by id;

run;

data want_;

    merge have want;

    by id;

run;

Thanks,

Jagadish

Thanks,
Jag

View solution in original post


All Replies
Solution
‎01-28-2013 05:47 AM
Trusted Advisor
Posts: 1,128

Re: Total Sum vertically

Hi,

try this

data have;

    input iD  Bill;

cards;

1        25

1        35

1        25

2        11

2        23

4        2

4        1

4        3

;

run;

proc sort data=have;

    by id;

run;

data want;

    set have;

    by id;

    retain sum 0 sum_;

    if first.id then sum=bill;

    else sum+bill;

    if last.id then output;

run;

proc sort data=want(keep=id sum);

    by id;

run;

data want_;

    merge have want;

    by id;

run;

Thanks,

Jagadish

Thanks,
Jag
Respected Advisor
Posts: 3,777

Re: Total Sum vertically

You don't need to sort data WANT, you did not change the sort order when you created it from sorted SAVE.

Trusted Advisor
Posts: 1,128

Re: Total Sum vertically

Thank you for the feedback.

You are perfectly right, we can avoid that step. I just mentioned it for a better understanding.

Thanks,

Jagadish

Thanks,
Jag
Super User
Posts: 5,256

Re: Total Sum vertically

Using SQL, it can be solved in one step:

Proc sql;

     create table want as

          select id, bill, sum(bill) as bill_sum

               from have

               group by id

     ;

quit;

Data never sleeps
Regular Contributor
Posts: 151

Re: Total Sum vertically

or proc sql can do this simply...

proc sql;

create table want as select

*, sum(Bill) as Bill_sum

from have

group by id;

quit;

Frequent Contributor
Posts: 86

Re: Total Sum vertically

Hi All

Yes, there are options like Proc sql and group by  and also there are possibilities like calculating and merging back.

What I was looking for a possibility to have summation in a datastep something like advance lookup or arrays possibilites

Thanks

Super User
Posts: 5,256

Re: Total Sum vertically

???

What is the requirement for that?

Anyway, if data is already sorted in id order, use retain together with first. and last. processing.

Data never sleeps
Frequent Contributor
Posts: 86

Re: Total Sum vertically

Hi , the problem is I have a huge dataset with more than a 100 million rows. calculating and merging back would not be very

efficient I think. Second option is proc sql. I have realized, it can be done with proc sql but what if I would like to sum with multiple criteria like I want to sum only positive observations of any column. that would filter the output and again, I need to merge back.

Frequent Learner
Posts: 1

Re: Total Sum vertically

Hi ,

you can try this one also as it will also help you to get 2nd highest sum by category also:


DATA TEST;
INPUT ID SAL;
CARDS;
1 200
2 300
3 400
1 600
2 800
3 600
1 800
2 600
3 500
;
RUN;


PROC SORT DATA =TEST ;
BY ID;
RUN;

DATA TEST1;
SET TEST;
BY ID;
RETAIN TOT 0 CN 0 ;
IF FIRST.ID THEN TOT=SAL;
ELSE TOT=SUM(TOT,SAL);
IF FIRST.ID THEN CN=1;
ELSE CN=CN+1;
RUN;

PROC SORT DATA =TEST1 ;
BY ID;
RUN;

DATA TEST2;
SET TEST1 ;
BY ID ;
if last.id then output test2;
RENAME CN=CNT TOT=SUM_TOT;
RUN;

data test3(DROP=TOT CN);
MERGE test2 test1;
by id;
tot_avg=SUM_TOT/cnt;
run;

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 11127 views
  • 6 likes
  • 6 in conversation