BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bnarang
Calcite | Level 5

Hi

for one of my problem, I would need to have total sum of the account

for example, I have a dataset like this:

ID      Bill

1        25

1         35

1        25

2      11

2        23

4        2

4        1

4         3

Can we have BIll_sum, a third variable totalling by ID like this:

ID      Bill      Bill_sum

1        25         85

1         35         85

1        25          85

2      11             34

2        23           34

4        2             6

4        1             6

4         3           6

Thanks  a lot

1 ACCEPTED SOLUTION

Accepted Solutions
Jagadishkatam
Amethyst | Level 16

Hi,

try this

data have;

    input iD  Bill;

cards;

1        25

1        35

1        25

2        11

2        23

4        2

4        1

4        3

;

run;

proc sort data=have;

    by id;

run;

data want;

    set have;

    by id;

    retain sum 0 sum_;

    if first.id then sum=bill;

    else sum+bill;

    if last.id then output;

run;

proc sort data=want(keep=id sum);

    by id;

run;

data want_;

    merge have want;

    by id;

run;

Thanks,

Jagadish

Thanks,
Jag

View solution in original post

9 REPLIES 9
Jagadishkatam
Amethyst | Level 16

Hi,

try this

data have;

    input iD  Bill;

cards;

1        25

1        35

1        25

2        11

2        23

4        2

4        1

4        3

;

run;

proc sort data=have;

    by id;

run;

data want;

    set have;

    by id;

    retain sum 0 sum_;

    if first.id then sum=bill;

    else sum+bill;

    if last.id then output;

run;

proc sort data=want(keep=id sum);

    by id;

run;

data want_;

    merge have want;

    by id;

run;

Thanks,

Jagadish

Thanks,
Jag
data_null__
Jade | Level 19

You don't need to sort data WANT, you did not change the sort order when you created it from sorted SAVE.

Jagadishkatam
Amethyst | Level 16

Thank you for the feedback.

You are perfectly right, we can avoid that step. I just mentioned it for a better understanding.

Thanks,

Jagadish

Thanks,
Jag
LinusH
Tourmaline | Level 20

Using SQL, it can be solved in one step:

Proc sql;

     create table want as

          select id, bill, sum(bill) as bill_sum

               from have

               group by id

     ;

quit;

Data never sleeps
Keith
Obsidian | Level 7

or proc sql can do this simply...

proc sql;

create table want as select

*, sum(Bill) as Bill_sum

from have

group by id;

quit;

bnarang
Calcite | Level 5

Hi All

Yes, there are options like Proc sql and group by  and also there are possibilities like calculating and merging back.

What I was looking for a possibility to have summation in a datastep something like advance lookup or arrays possibilites

Thanks

LinusH
Tourmaline | Level 20

???

What is the requirement for that?

Anyway, if data is already sorted in id order, use retain together with first. and last. processing.

Data never sleeps
bnarang
Calcite | Level 5

Hi , the problem is I have a huge dataset with more than a 100 million rows. calculating and merging back would not be very

efficient I think. Second option is proc sql. I have realized, it can be done with proc sql but what if I would like to sum with multiple criteria like I want to sum only positive observations of any column. that would filter the output and again, I need to merge back.

CHANDU_2843
Calcite | Level 5

Hi ,

you can try this one also as it will also help you to get 2nd highest sum by category also:


DATA TEST;
INPUT ID SAL;
CARDS;
1 200
2 300
3 400
1 600
2 800
3 600
1 800
2 600
3 500
;
RUN;


PROC SORT DATA =TEST ;
BY ID;
RUN;

DATA TEST1;
SET TEST;
BY ID;
RETAIN TOT 0 CN 0 ;
IF FIRST.ID THEN TOT=SAL;
ELSE TOT=SUM(TOT,SAL);
IF FIRST.ID THEN CN=1;
ELSE CN=CN+1;
RUN;

PROC SORT DATA =TEST1 ;
BY ID;
RUN;

DATA TEST2;
SET TEST1 ;
BY ID ;
if last.id then output test2;
RENAME CN=CNT TOT=SUM_TOT;
RUN;

data test3(DROP=TOT CN);
MERGE test2 test1;
by id;
tot_avg=SUM_TOT/cnt;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 48447 views
  • 7 likes
  • 6 in conversation