Hi everyone,
I'm trying to sum based on a condition. Data below:
| cust_ID | is_loan | balance |
| 500 | 1 | $7,465 |
| 500 | 0 | $658 |
| 500 | 1 | $2,487 |
| 500 | 0 | $7,824 |
| 600 | 0 | $232,564 |
| 600 | 0 | $224 |
| 600 | 1 | $21 |
| 600 | 1 | $784 |
I want to do like a
data example;
set example_data;
if is_loan = 1 then sum(balance) output = total_loans;
if is_loan = 0 then sum(balance) output = total_account;
run;
So the output looks like this below:
| cust_ID | is_loan | balance | total_loans | total_account |
| 500 | 1 | $7,465 | $ 9,952.00 | $ 8,482.00 |
| 500 | 0 | $658 | $ 9,952.00 | $ 8,482.00 |
| 500 | 1 | $2,487 | $ 9,952.00 | $ 8,482.00 |
| 500 | 0 | $7,824 | $ 9,952.00 | $ 8,482.00 |
| 600 | 0 | $232,564 | $ 805.00 | $ 232,788.00 |
| 600 | 0 | $224 | $ 805.00 | $ 232,788.00 |
| 600 | 1 | $21 | $ 805.00 | $ 232,788.00 |
| 600 | 1 | $784 | $ 805.00 | $ 232,788.00 |
So it is suming based on a condition and summing by customer_ID.
Any help is appreciated.
Thanks
Its not really clear what you want here, also post test data in the form of a datastep so we have something to work with. At a guess:
proc sql;
create table WANT as
select A.CUST_ID,
A.IS_LOAN,
A.BALANCE,
B.TOTAL_LOANS,
C.TOTAL_ACCOUNT
from HAVE A
left join (select CUST_ID,sum(case when IS_LOAN=1 then BALANCE else . end) as TOTAL_LOANS) B
on A.CUST_ID=B.CUST_ID
left join (select CUST_ID,sum(case when IS_LOAN=0 then BALANCE else . end) as TOTAL_ACCOUNT) C
on A.CUST_ID=C.CUST_ID;
quit;
While I think the RW9 solution would work, my SQL isn't strong enough to be sure. So here's a DATA step approach:
data want;
total_loans=0;
total_account=0;
do until (last.cust_ID);
set have;
by cust_ID;
if is_loan=0 then total_account + balance;
else total_loans + balance;
run;
do until (last.cust_ID);
set have;
by cust_ID;
output;
end;
run;
It's optional whether you should set the total variables to 0 or missing at the start ... use whichever you would like to appear in your data set if there aren't any such observations.
Just a bit shorter SQL that might do what you want 🙂
proc sql;
select cust_id
,is_loan, balance
,sum(case is_loan when 1 then balance end) as total_loans
,sum(case is_loan when 0 then balance end) as total_account
from example_data
group by cust_id
;
quit;
You can definitely conditionally execute a SUM statement. A SUM statement takes the form
variable + expression;
Here is code to sum the loans, non_loans and everything into separate running totals.
data want ;
set have ;
if is_loan then total_loans+balance;
else total_non_loans+balance;
total_amount+balance;
format total: dollar13. ;
run;
But if you want to have the total BY id then you do not really need (or want) a SUM statement since then you would need to reset the sum when you started and new group. A double DOW works well for calculating over a group and carrying the vlaue onto all rows.
data want ;
do until(last.cust_id);
set have ;
by cust_id;
if is_loan then total_loans=sum(total_loans,balance,0);
else total_non_loans=sum(total_non_loans,balance,0);
end;
total_amount=sum(total_loans,total_non_loans);
do until(last.cust_id);
set have ;
by cust_id;
output;
end;
format total: dollar13. ;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.