Hi everyone,
I'm trying to sum based on a condition. Data below:
cust_ID | is_loan | balance |
500 | 1 | $7,465 |
500 | 0 | $658 |
500 | 1 | $2,487 |
500 | 0 | $7,824 |
600 | 0 | $232,564 |
600 | 0 | $224 |
600 | 1 | $21 |
600 | 1 | $784 |
I want to do like a
data example;
set example_data;
if is_loan = 1 then sum(balance) output = total_loans;
if is_loan = 0 then sum(balance) output = total_account;
run;
So the output looks like this below:
cust_ID | is_loan | balance | total_loans | total_account |
500 | 1 | $7,465 | $ 9,952.00 | $ 8,482.00 |
500 | 0 | $658 | $ 9,952.00 | $ 8,482.00 |
500 | 1 | $2,487 | $ 9,952.00 | $ 8,482.00 |
500 | 0 | $7,824 | $ 9,952.00 | $ 8,482.00 |
600 | 0 | $232,564 | $ 805.00 | $ 232,788.00 |
600 | 0 | $224 | $ 805.00 | $ 232,788.00 |
600 | 1 | $21 | $ 805.00 | $ 232,788.00 |
600 | 1 | $784 | $ 805.00 | $ 232,788.00 |
So it is suming based on a condition and summing by customer_ID.
Any help is appreciated.
Thanks
Its not really clear what you want here, also post test data in the form of a datastep so we have something to work with. At a guess:
proc sql; create table WANT as select A.CUST_ID, A.IS_LOAN, A.BALANCE, B.TOTAL_LOANS, C.TOTAL_ACCOUNT from HAVE A left join (select CUST_ID,sum(case when IS_LOAN=1 then BALANCE else . end) as TOTAL_LOANS) B on A.CUST_ID=B.CUST_ID left join (select CUST_ID,sum(case when IS_LOAN=0 then BALANCE else . end) as TOTAL_ACCOUNT) C on A.CUST_ID=C.CUST_ID; quit;
While I think the RW9 solution would work, my SQL isn't strong enough to be sure. So here's a DATA step approach:
data want;
total_loans=0;
total_account=0;
do until (last.cust_ID);
set have;
by cust_ID;
if is_loan=0 then total_account + balance;
else total_loans + balance;
run;
do until (last.cust_ID);
set have;
by cust_ID;
output;
end;
run;
It's optional whether you should set the total variables to 0 or missing at the start ... use whichever you would like to appear in your data set if there aren't any such observations.
Just a bit shorter SQL that might do what you want 🙂
proc sql;
select cust_id
,is_loan, balance
,sum(case is_loan when 1 then balance end) as total_loans
,sum(case is_loan when 0 then balance end) as total_account
from example_data
group by cust_id
;
quit;
You can definitely conditionally execute a SUM statement. A SUM statement takes the form
variable + expression;
Here is code to sum the loans, non_loans and everything into separate running totals.
data want ;
set have ;
if is_loan then total_loans+balance;
else total_non_loans+balance;
total_amount+balance;
format total: dollar13. ;
run;
But if you want to have the total BY id then you do not really need (or want) a SUM statement since then you would need to reset the sum when you started and new group. A double DOW works well for calculating over a group and carrying the vlaue onto all rows.
data want ;
do until(last.cust_id);
set have ;
by cust_id;
if is_loan then total_loans=sum(total_loans,balance,0);
else total_non_loans=sum(total_non_loans,balance,0);
end;
total_amount=sum(total_loans,total_non_loans);
do until(last.cust_id);
set have ;
by cust_id;
output;
end;
format total: dollar13. ;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.