I have data with the account number as ACCOUNT_NO, the transaction amount as AMT_TXN, the transaction
type/flag (credit or debit) as AMT_CRDR, and the transaction date as DATE_TXN,
as in the data step below.
data A;
input ACCOUNT_NO $ AMT_TXN AMT_CRDR $ DATE_TXN :date9.; /* AMT_TXN numeric; date9. reads the full 4-digit year */
format DATE_TXN date9.;
cards;
1 100 C 02APR2022
2 45 C 03APR2022
4 66 C 04APR2022
2 90 D 04APR2022
3 23 D 05APR2022
5 7 C 05APR2022
6 35 C 09APR2022
8 99 D 01MAY2022
8 100 D 22MAY2022
3 120 C 23MAY2022
1 200 C 25MAY2022
;
run;
Now I want to add a new column that holds the updated balance for each distinct account.
For example, take the first ACCOUNT_NO, which is 1.
-> First, the data needs to be sorted by ACCOUNT_NO.
-> Then I want to add a new column (CH_BAL), initialised to 0, that
keeps track of the updated balance in the account.
This column should follow the logic below:
If AMT_CRDR = 'C' then CH_BAL = CH_BAL + AMT_TXN;
if AMT_CRDR = 'D' then CH_BAL = CH_BAL - AMT_TXN;
(NOTE: ACCOUNT_NUMBERS ARE SORTED)
The above should apply as long as the account number stays the same. The moment the account number changes,
the balance should be reset to zero and the cycle starts again: add if it is a 'C', subtract if it is a 'D'.
SORTED DATA
1 100 C 02APR2022
1 200 C 25MAY2022
2 45 C 03APR2022
2 90 D 04APR2022
3 23 D 05APR2022
3 120 C 23MAY2022
4 66 C 04APR2022
5 7 C 05APR2022
6 35 C 09APR2022
8 99 D 01MAY2022
8 100 D 22MAY2022
The final result should look like below.
ACCOUNT_NO | AMT_TXN | AMT_CRDR | CH_BAL |
1 | 100 | C | 100 |
1 | 200 | C | 300 |
2 | 45 | C | 45 |
2 | 90 | D | -45 |
3 | 23 | D | -23 |
3 | 120 | C | 97 |
4 | 66 | C | 66 |
5 | 7 | C | 7 |
6 | 35 | C | 35 |
8 | 99 | D | -99 |
8 | 100 | D | -199 |
I was able to get this running balance when every row has the same account number, using
the code below.
data A;
input ACCOUNT_NO $ AMT_TXN AMT_CRDR $ DATE_TXN :date9.; /* AMT_TXN numeric; date9. reads the full 4-digit year */
format DATE_TXN date9.;
cards;
1 100 C 02APR2022
1 45 C 03APR2022
1 66 C 04APR2022
1 90 D 04APR2022
1 23 D 05APR2022
1 7 C 05APR2022
1 35 C 09APR2022
1 99 D 01MAY2022
1 100 D 22MAY2022
1 120 C 23MAY2022
1 200 C 25MAY2022
;
run;
data C;
set A;
by ACCOUNT_NO;
retain CH_BAL 0;
if AMT_CRDR EQ 'C' then CH_BAL = CH_BAL + AMT_TXN;
if AMT_CRDR EQ 'D' then CH_BAL = CH_BAL - AMT_TXN;
run;
The result of the above is as below screenshot.
Please help me approach this problem and build the logic to solve it.
Thanks in advance to all contributors. Please reply if anything is unclear (although I think I have described the problem clearly).
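You're looking for FIRST.<variable from the BY statement>, which is true on the first row of each BY group:
data C;
  set A;
  by ACCOUNT_NO;
  retain CH_BAL; /* keep the balance across rows */
  /* first row of a new account: clear the balance */
  if first.account_no then call missing(CH_BAL);
  /* sum() ignores missing values, so the first transaction starts from zero */
  if AMT_CRDR EQ 'C' then
    CH_BAL=sum(CH_BAL,AMT_TXN);
  if AMT_CRDR EQ 'D' then
    CH_BAL=sum(CH_BAL,-AMT_TXN);
run;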
That's fully documented.
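Note that the BY statement expects the input to already be sorted by ACCOUNT_NO, which the first sample data set in the question is not. A minimal sketch of the whole flow, assuming AMT_TXN is numeric and that transactions should be accumulated in date order within each account; the data set name A_SORTED is made up for illustration, and the sum statement is used here as an alternative to RETAIN plus sum():

```sas
/* Sort by account, then by date, so each account's rows are
   contiguous and in chronological order. A_SORTED is a
   hypothetical name chosen for this example. */
proc sort data=A out=A_SORTED;
  by ACCOUNT_NO DATE_TXN;
run;

data C;
  set A_SORTED;
  by ACCOUNT_NO;
  /* Reset the balance on the first row of each account. */
  if first.ACCOUNT_NO then CH_BAL = 0;
  /* Sum statement: the accumulator is retained automatically. */
  if AMT_CRDR = 'C' then CH_BAL + AMT_TXN;
  else if AMT_CRDR = 'D' then CH_BAL + (-AMT_TXN);
run;
```

If only the closing balance per account is needed, the same step could keep just the rows where last.ACCOUNT_NO is true.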