BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
mvalsamis
Obsidian | Level 7

Good evening from England! I am using DI Studio and I am facing the following issue. I have a table named TRANSACTIONS that shows all the transactions of several bank accounts within 2021. For simplicity, let's assume I have only one account and its transactions throughout January 2021. Assume also that the account holder performed transactions only on 3rd, 10th, 17th and 22nd of January. The TRANSACTIONS table is the following:

date account_n start_balance trans_cr trans_dr
01/01/2021 123456 300 . .
02/01/2021 123456 . . .
03/01/2021 123456 . 200 90 
04/01/2021 123456 . . .
05/01/2021 123456 . . .
06/01/2021 123456 . . .
07/01/2021 123456 . . .
08/01/2021 123456 . . .
09/01/2021 123456 . . .
10/01/2021 123456 . 30 10
11/01/2021 123456 . . .
12/01/2021 123456 . . .
13/01/2021 123456 . . .
14/01/2021 123456 . . .
15/01/2021 123456 . . .
16/01/2021 123456 . . .
17/01/2021 123456 . 30 70
18/01/2021 123456 . . .
19/01/2021 123456 . . .
20/01/2021 123456 . . .
21/01/2021 123456 . . .
22/01/2021 123456 . 20 10
23/01/2021 123456 . . .
24/01/2021 123456 . . .
25/01/2021 123456 . . .
26/01/2021 123456 . . .
27/01/2021 123456 . . .
28/01/2021 123456 . . .
29/01/2021 123456 . . .
30/01/2021 123456 . . .
31/01/2021 123456 . . .

 

Where date: the date of transaction in DDMMYY10. format

              account_n: the account number

              start_balance: the starting balance in the beginning of the year, i.e. 01/01/2021

              trans_cr: the total credit transactions of the day

              trans_dr: the total debit transactions of the day

 

From the TRANSACTIONS table above I want to create the following table, named CALC_BALS:

date  account_n endday_balance trans_cr trans_dr
01/01/2021 123456 300 0 0
02/01/2021 123456 300 0 0
03/01/2021 123456 410 200 90
04/01/2021 123456 410 0 0
05/01/2021 123456 410 0 0
06/01/2021 123456 410 0 0
07/01/2021 123456 410 0 0
08/01/2021 123456 410 0 0
09/01/2021 123456 410 0 0
10/01/2021 123456 430 30 10
11/01/2021 123456 430 0 0
12/01/2021 123456 430 0 0
13/01/2021 123456 430 0 0
14/01/2021 123456 430 0 0
15/01/2021 123456 430 0 0
16/01/2021 123456 430 0 0
17/01/2021 123456 390 30 70
18/01/2021 123456 390 0 0
19/01/2021 123456 390 0 0
20/01/2021 123456 390 0 0
21/01/2021 123456 390 0 0
22/01/2021 123456 400 20 10
23/01/2021 123456 400 0 0
24/01/2021 123456 400 0 0
25/01/2021 123456 400 0 0
26/01/2021 123456 400 0 0
27/01/2021 123456 400 0 0
28/01/2021 123456 400 0 0
29/01/2021 123456 400 0 0
30/01/2021 123456 400 0 0
31/01/2021 123456 400 0 0

 

The created variable here is endday_balance, which is the balance at the end of the day, and it is defined as:

start of day balance (ie previous endday_balance)+ trans_cr - trans_dr

 

As it is shown, I want to fill in the missing values of endday_balance with its latest value. I am struggling with these 2 problems of filling values and using a lagged value. Do you have any idea of how to do it? I would prefer to use DI transformations, but if not possible, I can use manual code. Thank you in advance.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26
data want;
    set have;
    if not missing(start_balance) or not missing(trans_cr) or not missing(trans_dr) then
         end_of_day_balance + sum(start_balance,trans_cr,-trans_dr);
    if missing(trans_cr) then trans_cr=0;
    if missing(trans_dr) then trans_dr=0;
run;

 

This code is untested. If you want tested code, please provide data as working SAS data step code (examples and instructions) and NOT as Excel files and NOT as copy and paste from Excel. 

 

You said: "For simplicity, let's assume I have only one account". This is a poor assumption, and the above code will not work if you have more than one account. In general, if the real life problem has more than 1 account, please provide data for at least two accounts.

 

Modification for more than 1 account (again, untested)

 

data want;
    set have;
    by account_n;
    if first.account_n then end_of_day_balance=0;
    if not missing(start_balance) or not missing(trans_cr) or not missing(trans_dr) then
         end_of_day_balance + sum(start_balance,trans_cr,-trans_dr);
    if missing(trans_cr) then trans_cr=0;
    if missing(trans_dr) then trans_dr=0;
run;
--
Paige Miller

View solution in original post

4 REPLIES 4
PaigeMiller
Diamond | Level 26
data want;
    set have;
    if not missing(start_balance) or not missing(trans_cr) or not missing(trans_dr) then
         end_of_day_balance + sum(start_balance,trans_cr,-trans_dr);
    if missing(trans_cr) then trans_cr=0;
    if missing(trans_dr) then trans_dr=0;
run;

 

This code is untested. If you want tested code, please provide data as working SAS data step code (examples and instructions) and NOT as Excel files and NOT as copy and paste from Excel. 

 

You said: "For simplicity, let's assume I have only one account". This is a poor assumption, and the above code will not work if you have more than one account. In general, if the real life problem has more than 1 account, please provide data for at least two accounts.

 

Modification for more than 1 account (again, untested)

 

data want;
    set have;
    by account_n;
    if first.account_n then end_of_day_balance=0;
    if not missing(start_balance) or not missing(trans_cr) or not missing(trans_dr) then
         end_of_day_balance + sum(start_balance,trans_cr,-trans_dr);
    if missing(trans_cr) then trans_cr=0;
    if missing(trans_dr) then trans_dr=0;
run;
--
Paige Miller
mvalsamis
Obsidian | Level 7

Thanks for the advice! I will test the code!

mvalsamis
Obsidian | Level 7

Thanks! I will try the code!

mkeintz
PROC Star

Untested, in the absence of sample data in the form of a working data step:

 

data want;
  set have;
  by account_n;
  trans_cr=coalesce(trans_cr,0);
  trans_dr=coalesce(trans_dr,0);

  retain endday_balance;
  if first.account_n then endday_balance=start_balance;
  else endday_balance + trans_cr +(-1*trans_dr);
run;

The above assumes data are sorted by account_n/date, and that the first obs for each account has a valid start_balance value, but no values for trans_cr and trans_dr.  Also no other obs has a valid start_balance value.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 852 views
  • 5 likes
  • 3 in conversation