Hi All, I don't think there is a 100% accurate solution here because of the rubbish data I have, so just looking to see if others have come up with any innovative solutions.
Essentially, I have a table which includes both debit & credit transactions and where a payment has been reversed at some point I need to drop both entries from the final output.
The problem I have is there is no way to directly link the debit & credit transactions together. Whilst the amounts will be the same (i.e. +£100 & -£100) they don't always appear on either the same date or on consecutive date for me to be able to group them together and then drop them.
This is an example of how the table is structured and as you will see.... Ref: 001 is cancelled out by Ref: 004 and Ref: 002 is cancelled out by Ref: 007.
Transaction Ref | Transaction Date | Account Ref | C/D | Transaction Amt | *Notes (not in table) |
Ref 1 | 01/02/2024 | Account A | CREDIT | 100 | |
Ref 2 | 07/02/2024 | Account A | CREDIT | 100 | |
Ref 3 | 14/02/2024 | Account A | CREDIT | 50 | |
Ref 4 | 15/02/2024 | Account A | DEBIT | 100 | *reversal of 01/01/2024 |
Ref 5 | 16/02/2024 | Account A | CREDIT | 50 | |
Ref 6 | 21/02/2024 | Account A | CREDIT | 100 | |
Ref 7 | 23/02/2024 | Account A | DEBIT | 100 | *reversal of 07/01/2024 |
Welcome any expert views/thoughts.
data have;
input ref date :ddmmyy10. account :$1. cd :$6. amt;
format date date9.;
datalines;
1 01/02/2024 A CREDIT 100
2 07/02/2024 A CREDIT 100
3 14/02/2024 A CREDIT 50
4 15/02/2024 A DEBIT 100 *reversal of 01/01/2024
5 16/02/2024 A CREDIT 50
6 21/02/2024 A CREDIT 100
7 23/02/2024 A DEBIT 100 *reversal of 07/01/2024
;
data CREDIT(rename=(ref=_ref date=_date account=_account cd=_cd amt=_amt)) DEBIT;
set have;
if cd='CREDIT' then output CREDIT;
if cd='DEBIT' then output DEBIT;
run;
data _null_;
if _n_=1 then do;
if 0 then set CREDIT;
declare hash h(dataset:'CREDIT',ordered:'a');
declare hiter hi('h');
h.definekey('_date');
h.definedata('_ref','_date','_account','_cd','_amt');
h.definedone();
end;
set DEBIT end=last;
do while(hi.next()=0);
if amt=_amt and not found then do;found=1;key=_date;rc=hi.next();rc=h.remove(key:key);end;
end;
if last then h.output(dataset:'want');
run;
Your explanatory note for Ref 4 (a debit of £100), says "reversal of 01/01/2024". There is no £100 credit with a date of 01/01/2024. But there are two preceding £100 credits. What rule do you propose to use to find a credit matching the debit, when there are multiple possible matches.
And what do you want the output dataset to look like? Same as the output, but with the matches removed?
Editted addition:
Also do you expect that every credit will have a matching debit?
This program removes observations in pairs, namely all DEBITs and earlier matching CREDITs. The oldest possible matching CREDIT is the one that gets removed. Unmatched DEBITs remain in the data set:
data have;
input ref date :ddmmyy10. account :$1. cd :$6. amt;
format date date9.;
datalines;
1 01/02/2024 A CREDIT 100
2 07/02/2024 A CREDIT 100
3 14/02/2024 A CREDIT 50
4 15/02/2024 A DEBIT 100 *reversal of 01/01/2024
5 16/02/2024 A CREDIT 50
6 21/02/2024 A CREDIT 100
7 23/02/2024 A DEBIT 100 *reversal of 07/01/2024
run;
data want (drop=i j _:);
array _history{30} ; /* Large enough to hold entire history */
do i=1 by 1 until (last.account);
set have;
by account;
if cd='DEBIT' then amt=-1*amt; /* Positive for credit, Negative for debit*/
_history{i}=amt;
if cd='DEBIT' then do j=1 to i-1 ; /* Debit? then review history */
if _history{j} ^= -1*_history{i} then continue;
call missing(_history{j},_history{i});
leave;
end;
end;
do i=1 by 1 until (last.account);
set have;
by account;
if _history{i}^=. then output;
end;
run;
Transaction Ref | Transaction Date | Account Ref | C/D | Transaction Amt |
Ref 3 | 14/02/2024 | Account A | CREDIT | 50 |
Ref 5 | 16/02/2024 | Account A | CREDIT | 50 |
Ref 6 | 21/02/2024 | Account A | CREDIT | 100 |
As you will see transactions 1,2, 4,7 have been dropped because they have cancelled each other out.
Sometime, if you have the data like following,
What you gonna do with it ?
Reversals will never be combined - so in theory the values will always match i.e. Credit £100 & Debit £100.
The main problem is other debits and credits can appear in-between which are not cancelled out. (These are what I want to retain as I am classifying these as true payments)
If you think of how your day to day banking works.... You might buy something today, but then return it and get a refund in 30 days time. I am wanting to drop both transactions from my list, but without a true reference to link them together. (annoying I know)
This is slightly simpler code than my earlier response. It expects data to be in chronological order within each account, and will retain unmatched debits in the output dataset:
data want (drop=i j);
array history{30} _temporary_; /*Make large enough for entire history*/
set have (in=build_history) have (in=access_history);
by account;
/* Initialize history pointer at start of account and start of build_access*/
if first.account=1 or dif(access_history)=1 then i=0;
i+1;
if first.account then call missing(of history{*}); /*Strictly speaking, this is superfluous*/
if build_history then do;
history{i}=ifn(cd='DEBIT',-1*amt,amt);
if cd='DEBIT' then do j=1 to i-1 until (history{i}=.);
if history{j}=-1*history{i} then call missing(history{j},history{i});
end;
end;
if access_history=1 and history{i}^=.; /* Keep the uncancelled transactions*/
run;
data have;
input ref date :ddmmyy10. account :$1. cd :$6. amt;
format date date9.;
datalines;
1 01/02/2024 A CREDIT 100
2 07/02/2024 A CREDIT 100
3 14/02/2024 A CREDIT 50
4 15/02/2024 A DEBIT 100 *reversal of 01/01/2024
5 16/02/2024 A CREDIT 50
6 21/02/2024 A CREDIT 100
7 23/02/2024 A DEBIT 100 *reversal of 07/01/2024
;
data CREDIT(rename=(ref=_ref date=_date account=_account cd=_cd amt=_amt)) DEBIT;
set have;
if cd='CREDIT' then output CREDIT;
if cd='DEBIT' then output DEBIT;
run;
data _null_;
if _n_=1 then do;
if 0 then set CREDIT;
declare hash h(dataset:'CREDIT',ordered:'a');
declare hiter hi('h');
h.definekey('_date');
h.definedata('_ref','_date','_account','_cd','_amt');
h.definedone();
end;
set DEBIT end=last;
do while(hi.next()=0);
if amt=_amt and not found then do;found=1;key=_date;rc=hi.next();rc=h.remove(key:key);end;
end;
if last then h.output(dataset:'want');
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.