BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
sasheadache
Obsidian | Level 7

Hi All, I don't think there is a 100% accurate solution here because of the rubbish data I have, so just looking to see if others have come up with any innovative solutions. 

 

Essentially, I have a table which includes both debit & credit transactions and where a payment has been reversed at some point I need to drop both entries from the final output.

 

The problem I have is there is no way to directly link the debit & credit transactions together. Whilst the amounts will be the same (i.e. +£100 & -£100) they don't always appear on either the same date or on consecutive date for me to be able to group them together and then drop them. 

 

This is an example of how the table is structured and as you will see.... Ref: 001 is cancelled out by Ref: 004 and Ref: 002 is cancelled out by Ref: 007.

 

Transaction Ref

Transaction Date

Account RefC/DTransaction Amt*Notes (not in table)

Ref 1

01/02/2024

Account ACREDIT100 
Ref 2 07/02/2024Account ACREDIT100 
Ref 314/02/2024Account ACREDIT50 
Ref 415/02/2024Account ADEBIT100*reversal of 01/01/2024
Ref 516/02/2024Account ACREDIT50 
Ref 6

21/02/2024

Account ACREDIT100 
Ref 7

23/02/2024

Account ADEBIT100*reversal of 07/01/2024

 

Welcome any expert views/thoughts.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
  input ref date :ddmmyy10. account :$1.  cd :$6.  amt;
  format date date9.;
datalines;
1  01/02/2024 A CREDIT 100	 
2  07/02/2024 A CREDIT 100	 
3  14/02/2024 A CREDIT  50	 
4  15/02/2024 A DEBIT  100	*reversal of 01/01/2024
5  16/02/2024 A CREDIT  50	 
6  21/02/2024 A CREDIT 100	 
7  23/02/2024 A DEBIT  100	*reversal of 07/01/2024
;

data CREDIT(rename=(ref=_ref date=_date account=_account cd=_cd amt=_amt)) DEBIT;
 set have;
 if cd='CREDIT' then output CREDIT;
 if cd='DEBIT' then output DEBIT;
run;

data _null_;
if _n_=1 then do;
 if 0 then set CREDIT;
 declare hash h(dataset:'CREDIT',ordered:'a');
 declare hiter hi('h');
 h.definekey('_date');
 h.definedata('_ref','_date','_account','_cd','_amt');
 h.definedone();
end;
set DEBIT end=last;
do while(hi.next()=0); 
 if amt=_amt and not found then do;found=1;key=_date;rc=hi.next();rc=h.remove(key:key);end;
end;
if last then h.output(dataset:'want');
run;

View solution in original post

9 REPLIES 9
mkeintz
PROC Star

Your explanatory note for Ref 4  (a debit of £100), says "reversal of 01/01/2024".  There is no £100 credit with a date of 01/01/2024.  But there are two preceding £100 credits.  What rule do you propose to use to find a credit matching the debit, when there are multiple possible matches.  

 

And what do you want the output dataset to look like?  Same as the output, but with the matches removed?

 

Editted addition:

Also do you expect that every credit will have a matching debit?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mkeintz
PROC Star

This program removes observations in pairs, namely all DEBITs and earlier matching CREDITs.  The oldest possible matching CREDIT is the one that gets removed.  Unmatched DEBITs remain in the data set:

 

data have;
  input ref date :ddmmyy10. account :$1.  cd :$6.  amt;
  format date date9.;
datalines;
1  01/02/2024 A CREDIT 100	 
2  07/02/2024 A CREDIT 100	 
3  14/02/2024 A CREDIT  50	 
4  15/02/2024 A DEBIT  100	*reversal of 01/01/2024
5  16/02/2024 A CREDIT  50	 
6  21/02/2024 A CREDIT 100	 
7  23/02/2024 A DEBIT  100	*reversal of 07/01/2024
run;

data want (drop=i j _:);
 array _history{30} ;         /* Large enough to hold entire history */
 do i=1 by 1 until (last.account);
    set have;
    by account;
    if cd='DEBIT' then amt=-1*amt;      /* Positive for credit, Negative for debit*/
    _history{i}=amt;
    if cd='DEBIT' then do j=1 to i-1 ;  /* Debit?  then review history */
      if _history{j} ^= -1*_history{i} then continue;
      call missing(_history{j},_history{i});
      leave;
    end;
  end;

  do i=1 by 1 until (last.account);
    set have;
    by account;
    if _history{i}^=. then output;
  end;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
Could you post the output you want to see ?
sasheadache
Obsidian | Level 7
Transaction RefTransaction DateAccount RefC/DTransaction Amt
Ref 314/02/2024Account ACREDIT50
Ref 516/02/2024Account ACREDIT50
Ref 621/02/2024Account ACREDIT100

 

As you will see transactions 1,2, 4,7 have been dropped because they have cancelled each other out. 

sasheadache
Obsidian | Level 7
Apologies for the typo - the notes should show reversals for transactions dated 01/02 & 07/02 and not January.
Ksharp
Super User

Sometime, if you have the data like following,

What you gonna do with it ?

 

Ksharp_0-1708504358260.png

 

sasheadache
Obsidian | Level 7

Reversals will never be combined - so in theory the values will always match i.e. Credit £100 & Debit £100.

 

The main problem is other debits and credits can appear in-between which are not cancelled out. (These are what I want to retain as I am classifying these as true payments)

 

If you think of how your day to day banking works.... You might buy something today, but then return it and get a refund in 30 days time.  I am wanting to drop both transactions from my list, but without a true reference to link them together. (annoying I know)

mkeintz
PROC Star

This is slightly simpler code than my earlier response.  It expects data to be in chronological order within each account, and will retain unmatched debits in the output dataset:

 

data want (drop=i j);
  array history{30} _temporary_;  /*Make large enough for entire history*/
  set have (in=build_history)   have (in=access_history);
  by account;

  /* Initialize history pointer at start of account and start of build_access*/
  if first.account=1 or dif(access_history)=1 then i=0; 
  i+1;
  if first.account then call missing(of history{*});  /*Strictly speaking, this is superfluous*/

  if build_history then do;
    history{i}=ifn(cd='DEBIT',-1*amt,amt);
    if cd='DEBIT' then do j=1 to i-1 until (history{i}=.);
      if history{j}=-1*history{i} then call missing(history{j},history{i});
    end;
  end;
  if access_history=1 and history{i}^=.;  /* Keep the uncancelled transactions*/
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
data have;
  input ref date :ddmmyy10. account :$1.  cd :$6.  amt;
  format date date9.;
datalines;
1  01/02/2024 A CREDIT 100	 
2  07/02/2024 A CREDIT 100	 
3  14/02/2024 A CREDIT  50	 
4  15/02/2024 A DEBIT  100	*reversal of 01/01/2024
5  16/02/2024 A CREDIT  50	 
6  21/02/2024 A CREDIT 100	 
7  23/02/2024 A DEBIT  100	*reversal of 07/01/2024
;

data CREDIT(rename=(ref=_ref date=_date account=_account cd=_cd amt=_amt)) DEBIT;
 set have;
 if cd='CREDIT' then output CREDIT;
 if cd='DEBIT' then output DEBIT;
run;

data _null_;
if _n_=1 then do;
 if 0 then set CREDIT;
 declare hash h(dataset:'CREDIT',ordered:'a');
 declare hiter hi('h');
 h.definekey('_date');
 h.definedata('_ref','_date','_account','_cd','_amt');
 h.definedone();
end;
set DEBIT end=last;
do while(hi.next()=0); 
 if amt=_amt and not found then do;found=1;key=_date;rc=hi.next();rc=h.remove(key:key);end;
end;
if last then h.output(dataset:'want');
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1546 views
  • 4 likes
  • 3 in conversation