DATA Step, Macro, Functions and more

Lag Function

Accepted Solution Solved
Reply
Contributor
Posts: 22
Accepted Solution

Lag Function

 

Hello,

 

What is the easiest way to compare 30 variables (lag_debit1 to lag_debit30) against another 30 variables (lag_credit1 to lag_credit30) already created per account.

 

I only want the accounts which have an exact match between one or more.

 

thanks

Sally

 

 

 


Accepted Solutions
Solution
a week ago
Super User
Super User
Posts: 9,402

Re: Lag Function

Whilst you could do it with arrays, that wouldn't be optimal, especially if you get more rows.  What you want to do is to merge on any rows (distinct to get just one per value) which = debit, then flag those:

data test;
  informat trandate ANYDTDTE9. debit 8.2 credit 8.2 ;
  input accountnum trandate debit credit count ;
  format trandate DATE9.;
datalines;
100001 30APR2018 2000.00 . 1
100002 23MAY2018 3000.00 . 1
100002 26MAY2018 2500.50 . 2
100002 27MAY2018 . 2100.00 3
100002 28MAY2018 2100.00 . 4
100002 29MAY2018 2500.10 . 5
100003 23MAY2018 2450.10 . 1
100003 24MAY2018 . 2450.10 2
;
run;

proc sql;
  create table want as
  select a.*,
         case when b.credit ne . then "Y" else "" end as flag
  from   test a
  left join (select distinct credit from test) b
  on     a.debit=b.credit
  order by accountnum,count;
quit;

View solution in original post


All Replies
Valued Guide
Posts: 516

Re: Lag Function

Maybe the only things required is a loop, but because you forgot to post input data, the expected result and an explanation of what "compare" means, it is nearly impossible to suggest something solving your issue.

Super User
Super User
Posts: 9,402

Re: Lag Function

Post test data in the form of a datastep, otherwise we are just guessing.  If you have a list of variables with a numeric suffix then you will be talking about arrays or lists of values, you can define as:

data want;
  set have;
  array lag_debit(30);
  array lag_credit(30);
  do i=1 to 30;
    ...
  end;
run;

/* Or use a list directly as */
data want;
  set have;
  low=min(of lag_debit:);
run;

You can also remodel the data so you don't have a wide table, that would likely be far easier to manage and work with.

Contributor
Posts: 22

Re: Lag Function

Hi,

 

I need a bit more help.  If I use the code suggested, it returns 100% match, however this is not true. What do I need to change?

thks

 

data want;
  set rule7;
  array lag_debit(30);
  array lag_credit(30);
  do i=1 to 30;
if lag_debit(30) = lag_credit(30) then match=1;
  end;
run;

Valued Guide
Posts: 516

Re: Lag Function

Your code compares the 30th variables 30 times. Using "i" instead of "30" in the if-statement should help.

do i=1 to 30;
  if lag_debit(i) = lag_credit(i) then match=1;
end;
Contributor
Posts: 22

Re: Lag Function

Posted in reply to andreas_lds

Thanks but it looks like its incorrectly matching missing lag_debits to missing lag_credits.  How can i specify that I only want matches when there is a value present?

 

 

Super User
Posts: 9,874

Re: Lag Function

Add a check to the condition for missing. Compare with . (representation of a missing numeric value), or use the missing() function.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Super User
Super User
Posts: 9,402

Re: Lag Function

[ Edited ]

Again, seeing test data in a datastep really helps us understand your question, and showing what you want out. 

data want;
  set rule7;
  array lag_debit{30};
  array lag_credit{30};
array results{30}; do i=1 to 30; results{i}=ifn(lag_debit{i}=lag_credit{i},1,0); end; run;

 You will note (use of code window, its the {i} above post area), instead of putting 30 in the if part, i use the inceremental variable i, so each element is checked rather than just element 30.  Also, if you want a result for each pairing you need a result for each pairing, hence another array, otherwise you will get 1 if any of lag_debit{i} matches the same element in credit.  

Just to add, I find using curly braces around array's makes it easier to spot them amongst function calls, which is why I change them in the code.

Contributor
Posts: 22

Re: Lag Function

Thanks I understand that the input data would help, just haven't have time as yet.

 

It looks like the code is matching all the missing values and thinking that its a match, when really its just that there is no value (blank).

 

 

Super User
Super User
Posts: 9,402

Re: Lag Function

None of us have time, we are only volunteering on here - we don't work for SAS.  As such we have had this back and forth wasting time, where test data up front would have been quicker.

data want;
  set rule7;
  array lag_debit{30};
  array lag_credit{30};
array results{30}; do i=1 to 30; results{i}=ifn(lag_debit{i}=lag_credit{i} and lag_debit{i} ne .,1,0); end; run;
Super User
Posts: 9,874

Re: Lag Function


@Selli5 wrote:

Thanks I understand that the input data would help, just haven't have time as yet.

 


 


Then your issue can't be urgent at all. Otherwise, you would make sure to help us as much as you can in helping you.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 22

Re: Lag Function

Hello,

 

I have attached some input data and code below.

 

I would like to identify when there is an exact match between any lag_debits to any lag_credit (and exclude blanks).

 

So in this case account number 10002 should produce a match on the $2100 amount and account number 10003 should produce a match on $2450.10.

 

thanks

 

 

 

 

 

Attachment
Super User
Super User
Posts: 9,402

Re: Lag Function

So this has nothing to do with arrays at all then.  Ok, we see here why some test data, and your missing what the output should look like, does to help illustrate - note how I post code in a code window in the post:

data test;
informat trandate ANYDTDTE9. debit 8.2 credit 8.2 ;
input accountnum trandate debit credit count ;
format trandate DATE9.;
datalines;
100001 30APR2018 2000.00 . 1
100002 23MAY2018 3000.00 . 1
100002 26MAY2018 2500.50 . 2
100002 27MAY2018 . 2100.00 3
100002 28MAY2018 2100.00 . 4
100002 29MAY2018 2500.10 . 5
100003 23MAY2018 2450.10 . 1
100003 24MAY2018 . 2450.10 2
;
run;

So basically to flag the row if previous is like it then:

data want;
  set test;
  debit_chk=ifn(lag(debit)=debit,1,0);
  credit_chk=ifn(lag(credit)=credit,1,0);
run;
Contributor
Posts: 22

Re: Lag Function

Thanks, but that code would only allow comparing to the previous row of data.

 

I was using the arrays as I am trying to identify any matching debits (lag_debit1 to lag_debit5) to any credits (lag_credit1 to lag_credit5). 

 

In the final dataset (result) Account 10002 should produce a match for 2100 Amount (lag_debit4 = lag_credit3) and Account 10003 should produce a match for 2450.10 (lag_debit1 to lag_credit2).  However the results dataset is currently not identifying these matches and returns 0 for results 1 to results 5.

 

data test;
informat trandate ANYDTDTE9. debit 8.2 credit 8.2 ;
input accountnum trandate debit credit count ;
format trandate DATE9.;
datalines;
100001 30APR2018 2000.00 . 1
100002 23MAY2018 3000.00 . 1
100002 26MAY2018 2500.50 . 2
100002 27MAY2018 . 2100.00 3
100002 28MAY2018 2100.00 . 4
100002 29MAY2018 2500.10 . 5
100003 23MAY2018 2450.10 . 1
100003 24MAY2018 . 2450.10 2
;
run;

data want;
format lag_debit1-lag_debit5 8.2 lag_credit1-lag_credit5 8.2;
do until(last.accountnum);
set test;
by     accountnum trandate;
array lag_debit{5};
array lag_credit{5};
if first.accountnum then call missing(of lag_debit(*));
if first.accountnum then call missing(of lag_credit(*));
lag_debit(count)=debit;
lag_credit(count)=credit;
output;
end;
run;

data result;
  set want;
  array lag_debit{5};
  array lag_credit{5};
  array results{5};
  do i=1 to 5;

results{i}= ifn(lag_debit{i}=lag_credit{i} and lag_debit{i} ne ., 1, 0);
 end;
run;

proc print data=result;
var accountnum lag_debit1-lag_debit5 lag_credit1-lag_credit5  results1 - results5 i ;
run;

Solution
a week ago
Super User
Super User
Posts: 9,402

Re: Lag Function

Whilst you could do it with arrays, that wouldn't be optimal, especially if you get more rows.  What you want to do is to merge on any rows (distinct to get just one per value) which = debit, then flag those:

data test;
  informat trandate ANYDTDTE9. debit 8.2 credit 8.2 ;
  input accountnum trandate debit credit count ;
  format trandate DATE9.;
datalines;
100001 30APR2018 2000.00 . 1
100002 23MAY2018 3000.00 . 1
100002 26MAY2018 2500.50 . 2
100002 27MAY2018 . 2100.00 3
100002 28MAY2018 2100.00 . 4
100002 29MAY2018 2500.10 . 5
100003 23MAY2018 2450.10 . 1
100003 24MAY2018 . 2450.10 2
;
run;

proc sql;
  create table want as
  select a.*,
         case when b.credit ne . then "Y" else "" end as flag
  from   test a
  left join (select distinct credit from test) b
  on     a.debit=b.credit
  order by accountnum,count;
quit;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 15 replies
  • 206 views
  • 3 likes
  • 4 in conversation