Hi,
I am trying to write a code for below example of data:
For each time ID of accounts, that have a balance greater than "0" in any one of the two previous months (not current) how can we create a flag ?
time id | account number | balance |
Jan-18 | 5 | 50 |
Feb-18 | 5 | 50 |
Mar-18 | 5 | 50 |
Apr-18 | 5 | 50 |
May-18 | 5 | 30 |
Jun-18 | 5 | . |
Jul-18 | 5 | . |
Aug-18 | 5 | 40 |
Sep-18 | 5 | 34 |
Jan-18 | 8 | 96 |
Feb-18 | 8 | 67 |
Mar-18 | 8 | 78 |
Apr-18 | 8 | 12 |
For the PROC EXPAND solution, I thought what you asked for was OR rather than & the condition(s) are true ... a small tweak.
If you don't have PROC EXPAND, it gets a little clumsy. Here's a possibility:
data want;
set have;
by account_number time_id;
if first.account_number then recnum=1;
else recnum + 1;
flag_back1 = ( lag(balance) > 0 ) * (recnum > 1);
flag_back2 = ( lag2(balance) > 0 ) * (recnum > 2);
flag = max(flag_back1, flag_back2);
drop recnum flag_back1 flag_back2;
run;
One way..
data have;
informat timeid monyy5.;
format timeid monyy5.;
input timeid accountnumber balance;
datalines;
Jan-18 5 50
Feb-18 5 50
Mar-18 5 50
Apr-18 5 50
May-18 5 30
Jun-18 5 .
Jul-18 5 .
Aug-18 5 40
Sep-18 5 34
Jan-18 8 96
Feb-18 8 67
Mar-18 8 78
Apr-18 8 12
;
proc sort data=have;
by accountnumber timeid;
run;
proc expand data=have out=temp method=none;
by accountnumber;
id timeid;
convert balance = lag1balance / transformout=(lag 1);
convert balance = lag2balance / transformout=(lag 2);
run;quit;
data want;
set temp;
flag=ifn(lag1balance>0 & lag2balance>0, 1, 0);
drop lag:;
run;
This seems like a very good solution. It uses the BY statement, which ought to mean that it handles the case of the first observation of an accountnumber and the second observation of an accountnumber. This is superior to what I was thinking of using the LAG and LAG2 functions, which doesn't directly handle cases where you have the first or 2nd observation of an accountnumber, and you'd have to program in the proper logic to handle those cases. The only drawback is that not everyone has a license that includes PROC EXPAND.
Thank you, if I wanna look at back to 18 months instead of 2 months, do you have another solution using loop?
When I run this code, I am getting for 5 account number in Feb 2018 as 0 flag. However, I want to see for this account as 1 since it has Jan-2018 Balance already.
For the PROC EXPAND solution, I thought what you asked for was OR rather than & the condition(s) are true ... a small tweak.
If you don't have PROC EXPAND, it gets a little clumsy. Here's a possibility:
data want;
set have;
by account_number time_id;
if first.account_number then recnum=1;
else recnum + 1;
flag_back1 = ( lag(balance) > 0 ) * (recnum > 1);
flag_back2 = ( lag2(balance) > 0 ) * (recnum > 2);
flag = max(flag_back1, flag_back2);
drop recnum flag_back1 flag_back2;
run;
Thank you very much!!!!!
This worked on SAS. I would like to ask one question, if I want to create flag back to 18 months, how can I do this?
Well, it's not too hard to add a bunch of statements (the exact number is up to you):
flag_back2 = ( lag2(balance) > 0 ) * (recnum > 2);
...
flag_back16 = (lag16(balance) > 0) * (recnum > 16);
Then end with:
flag = max(of flag_back1-flag_back18);
drop recnum flag_back: ;
In general, however, ask questions. You should understand how a solution works if you want to feel comfortable and safe using it.
Macro language is a good way to generate the statements. I think you need to add %END to match with %DO.
Also decide what to do with negative balances.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.