Hi Colleagues,
The attached data set (post_this2) shows 3 different accounts in 3 different banks.
A record is uniquely identified only by the combination of bank_number, account_number and current_date.
Q:
Is there any method to identify the accounts that still have a life even after they have fallen into the "writoff" arrears_band?
Answer:
Account_number '888888888' in bank_number 100
(because this account fell into the "writoff" arrears_band on 3JUN2011, then started its life again and continued for another period)
Also, Account_number '888888888' in bank_number 40
(because this account fell into the "writoff" arrears_band on 28MAR2011, then started its life again and continued for several months)
Your help is greatly appreciated.
Thanks
Mirisage
Mirisage:
Are you saying bank 100 account 888888888 has life after 03jun2011, even though the only record after that date also has arrears_band='writoff'? If so, then you are saying you want all accounts in which the first "writoff" record encountered does not end the time series.
data want (keep=bank_number account_number);
  set have;
  by bank_number account_number notsorted;
  retain writoff_encountered 0;
  if first.account_number then writoff_encountered=0;
  if last.account_number and writoff_encountered=1 then output;
  if arrears_band='writoff' then writoff_encountered=1;
run;
Note: this program assumes that records are sorted by date within each bank/account group (the groups themselves need not be sorted).
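If the date order within each group cannot be taken for granted, one option is to sort first. The following is only a sketch: it assumes the input data set is named have, that current_date (the date variable mentioned in the question) is a numeric SAS date, and that the flag variable is named arrears_band. The name have_sorted is made up for illustration.

```sas
/* Put records in date order within each bank/account group. */
proc sort data=have out=have_sorted;
  by bank_number account_number current_date;
run;

data want (keep=bank_number account_number);
  set have_sorted;
  /* The data are now fully sorted, so NOTSORTED is not needed. */
  by bank_number account_number;
  retain writoff_encountered 0;
  if first.account_number then writoff_encountered=0;
  if last.account_number and writoff_encountered=1 then output;
  if arrears_band='writoff' then writoff_encountered=1;
run;
```

If current_date were stored as a character string such as '3JUN2011', this sort would not be chronological and the date would have to be converted first.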
Hi mkeintz,
Thank you very much for this code, which selects exactly the accounts that satisfy my condition: "I want all accounts in which the first "writoff" record encountered does not end the time series".
(The variable in the last statement should be arrears_band, not arrears_bank.)
If you have time, could you please elaborate a bit on what this magic code does?
I can understand that you have done BY-group processing, using the SET and BY combination.
I can also understand that when SAS finds first.account_number, it creates a variable called writoff_encountered and sets its value to 0.
And when SAS finds last.account_number, the value of writoff_encountered is set to 1.
But how does the rest work to satisfy my condition?
What does "notsorted" do?
Thank you
Mirisage
Mirisage:
The NOTSORTED option of the BY statement tells SAS not to expect the data to be in ascending order, but still to expect the data to be grouped. So you could have bank 2/acct 2, then bank 1/acct 4, then bank 5/acct 3, and SAS would not object. It would still set the "first." and "last." variables at the beginning and end of each group.
However, SAS will not check whether the records for a given bank/acct are in chronological order, since the date is not part of the BY statement. That's why I said the program assumes that, within each bank/acct, the records are in date order.
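A small way to see this for yourself is to print the FIRST. and LAST. flags to the log. This is just an illustrative sketch: the data set demo and its values are made up, and follow the bank/acct order described above.

```sas
/* Hypothetical data: grouped by bank/account, but the groups are not in ascending order. */
data demo;
  input bank_number account_number;
  datalines;
2 2
2 2
1 4
5 3
5 3
;
run;

data _null_;
  set demo;
  by bank_number account_number notsorted;
  /* Write the automatic FIRST./LAST. flags to the log for every record. */
  put bank_number= account_number= first.account_number= last.account_number=;
run;
```

Without NOTSORTED, the second step would stop with an error saying the BY variables are not properly sorted, because bank 1 follows bank 2.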
As to how the rest works:
RETAIN writoff_encountered 0;
means SAS will not reset this new variable to a missing value at the start of each iteration of the DATA step. It will keep whatever value it already has. The "0" means to start out with a zero; this aspect was superfluous, because the next statement resets the variable at the start of every group.
At the beginning of each bank/acct group, this dummy variable is set to zero.
Now "if arrears_band=..." means that upon such a record the dummy is set to 1. It will stay 1 throughout the rest of the bank/acct group (because of the RETAIN statement).
Now consider a bank/acct that had only one "writoff" record, and it was the last record. In this case you would not want an output record. But if I had put the "if last.account_number and writoff_encountered=1 then output" statement AFTER the "if arrears_band='writoff'" statement, then that last record would have been output, counter to your request. So I put the "if ... output" statement BEFORE checking for "writoff".
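To make the ordering argument concrete, here is a small trace on made-up data. Everything in it is an assumption for illustration: the band value "current", the dates other than 03JUN2011 and 28MAR2011, the third account (bank 70), and account_number being numeric.

```sas
/* Hypothetical data, in date order within each bank/account group. */
data have;
  input bank_number account_number current_date :date9. arrears_band :$10.;
  format current_date date9.;
  datalines;
100 888888888 01MAY2011 current
100 888888888 03JUN2011 writoff
100 888888888 01JUL2011 writoff
40 888888888 28MAR2011 writoff
40 888888888 30APR2011 current
70 111111111 01MAY2011 current
70 111111111 01JUN2011 writoff
;
run;

data want (keep=bank_number account_number);
  set have;
  by bank_number account_number notsorted;
  retain writoff_encountered 0;
  if first.account_number then writoff_encountered=0;
  /* Tested BEFORE the flag is updated: a writoff seen only on the final record does not qualify. */
  if last.account_number and writoff_encountered=1 then output;
  if arrears_band='writoff' then writoff_encountered=1;
run;

proc print data=want;
run;
```

Banks 100 and 40 are written to want, because a "writoff" record was seen before their last record. Bank 70 is not, because its only "writoff" is the final record and the flag is still 0 when the OUTPUT test runs.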
Hi mkeintz,
This is a great explanation!
Thank you indeed for your time and sharing your expertise.
I think I have a very, very long way to go.
Thanks again!
Mirisage