Can we find the next value of a longitudinal dataset is different from the previous value?

Super Contributor
Posts: 338

Hi Colleagues,

The attached data set (post_this2) shows 3 different accounts in 3 different banks.

A record is uniquely identified only by the combination of bank_number, account_number and current_date.

Q:

Is there a way to identify the accounts that still have a life after they have fallen into the 'writoff' arrears_band?

Expected answer:

Account_number '888888888' in bank_number 100

(because this account fell into the 'writoff' arrears_band on 3JUN2011, then started its life again and continued for another period)

Also, account_number '888888888' in bank_number 40

(because this account fell into the 'writoff' arrears_band on 28MAR2011, then started its life again and continued for several months)

Your help is greatly appreciated.

Thanks

Mirisage

Accepted Solutions
Solution
‎08-23-2012 02:58 PM
Valued Guide
Posts: 797

Re: Can we find the next value of a longitudinal dataset is different from the previous value?

Mirisage:

Are you saying bank 100 account 888888888 has life after 03jun2011, even though the only record after that date also has arrears_band='writoff'? If so, then you are saying you want all accounts in which the first "writoff" record encountered does not end the time series.

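data want (keep=bank_number account_number);
  set have;
  /* groups must be contiguous, but need not be in ascending order */
  by bank_number account_number notsorted;
  /* flag keeps its value from one record to the next */
  retain writoff_encountered 0;
  /* reset the flag at the start of each bank/account group */
  if first.account_number then writoff_encountered=0;
  /* test the flag BEFORE updating it: a 'writoff' on the last record alone does not count */
  if last.account_number and writoff_encountered=1 then output;
  /* the variable in the data set is arrears_band */
  if arrears_band='writoff' then writoff_encountered=1;
run;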

** This program assumes that records are sorted by date within each bank/account group (the groups themselves need not be sorted).
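
To try the step on something small, here is a minimal sketch with made-up rows. The values below are hypothetical, not the attached post_this2 data, and the informats (character account_number, DATE9. dates) are assumptions; the variable names follow the thread, with the band variable spelled arrears_band.

```sas
/* Hypothetical test rows, already in date order within each bank/account */
data have;
  input bank_number account_number :$9. current_date :date9. arrears_band :$10.;
  format current_date date9.;
  datalines;
100 888888888 03MAY2011 current
100 888888888 03JUN2011 writoff
100 888888888 03JUL2011 writoff
40 888888888 28FEB2011 current
40 888888888 28MAR2011 writoff
40 888888888 28APR2011 current
20 777777777 15JAN2011 current
20 777777777 15FEB2011 writoff
;
run;

data want (keep=bank_number account_number);
  set have;
  by bank_number account_number notsorted;
  retain writoff_encountered 0;
  if first.account_number then writoff_encountered=0;
  /* flag is tested before it is updated for the current record */
  if last.account_number and writoff_encountered=1 then output;
  if arrears_band='writoff' then writoff_encountered=1;
run;

proc print data=want noobs;
run;
```

With these rows, the bank 100 and bank 40 groups are written to want, because each has a 'writoff' record followed by at least one more record. The bank 20 group is not, because its only 'writoff' is the final record.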


All Replies

Super Contributor
Posts: 338

Re: Can we find the next value of a longitudinal dataset is different from the previous value?

Hi mkeintz,

Thank you very much for this code, which selects exactly the subjects that satisfy my condition: "I want all accounts in which the first 'writoff' record encountered does not end the time series".

(The variable in the last statement should be arrears_band.)

If you have time, could you please explain a bit what this magic code does?

I can understand that you have done BY-group processing with the SET and BY combination.

I can also see that when SAS finds first.account_number it creates a variable called writoff_encountered and sets its value to 0.

And when SAS finds last.account_number, the variable writoff_encountered is set to 1.

But how does the rest work to satisfy my condition?

What does "notsorted" do?

Thank you

Mirisage

Valued Guide
Posts: 797

Re: Can we find the next value of a longitudinal dataset is different from the previous value?

Mirisage:

The NOTSORTED option of the BY statement tells SAS not to expect the data in ascending order, but still to expect it to be "grouped". So you could have bank 2/acct 2, then bank 1/acct 4, then bank 5/acct 3 and SAS would not object. It would still set the "first." and "last." variables at the beginning and end of each group.

However, NOTSORTED also means SAS will not care if the records for a given bank/acct are not in chronological order. That's why I said the program assumes that, within each bank/acct, the records are in date order.
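
If the date order cannot be taken for granted, one option is to sort first, which also makes NOTSORTED unnecessary. This is only a sketch: it assumes the date variable is named current_date and the band variable arrears_band, as in the question, and have_sorted is a made-up data set name.

```sas
/* Put every bank/account group in chronological order */
proc sort data=have out=have_sorted;
  by bank_number account_number current_date;
run;

data want (keep=bank_number account_number);
  set have_sorted;
  /* data are now truly sorted, so NOTSORTED is not needed */
  by bank_number account_number;
  retain writoff_encountered 0;
  if first.account_number then writoff_encountered=0;
  if last.account_number and writoff_encountered=1 then output;
  if arrears_band='writoff' then writoff_encountered=1;
run;
```

The trade-off is the cost of the sort on a large file; the NOTSORTED version avoids that but relies on the input order being right.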

As to how the rest works:

   RETAIN writoff_encountered 0;
means SAS will not reset this new variable to a missing value at the top of each iteration of the DATA step. It keeps whatever value it already has. The "0" sets the starting value to zero; that part was superfluous, since the flag is reset at the start of every group anyway.

At the beginning of each bank/acct group, this dummy variable is set to zero.

Now "if arrears_band='writoff' ..." means that on such a record the dummy is set to 1. It stays 1 for the rest of the bank/acct group (thanks to the RETAIN statement).

Now consider a bank/acct with only one "writoff" record, where that record is the last one. In this case you would not want an output record. But if I had put the "if last.account_number and writoff_encountered=1 then output" statement AFTER the "if arrears_band='writoff' ..." statement, that last record would have been output, counter to your request. So I put the "if ... output" statement BEFORE the check for 'writoff'.
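
One way to watch this happen is to write the flag to the log for every record. The step below is a hedged sketch rather than part of the solution: is_last and would_output are helper variables introduced only for illustration, and the variable names current_date and arrears_band are taken from the question.

```sas
data _null_;
  set have;
  by bank_number account_number notsorted;
  retain writoff_encountered 0;
  if first.account_number then writoff_encountered=0;
  /* copy the automatic variable so it can be printed */
  is_last = last.account_number;
  /* 1 when the solution's OUTPUT statement would fire on this record */
  would_output = (is_last and writoff_encountered=1);
  /* flag value as seen BEFORE this record's own 'writoff' check */
  put bank_number= account_number= current_date= date9. arrears_band=
      is_last= writoff_encountered= would_output=;
  if arrears_band='writoff' then writoff_encountered=1;
run;
```

On a group whose only 'writoff' is its final record, the log should show writoff_encountered=0 on that last line, which is why nothing is output for it.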

Super Contributor
Posts: 338

Re: Can we find the next value of a longitudinal dataset is different from the previous value?

Hi mkeintz,

This is a great explanation!

Thank you indeed for your time and sharing your expertise.

I think I have very very long way to go.

Thanks again!

Mirisage

☑ This topic is SOLVED.
