DATA Step, Macro, Functions and more

Find First and Last Dates in a dataset with repeating groups

Reply
Contributor
Posts: 58

Find First and Last Dates in a dataset with repeating groups

I have a dataset with many loans having 6 different statuses.  There dataset has a chronological source date (Work Days).  I have code that finds the first and last day for each Status but, I am unable to capture the correct detail (correct groupings) when the status moves backward chronologically (because this then creates multiple groupings for the same status).

 

Here is my desired / correct result:

Ref NoXX00405201F_DayL_Day
1LOCKED4/6/20165/10/2016
2APPRVD5/11/20165/19/2016
3OTC5/20/20165/20/2016
4CLOSED5/23/20165/28/2016
5READY5/31/20166/29/2016
6POOLED6/30/20166/30/2016
7CLOSED7/1/20168/1/2016
8READY8/2/20168/29/2016
9POOLED8/30/20168/30/2016

 

Here is the result of my code:

Ref NoXX00405201F_DayL_Day
1LOCKED4/6/20165/10/2016
2APPRVD5/11/20165/19/2016
3OTC5/20/20165/20/2016
4CLOSED5/23/20168/1/2016
5READY5/31/20168/29/2016
6POOLED6/30/20168/30/2016

 

Notice that I can only get 6 rows. I need 9, 1 extra grouping for each: CLOSED, READY,POOLED.

 

This is my code and it  only allows me only 1 grouping / Sort Order:  StatusNo

Data A_2300405201_GRP;

set A_2300405201;

by StatusNo;

 

If first.StatusNo then Do

F_Stat_Day=Source_Date_dte;

Output;

end;

 

If last.StatusNo then Do

L_Stat_Day=Source_Date_dte;

Output;

end;

 

format F_Stat_Day mmddyy8. L_Stat_Day mmddyy8.;

run;

 

1.  How can I get the F_Stat_Day and L_Stat_Day each time that the Status changes throughout the chronology of the Loan, EVEN IF IT IS A REPEAT STATUS?

 

2.  I also want to evaluate ALL Loans in the dataset in the same manner.  The above code only evaluates 1 Loan.

 

Many Thanks

 

Trusted Advisor
Posts: 1,586

Re: Find First and Last Dates in a dataset with repeating groups

Posted in reply to Kody_devl

Can you upload the input dataset with two loans, displaying variable names and the order of the observations.

Contributor
Posts: 58

Re: Find First and Last Dates in a dataset with repeating groups

Other Loans in the Dataset would look the same with differing patterns of Statues.

 

Loan_NumberSource_Date_dteStatusNoStatus
XX004052014/6/20161LOCKED
XX004052014/7/20161LOCKED
XX004052014/8/20161LOCKED
XX004052014/11/20161LOCKED
XX004052014/12/20161LOCKED
XX004052014/13/20161LOCKED
XX004052014/14/20161LOCKED
XX004052014/15/20161LOCKED
XX004052014/18/20161LOCKED
XX004052014/19/20161LOCKED
XX004052014/20/20161LOCKED
XX004052014/21/20161LOCKED
XX004052014/22/20161LOCKED
XX004052014/25/20161LOCKED
XX004052014/26/20161LOCKED
XX004052014/27/20161LOCKED
XX004052014/28/20161LOCKED
XX004052014/29/20161LOCKED
XX004052015/2/20161LOCKED
XX004052015/3/20161LOCKED
XX004052015/4/20161LOCKED
XX004052015/5/20161LOCKED
XX004052015/6/20161LOCKED
XX004052015/9/20161LOCKED
XX004052015/10/20161LOCKED
XX004052015/11/20162APPRVD
XX004052015/12/20162APPRVD
XX004052015/13/20162APPRVD
XX004052015/16/20162APPRVD
XX004052015/17/20162APPRVD
XX004052015/18/20162APPRVD
XX004052015/19/20162APPRVD
XX004052015/20/20163OTC
XX004052015/23/20164CLOSED
XX004052015/24/20164CLOSED
XX004052015/25/20164CLOSED
XX004052015/26/20164CLOSED
XX004052015/27/20164CLOSED
XX004052015/28/20164CLOSED
XX004052015/31/20165READY
XX004052016/2/20165READY
XX004052016/3/20165READY
XX004052016/6/20165READY
XX004052016/7/20165READY
XX004052016/8/20165READY
XX004052016/9/20165READY
XX004052016/10/20165READY
XX004052016/13/20165READY
XX004052016/14/20165READY
XX004052016/15/20165READY
XX004052016/16/20165READY
XX004052016/17/20165READY
XX004052016/20/20165READY
XX004052016/21/20165READY
XX004052016/22/20165READY
XX004052016/23/20165READY
XX004052016/24/20165READY
XX004052016/27/20165READY
XX004052016/28/20165READY
XX004052016/29/20165READY
XX004052016/30/20166POOLED
XX004052017/1/20164CLOSED
XX004052017/2/20164CLOSED
XX004052017/6/20164CLOSED
XX004052017/7/20164CLOSED
XX004052017/8/20164CLOSED
XX004052017/11/20164CLOSED
XX004052017/12/20164CLOSED
XX004052017/13/20164CLOSED
XX004052017/14/20164CLOSED
XX004052017/15/20164CLOSED
XX004052017/18/20164CLOSED
XX004052017/19/20164CLOSED
XX004052017/20/20164CLOSED
XX004052017/21/20164CLOSED
XX004052017/22/20164CLOSED
XX004052017/25/20164CLOSED
XX004052017/26/20164CLOSED
XX004052017/27/20164CLOSED
XX004052017/28/20164CLOSED
XX004052017/29/20164CLOSED
XX004052018/1/20164CLOSED
XX004052018/2/20165READY
XX004052018/3/20165READY
XX004052018/4/20165READY
XX004052018/5/20165READY
XX004052018/8/20165READY
XX004052018/9/20165READY
XX004052018/10/20165READY
XX004052018/11/20165READY
XX004052018/12/20165READY
XX004052018/15/20165READY
XX004052018/16/20165READY
XX004052018/17/20165READY
XX004052018/18/20165READY
XX004052018/19/20165READY
XX004052018/22/20165READY
XX004052018/23/20165READY
XX004052018/24/20165READY
XX004052018/25/20165READY
XX004052018/26/20165READY
XX004052018/27/20165READY
XX004052018/29/20165READY
XX004052018/30/20166POOLED
Trusted Advisor
Posts: 1,586

Re: Find First and Last Dates in a dataset with repeating groups

Posted in reply to Kody_devl

I assume the dataset is sorted by: lOAN_NUMBAER  and SOURCE_DATE_DTE inside the loan.

 

Here is your code with small changes emphasized by underline:

 

Data A_2300405201_GRP;

set A_2300405201;

by loan_number source_date_dte StatusNo NOTSORTED;

 

retain F_Stat_Day;

 

If first.StatusNo then Do

F_Stat_Day=Source_Date_dte;

  /*Output; */

end;

 

If last.StatusNo then Do

L_Stat_Day=Source_Date_dte;

Output;

end;

 

format F_Stat_Day mmddyy8. L_Stat_Day mmddyy8.;

run;

 

NOTES: 1) I havn't run the code. 

       2) Did you changed F_stat_day to F_Day ?  L_Stat_day to L_Day ?
          or did you added labels to display on output ?
          (Your result doesn't fit your code variable names).

Super User
Posts: 10,046

Re: Find First and Last Dates in a dataset with repeating groups

Posted in reply to Kody_devl
If I understand what you mean.


data have;
infile cards expandtabs truncover;
input Loan_Number : $20.	Source_Date_dte : mmddyy10.	StatusNo	Status  :$20.;
format Source_date_dte mmddyy10.;
cards;
XX00405201	4/6/2016	1	LOCKED
XX00405201	4/7/2016	1	LOCKED
XX00405201	4/8/2016	1	LOCKED
XX00405201	4/11/2016	1	LOCKED
XX00405201	4/12/2016	1	LOCKED
XX00405201	4/13/2016	1	LOCKED
XX00405201	4/14/2016	1	LOCKED
XX00405201	4/15/2016	1	LOCKED
XX00405201	4/18/2016	1	LOCKED
XX00405201	4/19/2016	1	LOCKED
XX00405201	4/20/2016	1	LOCKED
XX00405201	4/21/2016	1	LOCKED
XX00405201	4/22/2016	1	LOCKED
XX00405201	4/25/2016	1	LOCKED
XX00405201	4/26/2016	1	LOCKED
XX00405201	4/27/2016	1	LOCKED
XX00405201	4/28/2016	1	LOCKED
XX00405201	4/29/2016	1	LOCKED
XX00405201	5/2/2016	1	LOCKED
XX00405201	5/3/2016	1	LOCKED
XX00405201	5/4/2016	1	LOCKED
XX00405201	5/5/2016	1	LOCKED
XX00405201	5/6/2016	1	LOCKED
XX00405201	5/9/2016	1	LOCKED
XX00405201	5/10/2016	1	LOCKED
XX00405201	5/11/2016	2	APPRVD
XX00405201	5/12/2016	2	APPRVD
XX00405201	5/13/2016	2	APPRVD
XX00405201	5/16/2016	2	APPRVD
XX00405201	5/17/2016	2	APPRVD
XX00405201	5/18/2016	2	APPRVD
XX00405201	5/19/2016	2	APPRVD
XX00405201	5/20/2016	3	OTC
XX00405201	5/23/2016	4	CLOSED
XX00405201	5/24/2016	4	CLOSED
XX00405201	5/25/2016	4	CLOSED
XX00405201	5/26/2016	4	CLOSED
XX00405201	5/27/2016	4	CLOSED
XX00405201	5/28/2016	4	CLOSED
XX00405201	5/31/2016	5	READY
XX00405201	6/2/2016	5	READY
XX00405201	6/3/2016	5	READY
XX00405201	6/6/2016	5	READY
XX00405201	6/7/2016	5	READY
XX00405201	6/8/2016	5	READY
XX00405201	6/9/2016	5	READY
XX00405201	6/10/2016	5	READY
XX00405201	6/13/2016	5	READY
XX00405201	6/14/2016	5	READY
XX00405201	6/15/2016	5	READY
XX00405201	6/16/2016	5	READY
XX00405201	6/17/2016	5	READY
XX00405201	6/20/2016	5	READY
XX00405201	6/21/2016	5	READY
XX00405201	6/22/2016	5	READY
XX00405201	6/23/2016	5	READY
XX00405201	6/24/2016	5	READY
XX00405201	6/27/2016	5	READY
XX00405201	6/28/2016	5	READY
XX00405201	6/29/2016	5	READY
XX00405201	6/30/2016	6	POOLED
XX00405201	7/1/2016	4	CLOSED
XX00405201	7/2/2016	4	CLOSED
XX00405201	7/6/2016	4	CLOSED
XX00405201	7/7/2016	4	CLOSED
XX00405201	7/8/2016	4	CLOSED
XX00405201	7/11/2016	4	CLOSED
XX00405201	7/12/2016	4	CLOSED
XX00405201	7/13/2016	4	CLOSED
XX00405201	7/14/2016	4	CLOSED
XX00405201	7/15/2016	4	CLOSED
XX00405201	7/18/2016	4	CLOSED
XX00405201	7/19/2016	4	CLOSED
XX00405201	7/20/2016	4	CLOSED
XX00405201	7/21/2016	4	CLOSED
XX00405201	7/22/2016	4	CLOSED
XX00405201	7/25/2016	4	CLOSED
XX00405201	7/26/2016	4	CLOSED
XX00405201	7/27/2016	4	CLOSED
XX00405201	7/28/2016	4	CLOSED
XX00405201	7/29/2016	4	CLOSED
XX00405201	8/1/2016	4	CLOSED
XX00405201	8/2/2016	5	READY
XX00405201	8/3/2016	5	READY
XX00405201	8/4/2016	5	READY
XX00405201	8/5/2016	5	READY
XX00405201	8/8/2016	5	READY
XX00405201	8/9/2016	5	READY
XX00405201	8/10/2016	5	READY
XX00405201	8/11/2016	5	READY
XX00405201	8/12/2016	5	READY
XX00405201	8/15/2016	5	READY
XX00405201	8/16/2016	5	READY
XX00405201	8/17/2016	5	READY
XX00405201	8/18/2016	5	READY
XX00405201	8/19/2016	5	READY
XX00405201	8/22/2016	5	READY
XX00405201	8/23/2016	5	READY
XX00405201	8/24/2016	5	READY
XX00405201	8/25/2016	5	READY
XX00405201	8/26/2016	5	READY
XX00405201	8/27/2016	5	READY
XX00405201	8/29/2016	5	READY
XX00405201	8/30/2016	6	POOLED
;
run;
data want;
 set have;
 by Status notsorted;
 retain f_date ;
 if first.Status then f_date=Source_Date_dte;
 if last.Status then do;l_date=Source_Date_dte;output;end;
 format f_date l_date mmddyy10.;
 keep status f_date l_date;
run;



Trusted Advisor
Posts: 1,586

Re: Find First and Last Dates in a dataset with repeating groups

Do you want to calculate the F_DATE L_DATE per each loan ? or per the whole file ?
As mutch as I understand your input contains many loans.

 

I suggest to run the two versions:

 

1)  using BY sttaemant as I wrote:
     BY   loan_number    source_date_dte   StatusNo NOTSORTED;     

 

2) using BY StatusNo notsorted only

 

Compare the two outputs and define what fits your expectations better.

 

Ask a Question
Discussion stats
  • 5 replies
  • 250 views
  • 0 likes
  • 3 in conversation