BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ifti_ch2002
Obsidian | Level 7

Hi,

I have a question. I need to followup an object until 3 year period with a condition that the object has not been sold i last 5 year.

Example:

Object    date              value

1           01-01-2003    --

1          01-01-2004    --

1          01-02-2009     1

1          01-01-2010     2

1          01-01-2016     --

1          01-01-2017    --

 

what i need is first to check if there is a difference between the two dates within the same object is more than 5 years, if it is true then followup the same object upto 3 year. ie the difference between the third and second is more than five years, so it will be our index date and it shall be followed upto 3 years.

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

@ifti_ch2002 wrote:

...

1) f=1 if the difference between the current date and lag_date (previous) is greater than 5 years

2) f=1 will be the index date

3)f+1 if the difference between the following date and index date is less than 3 years


To check the first thing you need to get the change in DATE between observations.  LAG() will help.

For the second one I am going jump to the conclusion that you meant to say that when there is a jump of over 5 years you want to set F to 1 AND also remember what DATE was at that time.

For the third one I am going to jump to another conclusion that you want to keep incrementing F as long as the difference from the saved index date is <= 3 years.  AND that when it is not then you set F to missing.

 

So let's setup your sample data with your expected values as WANT (instead of F).

data have;
  infile datalines truncover;
  input IDnumber date  :mmddyy10. want;
  format date yymmdd10.;
datalines;
1 4-7-2003 .
1 4-7-2009 1
1 4-7-2010 2
1 4-7-2011 3
1 4-7-2012 4
1 4-7-2013 .
1 4-7-2014 .
1 4-7-2015 .
1 4-7-2016 .
1 4-7-2018 .
2 4-7-2005 .
2 4-7-2011 1
2 4-7-2012 2
2 4-7-2013 3
2 4-7-2014 4
2 4-7-2015 .
2 4-7-2016 .
2 4-7-2017 .
2 4-7-2018 .
2 4-7-2019 .
2 4-7-2020 .
;

So it is pretty straight forward to code once the rules are clear. Remember to call the LAG() function on every observation so it sees every value.  But don't use the resuliting difference when you are starting a new group.


data want;
  set have;
  by idnumber date ;
  yrdif = intck('year',lag(date),date,'cont');
  if first.idnumber then call missing(yrdif,index_date,f);
  if f then do;
    if intck('year',index_date,date,'cont') <= 3 then f+1;
    else f=.;
  end;
  else if yrdif >= 5 then do;
    index_date=date;
    f=1;
  end;
  format index_date yymmdd10.;
  retain index_date f;
run;

Results:

Obs    IDnumber          date    want    yrdif    index_date    f

  1        1       2003-04-07      .       .               .    .
  2        1       2009-04-07      1       6      2009-04-07    1
  3        1       2010-04-07      2       1      2009-04-07    2
  4        1       2011-04-07      3       1      2009-04-07    3
  5        1       2012-04-07      4       1      2009-04-07    4
  6        1       2013-04-07      .       1      2009-04-07    .
  7        1       2014-04-07      .       1      2009-04-07    .
  8        1       2015-04-07      .       1      2009-04-07    .
  9        1       2016-04-07      .       1      2009-04-07    .
 10        1       2018-04-07      .       2      2009-04-07    .
 11        2       2005-04-07      .       .               .    .
 12        2       2011-04-07      1       6      2011-04-07    1
 13        2       2012-04-07      2       1      2011-04-07    2
 14        2       2013-04-07      3       1      2011-04-07    3
 15        2       2014-04-07      4       1      2011-04-07    4
 16        2       2015-04-07      .       1      2011-04-07    .
 17        2       2016-04-07      .       1      2011-04-07    .
 18        2       2017-04-07      .       1      2011-04-07    .
 19        2       2018-04-07      .       1      2011-04-07    .
 20        2       2019-04-07      .       1      2011-04-07    .
 21        2       2020-04-07      .       1      2011-04-07    .

View solution in original post

11 REPLIES 11
PhilC
Rhodochrosite | Level 12

for Base SAS, a Do Whitlock loop can be written to iterate through this, giving respect to a by statement.


So many resources -- first link from Google:
https://support.sas.com/resources/papers/proceedings12/052-2012.pdf

andreas_lds
Jade | Level 19
Please post what you expect as result.
ifti_ch2002
Obsidian | Level 7
The result is in the value column.
PhilC
Rhodochrosite | Level 12
@ifti_ch2002

Perhaps you can show more input.
ifti_ch2002
Obsidian | Level 7
data test;
infile datalines missover; 
   input IDnumber @3 date  mmddyy10. @12 f;
   datalines;  
1 4-7-2003 
1 4-7-2009 1
1 4-7-2010 2
1 4-7-2011 3
1 4-7-2012 4
1 4-7-2013 
1 4-7-2014 
1 4-7-2015 
1 4-7-2016 
1 4-7-2018 
2 4-7-2005 
2 4-7-2011 1
2 4-7-2012 2
2 4-7-2013 3
2 4-7-2014 4
2 4-7-2015 
2 4-7-2016 
2 4-7-2017 
2 4-7-2018 
2 4-7-2019 
2 4-7-2020 
;       
run; 

data test1;
set test;
format date lag_date mmddyy10.;
run;

1) f=1 if the difference between the current date and lag_date (previous) is greater than 5 years

2) f=1 will be the index date

3)f+1 if the difference between the following date and index date is less than 3 years

PhilC
Rhodochrosite | Level 12
Thanks, I understand what you are describing much better. Are you familiar with INTCK and INTNX? Great functions. I see you using INTNX to count year "intervals".

If f=1 then date_3yrs_after_index = INTNX("year",date,3,"Same");
andreas_lds
Jade | Level 19

@PhilC: I think that the variable "f" is the requested result and not present in the data.

@ifti_ch2002 : Please specify clearly what you have and want you want, it is quite uncommon to present one dataset containing both at the same time. And you wrote "3)f+1 if the difference between the following date and index date is less than 3 years" - with "following date" the date of the observation is meant in which f is set to f+1? And "less than" seems to be wrong, because the difference between index-Date and current date in the fifth observation is three and not less than three.

 

Some ideas:

  • retain f, lastDate and indexDate
  • use by IdNumber and reset f and lastDate if first.IdNumber

 

ifti_ch2002
Obsidian | Level 7

@andreas_lds sorry for that. I should be more specifically define what i have and what i need.

f column is what i required.

Following date i mean where the value of f+1 will be.

And i mean less than or euql to 3 years

PhilC
Rhodochrosite | Level 12
&gt;&gt;@PhilC: I think that the variable "f" is the requested result and not present in the data

Indeed, but once f is set to one.

@ifti ? What do think about the Do Whitlock loop? Was that helpful? If so, how have you tried using it?

Do you program primarily using data steps or by SQL? Andreas' point of having a "have" and "want" data set helps us when you show us what you have coded. Give us your latest try at it.
Tom
Super User Tom
Super User

@ifti_ch2002 wrote:

...

1) f=1 if the difference between the current date and lag_date (previous) is greater than 5 years

2) f=1 will be the index date

3)f+1 if the difference between the following date and index date is less than 3 years


To check the first thing you need to get the change in DATE between observations.  LAG() will help.

For the second one I am going jump to the conclusion that you meant to say that when there is a jump of over 5 years you want to set F to 1 AND also remember what DATE was at that time.

For the third one I am going to jump to another conclusion that you want to keep incrementing F as long as the difference from the saved index date is <= 3 years.  AND that when it is not then you set F to missing.

 

So let's setup your sample data with your expected values as WANT (instead of F).

data have;
  infile datalines truncover;
  input IDnumber date  :mmddyy10. want;
  format date yymmdd10.;
datalines;
1 4-7-2003 .
1 4-7-2009 1
1 4-7-2010 2
1 4-7-2011 3
1 4-7-2012 4
1 4-7-2013 .
1 4-7-2014 .
1 4-7-2015 .
1 4-7-2016 .
1 4-7-2018 .
2 4-7-2005 .
2 4-7-2011 1
2 4-7-2012 2
2 4-7-2013 3
2 4-7-2014 4
2 4-7-2015 .
2 4-7-2016 .
2 4-7-2017 .
2 4-7-2018 .
2 4-7-2019 .
2 4-7-2020 .
;

So it is pretty straight forward to code once the rules are clear. Remember to call the LAG() function on every observation so it sees every value.  But don't use the resuliting difference when you are starting a new group.


data want;
  set have;
  by idnumber date ;
  yrdif = intck('year',lag(date),date,'cont');
  if first.idnumber then call missing(yrdif,index_date,f);
  if f then do;
    if intck('year',index_date,date,'cont') <= 3 then f+1;
    else f=.;
  end;
  else if yrdif >= 5 then do;
    index_date=date;
    f=1;
  end;
  format index_date yymmdd10.;
  retain index_date f;
run;

Results:

Obs    IDnumber          date    want    yrdif    index_date    f

  1        1       2003-04-07      .       .               .    .
  2        1       2009-04-07      1       6      2009-04-07    1
  3        1       2010-04-07      2       1      2009-04-07    2
  4        1       2011-04-07      3       1      2009-04-07    3
  5        1       2012-04-07      4       1      2009-04-07    4
  6        1       2013-04-07      .       1      2009-04-07    .
  7        1       2014-04-07      .       1      2009-04-07    .
  8        1       2015-04-07      .       1      2009-04-07    .
  9        1       2016-04-07      .       1      2009-04-07    .
 10        1       2018-04-07      .       2      2009-04-07    .
 11        2       2005-04-07      .       .               .    .
 12        2       2011-04-07      1       6      2011-04-07    1
 13        2       2012-04-07      2       1      2011-04-07    2
 14        2       2013-04-07      3       1      2011-04-07    3
 15        2       2014-04-07      4       1      2011-04-07    4
 16        2       2015-04-07      .       1      2011-04-07    .
 17        2       2016-04-07      .       1      2011-04-07    .
 18        2       2017-04-07      .       1      2011-04-07    .
 19        2       2018-04-07      .       1      2011-04-07    .
 20        2       2019-04-07      .       1      2011-04-07    .
 21        2       2020-04-07      .       1      2011-04-07    .
ifti_ch2002
Obsidian | Level 7
Thanks a lot, i really appriciate.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1446 views
  • 1 like
  • 4 in conversation