BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nitink26
Obsidian | Level 7

 

Hi,

Please help -

In test data set I have id , dte_start and dte_end columns.

for ID 1- 

Customer Joined on 01JAN1961  and  left on 01JAN1962 and again joined on 01MAR1962. So here if the difference between dte_end (01JAN1962 )and latest  Dte_start (01MAR1962) is less than 90 days, so we keep Dte_start as initial Dte_start (01JAN1961). Dte_start only change if latest dte_start and previous Dte_end difference is more than 90 days.

DTE_end should be latest dte_end. For ID=1 DTE_end should be 31DEC9999.

OUTPUT FOR ID=1 -

id DTE_START DTE_END

1 01JAN1961 31DEC9999

 

data test;
input id : BEST8. DTE_START : DATE9. DTE_END : DATE9.;
FORMAT DTE_START DATE9. DTE_END DATE9.;
cards;
1 01JAN1961 01JAN1962
1 01MAR1962 01JAN1963
1 01FEB1963 31DEC9999
2 01JAN2001 01FEB2001
2 01MAR2001 31DEC9999
3 01JUL2011 30JUN2015
3 01NOV2015 31DEC9999
4 01MAY2016 01MAY2018
;
run;

 

Output I am looking for -

id DTE_START DTE_END
1 01JAN1961 31DEC9999
2 01JAN2001 31DEC9999
3 01NOV2015 31DEC9999
4 01MAY2016 01MAY2018

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

You want 1 record per id, having the last DTE_END, and the latest DTE_START following a gap of over 90 days.

 

data test;
input id : BEST8. DTE_START : DATE9. DTE_END : DATE9.;
FORMAT DTE_START DATE9. DTE_END DATE9.;
cards;
1 01JAN1961 01JAN1962
1 01MAR1962 01JAN1963
1 01FEB1963 31DEC9999
2 01JAN2001 01FEB2001
2 01MAR2001 31DEC9999
3 01JUL2011 30JUN2015
3 01NOV2015 31DEC9999
4 01MAY2016 01MAY2018
;
run;

data want (drop=_:);
  set test;
  by id;
  retain _strt;
  _strt=ifn(first.id=1 or dte_start-90>lag(dte_end),dte_start,_strt);
  if last.id;
  dte_start=_strt;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

4 REPLIES 4
Ksharp
Super User

Assuming I understood what you mean.

 

 

data test;
input id : BEST8. DTE_START : DATE9. DTE_END : DATE9.;
FORMAT DTE_START DATE9. DTE_END DATE9.;
cards;
1 01JAN1961 01JAN1962
1 01MAR1962 01JAN1963
1 01FEB1963 31DEC9999
2 01JAN2001 01FEB2001
2 01MAR2001 31DEC9999
3 01JUL2011 30JUN2015
3 01NOV2015 31DEC9999
4 01MAY2016 01MAY2018
;
run;

data temp;
 set test;
 by id;
 diff=DTE_START-lag(DTE_END) ;
 if first.id then diff=.;
run;
data temp2;
 set temp;
 by id;
 if first.id or diff>90 then group+1;
run;
proc sql;
create table want as
select id,min(DTE_START) as DTE_START format=date9.,max(DTE_END) as DTE_END format=date9.
 from (select * from temp2 group by id  having group=max(group))
  group by id;
quit;
yabwon
Amethyst | Level 16

According to data you provided I think id=3 should have 2 obs in result

data _null_;
  x = '30JUN2015'd - '01NOV2015'd;
  put x=;
run;

A simple approach would be something like that:

data test;
input id : BEST8. DTE_START : DATE9. DTE_END : DATE9.;
FORMAT DTE_START DATE9. DTE_END DATE9.;
cards;
1 01JAN1961 01JAN1962
1 01MAR1962 01JAN1963
1 01FEB1963 31DEC9999
2 01JAN2001 01FEB2001
2 01MAR2001 31DEC9999
3 01JUL2011 30JUN2015
3 01NOV2015 31DEC9999
4 01MAY2016 01MAY2018
;
run;

data want1;
  set test;
  if abs(DTE_START - lag(DTE_END)) >= 90  then grp + 1;
run;
proc print;
run;

data want2;
  set want1;
  by id grp;

  if first.grp then s = DTE_START;
  if last.grp then output;

  retain s;
  drop DTE_START;
  rename 
    s = DTE_START
  ;
  format s DATE9.;
run;
proc print;
run;

 

Bart

 

 

 

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



nitink26
Obsidian | Level 7

Thanks for the reply and you are right but If the difference is more than 90 days then dte_start I wanna pick should be the latest dte_start as I am looking only one record per ID.

so for ID =3 output will be  -

id DTE_START DTE_END

3 01NOV2015 31DEC9999

If you could let me know how to get that would be highly appreciated. Thanks a ton!

mkeintz
PROC Star

You want 1 record per id, having the last DTE_END, and the latest DTE_START following a gap of over 90 days.

 

data test;
input id : BEST8. DTE_START : DATE9. DTE_END : DATE9.;
FORMAT DTE_START DATE9. DTE_END DATE9.;
cards;
1 01JAN1961 01JAN1962
1 01MAR1962 01JAN1963
1 01FEB1963 31DEC9999
2 01JAN2001 01FEB2001
2 01MAR2001 31DEC9999
3 01JUL2011 30JUN2015
3 01NOV2015 31DEC9999
4 01MAY2016 01MAY2018
;
run;

data want (drop=_:);
  set test;
  by id;
  retain _strt;
  _strt=ifn(first.id=1 or dte_start-90>lag(dte_end),dte_start,_strt);
  if last.id;
  dte_start=_strt;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1268 views
  • 0 likes
  • 4 in conversation