BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cdubs
Quartz | Level 8

HI all,

 

I currently have a dataset that has

 

data have; 
input enrolid year enroll_start_dt enroll_end_dt; /* Don't know how to format the dates -- any advice?? */

datalines;
1234 2010 01-JAN-2010 31-DEC-2010 
1234 2011 01-JAN-2011 7-SEP-2011
5678 2010 01-JAN-2010 8-AUG-2010
4829 2010 01-JAN-2010 31-DEC-2010
4829 2011 01-JAN-2011 31-DEC-2011
4829 2012 01-JAN-2012 31-DEC-2012
4829 2013 01-JAN-2013 31-DEC-2013
;

 

 

And I want to get this:

 

data want; 
input enrolid year enroll_start_dt enroll_end_dt overall_start overall_end overall_days;
datalines;
1234 2010 01-JAN-2010 31-DEC-2010 01-JAN-2010 7-SEP-2011 614
1234 2011 01-JAN-2011 7-SEP-2011 01-JAN-2010 7-SEP-2011 614
5678 2010 01-JAN-2010 8-AUG-2010 01-JAN-2010 8-AUG-2010 219
4829 2010 01-JAN-2010 31-DEC-2010 01-JAN-2010 31-DEC-2013 1095
4829 2011 01-JAN-2011 31-DEC-2011 01-JAN-2010 31-DEC-2013 1095 
4829 2012 01-JAN-2012 31-DEC-2012 01-JAN-2010 31-DEC-2013 1095
4829 2013 01-JAN-2013 31-DEC-2013 01-JAN-2010 31-DEC-2013 1095
;

But with the current code I'm running I'm getting something funky ... that looks almost like the above but I am not able to retain all of the original observations corresponding to each unique enrolid - enroll_start_dt - enroll_end_dt combination...

  

My current code is as follows:

 

data want;
			set have;
			by enrolid enroll_start_dt enroll_end_dt;
			retain enroll_start_dt enroll_end_dt overall_start overall_end overall_days;
			format overall_start overall_end date11.;
			if first.enrolid and first.enroll_start_dt and first.enroll_end_dt then do;
				overall_start=enroll_start_dt;
				overall_end=enroll_end_dt;
				overall_days=overall_end-overall_start;
			end;
			
				else if enroll_start_dt>overall_end+1 then do;
				output;
				overall_start=enroll_start_dt;
				overall_end=enroll_end_dt;
				overall_days=overall_end-overall_start;
			end;

else if enroll_start_dt<=overall_end+1 then do;
overall_start=min(overall_start,enroll_start_dt);
overall_end = max(overall_end, enroll_end_dt);
overall_days=overall_end-overall_start;
end; if last.enrolid then output; run;

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@cdubs

Here a solution approach using SAS SQL flavor:

data have;
  input enrolid year (enroll_start_dt enroll_end_dt) (:date11.); 
  format enroll_start_dt enroll_end_dt date11.;
  datalines;
1234 2010 01-JAN-2010 31-DEC-2010 
1234 2011 01-JAN-2011 7-SEP-2011
5678 2010 01-JAN-2010 8-AUG-2010
4829 2010 01-JAN-2010 31-DEC-2010
4829 2011 01-JAN-2011 31-DEC-2011
4829 2012 01-JAN-2012 31-DEC-2012
4829 2013 01-JAN-2013 31-DEC-2013
;
run;

proc sql;
  create table want as
    select 
      *
      ,min(enroll_start_dt) as overall_start format=date11.
      ,max(enroll_end_dt)   as overall_end   format=date11.
      ,(calculated overall_end - calculated overall_start) as overall_days
    from 
      have
    group by enrolid
  ;
run;

 

View solution in original post

2 REPLIES 2
novinosrin
Tourmaline | Level 20

data have;
input enrolid year (enroll_start_dt enroll_end_dt) (:date9.); 
format enroll_start_dt enroll_end_dt date9.;
datalines;
1234 2010 01-JAN-2010 31-DEC-2010
1234 2011 01-JAN-2011 7-SEP-2011
5678 2010 01-JAN-2010 8-AUG-2010
4829 2010 01-JAN-2010 31-DEC-2010
4829 2011 01-JAN-2011 31-DEC-2011
4829 2012 01-JAN-2012 31-DEC-2012
4829 2013 01-JAN-2013 31-DEC-2013
;

data want;
do until(last.enrolid);
set have;
by enrolid notsorted;
if first.enrolid then overall_start =enroll_start_dt;
if last.enrolid then do;
overall_end =enroll_end_dt;
overall_days=intck('days', overall_start,overall_end);
end;
end;
do until(last.enrolid);
set have;
by enrolid notsorted;
output;
end;
format overall_start overall_end date9.;
run;

Patrick
Opal | Level 21

@cdubs

Here a solution approach using SAS SQL flavor:

data have;
  input enrolid year (enroll_start_dt enroll_end_dt) (:date11.); 
  format enroll_start_dt enroll_end_dt date11.;
  datalines;
1234 2010 01-JAN-2010 31-DEC-2010 
1234 2011 01-JAN-2011 7-SEP-2011
5678 2010 01-JAN-2010 8-AUG-2010
4829 2010 01-JAN-2010 31-DEC-2010
4829 2011 01-JAN-2011 31-DEC-2011
4829 2012 01-JAN-2012 31-DEC-2012
4829 2013 01-JAN-2013 31-DEC-2013
;
run;

proc sql;
  create table want as
    select 
      *
      ,min(enroll_start_dt) as overall_start format=date11.
      ,max(enroll_end_dt)   as overall_end   format=date11.
      ,(calculated overall_end - calculated overall_start) as overall_days
    from 
      have
    group by enrolid
  ;
run;

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1163 views
  • 1 like
  • 3 in conversation