BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
twildone
Pyrite | Level 9

Hi....I am trying to calculate the Minimum Start and Maximum End dates for each ID and Program. Because each course can be taken in different years, the group for the minimum and maximum is the same Year and if the Years are consecutive. Once there is a missing Year for the Program, then the minimum and maximum is calculates as a new group. What is the best way to set this up. Thanks. 

 

data Have;
length ID $3 Year $ 4 Program $ 15 Course $ 5 Start $ 10 End $ 10;
   infile datalines delimiter='' missover dsd; 
   input ID $ Year : $char4. Program : $char15. Course : $char5. Start : $char10. End : $char10.;
   datalines; 
100 2016 AutoMechanics A 2016-11-03 2017-04-20
100 2018 AutoMechanics B 2018-09-07 2019-04-20
100 2018 AutoMechanics C 2018-10-15 2019-06-25
100 2019 AutoMechanics D 2019-10-17 2020-01-20
100 2020 AutoMechanics E 2020-08-27 2020-09-15
;
run;

Want:

ID  Year Program       Course     Start      End       Min_Start    Max_End
100 2016 AutoMechanics   A     2016-11-03 2017-04-20  2016-11-03  2017-04-20
100 2018 AutoMechanics   B     2018-09-07 2019-04-20  2018-09-07  2020-09-15
100 2018 AutoMechanics   C     2018-10-15 2019-06-25  2018-09-07  2020-09-15
100 2019 AutoMechanics   D     2019-10-17 2020-01-20  2018-09-07  2020-09-15
100 2020 AutoMechanics   E     2020-08-27 2020-09-15  2018-09-07  2020-09-15
1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

You say "missing year" but do not show any "missing" values for year. Do you mean a year out of sequence?

 

This would likely be a lot easier if the year and dates were actually numeric and dates an not character values as then you could use something like Proc Summary with the dates after inserting a grouping variable.

 

This duplicates your desired output.

data Have;
length ID $3 Year $ 4 Program $ 15 Course $ 5 Start $ 10 End $ 10;
   infile datalines delimiter='' missover dsd; 
   input ID $ Year : $char4. Program : $char15. Course : $char5. Start : $char10. End : $char10.;
   datalines; 
100 2016 AutoMechanics A 2016-11-03 2017-04-20
100 2018 AutoMechanics B 2018-09-07 2019-04-20
100 2018 AutoMechanics C 2018-10-15 2019-06-25
100 2019 AutoMechanics D 2019-10-17 2020-01-20
100 2020 AutoMechanics E 2020-08-27 2020-09-15
;
run;

data need;
   set have;
   by id year;
   Retain ordergroup;
   yeardif = dif(year);
   if first.id then ordergroup=1;
   if yeardif>1 then ordergroup+1;
   drop yeardif;
run;

proc sql;
   create table want as
   select a.*,b.min_start,b.max_end
               
   from need as a left join
        (select id,ordergroup,min(start) as min_start, max(end) as max_end
                from need
                group by id, ordergroup
        ) as b
        on a.id=b.id
        and a.ordergroup=b.ordergroup
   ;
quit;

Examples with only one set of "groups" such as Id and Program are poor examples as I do not know if ID is unique for Program or not.

Also, bad things come to people who insist on storing dates as character values. None of the SAS functions or formats that work with dates will work with characters meaning that much work may be needed to get actual dates before most types of valid comparisons, intervals or grouping is done.

View solution in original post

2 REPLIES 2
ballardw
Super User

You say "missing year" but do not show any "missing" values for year. Do you mean a year out of sequence?

 

This would likely be a lot easier if the year and dates were actually numeric and dates an not character values as then you could use something like Proc Summary with the dates after inserting a grouping variable.

 

This duplicates your desired output.

data Have;
length ID $3 Year $ 4 Program $ 15 Course $ 5 Start $ 10 End $ 10;
   infile datalines delimiter='' missover dsd; 
   input ID $ Year : $char4. Program : $char15. Course : $char5. Start : $char10. End : $char10.;
   datalines; 
100 2016 AutoMechanics A 2016-11-03 2017-04-20
100 2018 AutoMechanics B 2018-09-07 2019-04-20
100 2018 AutoMechanics C 2018-10-15 2019-06-25
100 2019 AutoMechanics D 2019-10-17 2020-01-20
100 2020 AutoMechanics E 2020-08-27 2020-09-15
;
run;

data need;
   set have;
   by id year;
   Retain ordergroup;
   yeardif = dif(year);
   if first.id then ordergroup=1;
   if yeardif>1 then ordergroup+1;
   drop yeardif;
run;

proc sql;
   create table want as
   select a.*,b.min_start,b.max_end
               
   from need as a left join
        (select id,ordergroup,min(start) as min_start, max(end) as max_end
                from need
                group by id, ordergroup
        ) as b
        on a.id=b.id
        and a.ordergroup=b.ordergroup
   ;
quit;

Examples with only one set of "groups" such as Id and Program are poor examples as I do not know if ID is unique for Program or not.

Also, bad things come to people who insist on storing dates as character values. None of the SAS functions or formats that work with dates will work with characters meaning that much work may be needed to get actual dates before most types of valid comparisons, intervals or grouping is done.

twildone
Pyrite | Level 9

Hi BallardW…….Thanks for your help and your solution. It works great. Just to let you know that in the actual dataset the dates are stored in the $ymmdd10. format. When I was creating the dummy dataset, I tried using the :yymmdd10. format on the Input statement and was getting an error message so when I changed the format to $char10. the error message was gone. But once again thank you so much for your help....Greatly Appreciated!!..

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 317 views
  • 0 likes
  • 2 in conversation