Hello everyone,
I got stuck with the following query, can you please check and do the needful. Thanks in advance!
This is what I have
Subject | Category | Start | End |
101 | Period1 | 7-Aug-24 | 24-Aug-24 |
101 | Period2 | 12-Aug-24 | 14-Aug-24 |
101 | Period3 | 25-Aug-24 | 27-Nov-24 |
101 | Period2 | 27-Aug-24 | 27-Aug-24 |
101 | Period4 | 11-Sep-24 | 27-Nov-24 |
101 | Period2 | 20-Sep-24 | 20-Sep-24 |
101 | Period2 | 13-Oct-24 | 16-Oct-24 |
101 | Period2 | 20-Oct-24 | 22-Oct-24 |
101 | Period2 | 7-Nov-24 | 8-Nov-24 |
I want the result in the below fashion. Please do the needful
Subject | Category | Start | End | |
101 | Period1 | 7-Aug-24 | 11-Aug-24 | |
101 | Period2 | 12-Aug-24 | 14-Aug-24 | |
101 | Period1 | 15-Aug-24 | 24-Aug-24 | Since this continued from 7Aug till 24Aug in between Period 2 occurred |
101 | Period3 | 25-Aug-24 | 26-Aug-24 | |
101 | PEriod2 | 27-Aug-24 | 27-Aug-24 | |
101 | Period4 | 11-Sep-24 | 19-Sep-24 | |
101 | Period2 | 20-Sep-24 | 20-Sep-24 | |
101 | Period4 | 21-Sep-24 | 12-Oct-24 | |
101 | Period2 | 13-Oct-24 | 16-Oct-24 | |
101 | Period4 | 17-Oct-24 | 19-Oct-24 | |
101 | Period2 | 20-Oct-24 | 22-Oct-24 | |
101 | Period4 | 23-Oct-24 | 6-Nov-24 | |
101 | Period2 | 7-Nov-24 | 8-Nov-24 | |
101 | Period4 | 9-Nov-24 | 27-Nov-24 | Period4 started 11-sep and ended 27Nov |
There are too many scenarioes you need to take into account of.
Anyway, here could give you a start.
data have;
infile cards expandtabs truncover;
input Subject Category $ Start :date11. End :date11.;
format Start End :date11.;
cards;
101 Period1 7-Aug-24 24-Aug-24
101 Period2 12-Aug-24 14-Aug-24
101 Period3 25-Aug-24 27-Nov-24
101 Period2 27-Aug-24 27-Aug-24
101 Period4 11-Sep-24 27-Nov-24
101 Period2 20-Sep-24 20-Sep-24
101 Period2 13-Oct-24 16-Oct-24
101 Period2 20-Oct-24 22-Oct-24
101 Period2 7-Nov-24 8-Nov-24
;
data temp;
set have;
do date=start to end;
output;
end;
drop start end;
format date date11.;
run;
proc sort data=temp;
by Subject date;
run;
data temp2;
set temp;
by Subject date;
if last.date;
run;
data temp3;
set temp2;
by Subject category notsorted;
if dif(date) ne 1 or first.category then group+1;
run;
proc sql;
create table want as
select group,Subject,category,min(date) as start format=date11.,max(date) as end format=date11.
from temp3
group by group,Subject,category;
quit;
There are too many scenarioes you need to take into account of.
Anyway, here could give you a start.
data have;
infile cards expandtabs truncover;
input Subject Category $ Start :date11. End :date11.;
format Start End :date11.;
cards;
101 Period1 7-Aug-24 24-Aug-24
101 Period2 12-Aug-24 14-Aug-24
101 Period3 25-Aug-24 27-Nov-24
101 Period2 27-Aug-24 27-Aug-24
101 Period4 11-Sep-24 27-Nov-24
101 Period2 20-Sep-24 20-Sep-24
101 Period2 13-Oct-24 16-Oct-24
101 Period2 20-Oct-24 22-Oct-24
101 Period2 7-Nov-24 8-Nov-24
;
data temp;
set have;
do date=start to end;
output;
end;
drop start end;
format date date11.;
run;
proc sort data=temp;
by Subject date;
run;
data temp2;
set temp;
by Subject date;
if last.date;
run;
data temp3;
set temp2;
by Subject category notsorted;
if dif(date) ne 1 or first.category then group+1;
run;
proc sql;
create table want as
select group,Subject,category,min(date) as start format=date11.,max(date) as end format=date11.
from temp3
group by group,Subject,category;
quit;
Thanks a lot @Ksharp amazing. it worked, I need to add a few more components to get the data as desired. Have a Good one,. Thanks again.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.