Hi All,
I have the below scenario,
Input:
ID | utp | start date | end date |
123 | 0 | 1/1/2021 | 31/1/2021 |
123 | 0 | 1/2/2021 | 10/3/2021 |
123 | 0 | 11/3/2021 | 30/4/2021 |
123 | 1 | 1/5/2021 | 24/5/2021 |
123 | 1 | 25/5/2021 | 4/6/2021 |
123 | 1 | 5/6/2021 | 20/6/2021 |
123 | 0 | 21/6/2021 | 1/7/2021 |
123 | 0 | 2/7/2021 | 9/7/2021 |
123 | 0 | 10/7/2021 | 21/7/2021 |
123 | 0 | 22/7/2021 |
14/8/2021 |
Output:
ID | utp | start date | end date |
123 | 0 | 1/1/2021 | 30/4/2021 |
123 | 1 | 1/5/2021 | 20/6/2021 |
123 | 0 | 21/6/2021 | 14/8/2021 |
Can anyone please suggest an approach to retreive the above output based on the prescribed input?
Any suggestions are highly appreciable!!
Thanks in advance!!
Use the NOTSORTED option on the BY statement.
data want;
set have;
retain start;
by id utp NOTSORTED;
if first.utp then Start = start_date;
if last.utp then do;
end = end_date;
output;
end;
format start end date9.;
drop start_date end_date;
run;
Please explain the logic that allows you to derive the output data from the input data.
So you want the minimum of the start_date and the maximum of the end_date for each grouping of the variables ID and UTP?
Or do you want always the first row start_date and the last row end_date for each grouping of the variables ID and UTP?
Use the NOTSORTED option on the BY statement.
data want;
set have;
retain start;
by id utp NOTSORTED;
if first.utp then Start = start_date;
if last.utp then do;
end = end_date;
output;
end;
format start end date9.;
drop start_date end_date;
run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.