Hi everyone,
I'm a bit stumped as to how to approach this data manipulation problem, and I'm hoping the community can help. This is my first time posting, so my apologies if I'm doing this incorrectly. I currently have a dataset that looks like this:
group | date |
A | 2/12/2018 |
A | 7/18/2018 |
A | 8/23/2018 |
A | 10/4/2018 |
B | 5/10/2018 |
B | 9/4/2018 |
B | 12/18/2018 |
C | 3/19/2018 |
C | 6/2/2018 |
D | 3/12/2019 |
D | 4/25/2019 |
D | 5/23/2019 |
D | 6/19/2019 |
My goal is to take each date within a group and the subsequent date that follows to create two new "start" and "end" variables that indicate a date range. Here is my desired output:
group | start | end |
A | 2/12/2018 | 7/18/2018 |
A | 7/18/2018 | 8/23/2018 |
A | 8/23/2018 | 10/4/2018 |
B | 5/10/2018 | 9/4/2018 |
B | 9/4/2018 | 12/18/2018 |
C | 3/19/2018 | 6/2/2018 |
D | 3/12/2019 | 4/25/2019 |
D | 4/25/2019 | 5/23/2019 |
D | 5/23/2019 | 6/19/2019 |
Unfortunately, I do not have SAS code that I've tried, as I do not even know how to start to go about doing this. 😞 Any help is much appreciated.
data want;
set have;
by group; /* this lets you use first. and last. */
format start end mmddyy10.;
start = lag(date);
if not first.group
then do;
end = date;
output; /* first observation of a group will not be output */
end;
drop date;
run;
It usually helps if you post data as text rather than images. If we want to work with your data we'd have to type it out.
There's a somewhat neat trick you can use to get this (untested) because I'm too lazy to type out your data.
Merge BY group but set the first observation to be 2, so it gets the second row merged with the first.
Let me know how this works:
data want;
merge have (rename=date=start_date)
have(firstobs=2 rename=date=end_date);
by group;
if last.group then delete;
run;
@sasburger wrote:
Hi everyone,
I'm a bit stumped as to how to approach this data manipulation problem, and I'm hoping the community can help. This is my first time posting, so my apologies if I'm doing this incorrectly. I currently have a dataset that looks like this:
My goal is to take each date within a group and the subsequent date that follows to create two new "start" and "end" variables that indicate a date range. Here is my desired output:
Unfortunately, I do not have SAS code that I've tried, as I do not even know how to start to go about doing this. 😞 Any help is much appreciated.
data want;
set have;
by group; /* this lets you use first. and last. */
format start end mmddyy10.;
start = lag(date);
if not first.group
then do;
end = date;
output; /* first observation of a group will not be output */
end;
drop date;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.