Hi friends,
I'm running into a problem working with records that involve consecutive check in and check out dates. I have several lines per individual "stay." I want to select the first check in date and the last check out date and overwrite the rest of the dates for my analysis. I do not want to collapse/deduplicate the lines--I still want the same number of rows, but just the full date period. I have already written code to select the first check in, but can't figure out how to overwrite with the last check out.
Example:
LastName | FirstName | Check_in | Check_out |
Bieber | Justin | 14-Feb-13 | 22-Feb-13 |
Bieber | Justin | 14-Feb-13 | 16-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Spears | Britney | 29-Mar-13 | 5-Apr-13 |
Ocean | Frank | 17-May-13 | 28-May-13 |
West | Kanye | 24-Aug-13 | 25-Aug-13 |
West | Kanye | 24-Aug-13 | 26-Aug-13 |
West | Kanye | 24-Aug-13 | 29-Aug-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
This is what I want it to look like:
LastName | FirstName | Check_in | Check_out |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Spears | Britney | 29-Mar-13 | 5-Apr-13 |
Ocean | Frank | 17-May-13 | 28-May-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
Can anybody help me out??
Hopefully!
Whatever your 'grouping' variables are, add them to the GROUP statement.
That's a bit of a weird request but ok...
SQL works well here. Use the Select * to select all level records and then use the max/min to get the earliest/latest dates in your dataset.
proc sql;
create table want as
select *, min(check_in) as check_in_start, max(check_out) as check_out_end
from have
group by lastName, FirstName
order by lastName, FirstName, check_in;
quit;
Oh! This is really close to what I am looking for, however there is one problem (my apologies, I should have considered this before!).
If the same person has a new stay (involving new check in date), I would want to retain those visits as separate. Example:
LastName | FirstName | Check_in | Check_out |
Bieber | Justin | 14-Feb-13 | 22-Feb-13 |
Bieber | Justin | 14-Feb-13 | 16-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Dion | Celine | 29-Mar-13 | 5-Apr-13 |
Houston | Whitney | 17-May-13 | 28-May-13 |
Bieber | Justin | 24-Aug-13 | 25-Aug-13 |
Bieber | Justin | 24-Aug-13 | 26-Aug-13 |
Bieber | Justin | 24-Aug-13 | 29-Aug-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
To this:
LastName | FirstName | Check_in | Check_out |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Dion | Celine | 29-Mar-13 | 5-Apr-13 |
Houston | Whitney | 17-May-13 | 28-May-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
I should note that I also have unique id numbers for these individuals.
P.S. Thank you for your high tolerance for the obsurity of my request!
Whatever your 'grouping' variables are, add them to the GROUP statement.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.