Hi friends,
I'm running into a problem working with records that involve consecutive check in and check out dates. I have several lines per individual "stay." I want to select the first check in date and the last check out date and overwrite the rest of the dates for my analysis. I do not want to collapse/deduplicate the lines--I still want the same number of rows, but just the full date period. I have already written code to select the first check in, but can't figure out how to overwrite with the last check out.
Example:
LastName | FirstName | Check_in | Check_out |
Bieber | Justin | 14-Feb-13 | 22-Feb-13 |
Bieber | Justin | 14-Feb-13 | 16-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Spears | Britney | 29-Mar-13 | 5-Apr-13 |
Ocean | Frank | 17-May-13 | 28-May-13 |
West | Kanye | 24-Aug-13 | 25-Aug-13 |
West | Kanye | 24-Aug-13 | 26-Aug-13 |
West | Kanye | 24-Aug-13 | 29-Aug-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
This is what I want it to look like:
LastName | FirstName | Check_in | Check_out |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Spears | Britney | 29-Mar-13 | 5-Apr-13 |
Ocean | Frank | 17-May-13 | 28-May-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
Can anybody help me out??
Hopefully!
Whatever your 'grouping' variables are, add them to the GROUP statement.
That's a bit of a weird request but ok...
SQL works well here. Use the Select * to select all level records and then use the max/min to get the earliest/latest dates in your dataset.
proc sql;
create table want as
select *, min(check_in) as check_in_start, max(check_out) as check_out_end
from have
group by lastName, FirstName
order by lastName, FirstName, check_in;
quit;
Oh! This is really close to what I am looking for, however there is one problem (my apologies, I should have considered this before!).
If the same person has a new stay (involving new check in date), I would want to retain those visits as separate. Example:
LastName | FirstName | Check_in | Check_out |
Bieber | Justin | 14-Feb-13 | 22-Feb-13 |
Bieber | Justin | 14-Feb-13 | 16-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Dion | Celine | 29-Mar-13 | 5-Apr-13 |
Houston | Whitney | 17-May-13 | 28-May-13 |
Bieber | Justin | 24-Aug-13 | 25-Aug-13 |
Bieber | Justin | 24-Aug-13 | 26-Aug-13 |
Bieber | Justin | 24-Aug-13 | 29-Aug-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
To this:
LastName | FirstName | Check_in | Check_out |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Dion | Celine | 29-Mar-13 | 5-Apr-13 |
Houston | Whitney | 17-May-13 | 28-May-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
I should note that I also have unique id numbers for these individuals.
P.S. Thank you for your high tolerance for the obsurity of my request!
Whatever your 'grouping' variables are, add them to the GROUP statement.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.