Hi friends,
I'm running into a problem working with records that involve consecutive check in and check out dates. I have several lines per individual "stay." I want to select the first check in date and the last check out date and overwrite the rest of the dates for my analysis. I do not want to collapse/deduplicate the lines--I still want the same number of rows, but just the full date period. I have already written code to select the first check in, but can't figure out how to overwrite with the last check out.
Example:
LastName | FirstName | Check_in | Check_out |
Bieber | Justin | 14-Feb-13 | 22-Feb-13 |
Bieber | Justin | 14-Feb-13 | 16-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Spears | Britney | 29-Mar-13 | 5-Apr-13 |
Ocean | Frank | 17-May-13 | 28-May-13 |
West | Kanye | 24-Aug-13 | 25-Aug-13 |
West | Kanye | 24-Aug-13 | 26-Aug-13 |
West | Kanye | 24-Aug-13 | 29-Aug-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
This is what I want it to look like:
LastName | FirstName | Check_in | Check_out |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Spears | Britney | 29-Mar-13 | 5-Apr-13 |
Ocean | Frank | 17-May-13 | 28-May-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
West | Kanye | 24-Aug-13 | 30-Aug-13 |
Can anybody help me out??
Hopefully!
Whatever your 'grouping' variables are, add them to the GROUP statement.
That's a bit of a weird request but ok...
SQL works well here. Use the Select * to select all level records and then use the max/min to get the earliest/latest dates in your dataset.
proc sql;
create table want as
select *, min(check_in) as check_in_start, max(check_out) as check_out_end
from have
group by lastName, FirstName
order by lastName, FirstName, check_in;
quit;
Oh! This is really close to what I am looking for, however there is one problem (my apologies, I should have considered this before!).
If the same person has a new stay (involving new check in date), I would want to retain those visits as separate. Example:
LastName | FirstName | Check_in | Check_out |
Bieber | Justin | 14-Feb-13 | 22-Feb-13 |
Bieber | Justin | 14-Feb-13 | 16-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Dion | Celine | 29-Mar-13 | 5-Apr-13 |
Houston | Whitney | 17-May-13 | 28-May-13 |
Bieber | Justin | 24-Aug-13 | 25-Aug-13 |
Bieber | Justin | 24-Aug-13 | 26-Aug-13 |
Bieber | Justin | 24-Aug-13 | 29-Aug-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
To this:
LastName | FirstName | Check_in | Check_out |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Bieber | Justin | 14-Feb-13 | 21-Mar-13 |
Dion | Celine | 29-Mar-13 | 5-Apr-13 |
Houston | Whitney | 17-May-13 | 28-May-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
Bieber | Justin | 24-Aug-13 | 30-Aug-13 |
I should note that I also have unique id numbers for these individuals.
P.S. Thank you for your high tolerance for the obsurity of my request!
Whatever your 'grouping' variables are, add them to the GROUP statement.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.