Hi, I have a data like the table below (also attached excel data). I need to create groupID by certain rules:
1. if locationID2 for previous row=locationID1 for the current row and if (intime for current row - outtime for previous <=1 ), then the rows are still put into the same group, e.g., groupID=1
2. Even if locationID2 for previous row not equal to locationID1 for the current row, as long as (intime for current row - outtime for previous <=1 ), then the rows are still put into the same group, e.g., groupID=2
3. Sometimes, there may be more than 2 cases in one group, e.g., groupID=3
4. Within the same person, groupID need to be in an increasing order
5. All other rows, groupID are assigned to be 0
Thanks for your help.
person | intime | locationID | outtime | locationID2 | groupID |
1 | 2/17/2009 | 1 | 2/25/2009 | 0 | 0 |
1 | 3/14/2009 | 1 | 3/26/2009 | 0 | 0 |
1 | 5/21/2009 | 1 | 5/25/2009 | 0 | 0 |
1 | 6/6/2009 | 1 | 6/10/2009 | 0 | 0 |
1 | 6/19/2009 | 1 | 6/22/2009 | 4 | 1 |
1 | 6/22/2009 | 4 | 7/3/2009 | 0 | 1 |
1 | 7/11/2009 | 1 | 7/12/2009 | 6 | 2 |
1 | 7/12/2009 | 7 | 7/17/2009 | 0 | 2 |
1 | 8/5/2009 | 1 | 8/10/2009 | 0 | 0 |
1 | 5/6/2011 | 1 | 5/12/2011 | 0 | 0 |
1 | 8/10/2011 | 1 | 8/13/2011 | 3 | 3 |
1 | 8/13/2011 | 3 | 8/15/2011 | 7 | 3 |
1 | 8/15/2011 | 7 | 8/24/2011 | 0 | 3 |
Very interesting question, quite a few mind-twisted turns, the following code seems to work for your presented data:
data have;
input person intime :mmddyy10. locationID outtime :mmddyy10. locationID2 ;
format intime outtime mmddyy10.;
cards;
1 2/17/2009 1 2/25/2009 0
1 3/14/2009 1 3/26/2009 0
1 5/21/2009 1 5/25/2009 0
1 6/6/2009 1 6/10/2009 0
1 6/19/2009 1 6/22/2009 4
1 6/22/2009 4 7/3/2009 0
1 7/11/2009 1 7/12/2009 6
1 7/12/2009 7 7/17/2009 0
1 8/5/2009 1 8/10/2009 0
1 5/6/2011 1 5/12/2011 0
1 8/10/2011 1 8/13/2011 3
1 8/13/2011 3 8/15/2011 7
1 8/15/2011 7 8/24/2011 0
;
data want;
set have;by person;
set have(firstobs=2 rename=(intime=_intime locationid=_locationid)
keep=locationid intime)
have(obs=1 drop=_all_);
retain group 0;
if first.person then _group=0;
if last.person then call missing(of _:);
_a=(_intime-outtime <= 1 and _locationid - locationid2 <=1);
_b=(intime-lag(outtime) <=1 and locationid-lag(locationid2) <=1);
if _a and not _b then do;_group+1; group=_group;end;
if not _a and not _b then group=0;
drop _:;
run;
Haikuo
Very interesting question, quite a few mind-twisted turns, the following code seems to work for your presented data:
data have;
input person intime :mmddyy10. locationID outtime :mmddyy10. locationID2 ;
format intime outtime mmddyy10.;
cards;
1 2/17/2009 1 2/25/2009 0
1 3/14/2009 1 3/26/2009 0
1 5/21/2009 1 5/25/2009 0
1 6/6/2009 1 6/10/2009 0
1 6/19/2009 1 6/22/2009 4
1 6/22/2009 4 7/3/2009 0
1 7/11/2009 1 7/12/2009 6
1 7/12/2009 7 7/17/2009 0
1 8/5/2009 1 8/10/2009 0
1 5/6/2011 1 5/12/2011 0
1 8/10/2011 1 8/13/2011 3
1 8/13/2011 3 8/15/2011 7
1 8/15/2011 7 8/24/2011 0
;
data want;
set have;by person;
set have(firstobs=2 rename=(intime=_intime locationid=_locationid)
keep=locationid intime)
have(obs=1 drop=_all_);
retain group 0;
if first.person then _group=0;
if last.person then call missing(of _:);
_a=(_intime-outtime <= 1 and _locationid - locationid2 <=1);
_b=(intime-lag(outtime) <=1 and locationid-lag(locationid2) <=1);
if _a and not _b then do;_group+1; group=_group;end;
if not _a and not _b then group=0;
drop _:;
run;
Haikuo
Hi Hai Kuo,
Thank you so much, it works perfectly!
I really appreciate your help.
Have a nice day.
Huijuan
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.