Hi, I have a data like the table below (also attached excel data). I need to create groupID by certain rules:
1. if locationID2 for previous row=locationID1 for the current row and if (intime for current row - outtime for previous <=1 ), then the rows are still put into the same group, e.g., groupID=1
2. Even if locationID2 for previous row not equal to locationID1 for the current row, as long as (intime for current row - outtime for previous <=1 ), then the rows are still put into the same group, e.g., groupID=2
3. Sometimes, there may be more than 2 cases in one group, e.g., groupID=3
4. Within the same person, groupID need to be in an increasing order
5. All other rows, groupID are assigned to be 0
Thanks for your help.
person | intime | locationID | outtime | locationID2 | groupID |
1 | 2/17/2009 | 1 | 2/25/2009 | 0 | 0 |
1 | 3/14/2009 | 1 | 3/26/2009 | 0 | 0 |
1 | 5/21/2009 | 1 | 5/25/2009 | 0 | 0 |
1 | 6/6/2009 | 1 | 6/10/2009 | 0 | 0 |
1 | 6/19/2009 | 1 | 6/22/2009 | 4 | 1 |
1 | 6/22/2009 | 4 | 7/3/2009 | 0 | 1 |
1 | 7/11/2009 | 1 | 7/12/2009 | 6 | 2 |
1 | 7/12/2009 | 7 | 7/17/2009 | 0 | 2 |
1 | 8/5/2009 | 1 | 8/10/2009 | 0 | 0 |
1 | 5/6/2011 | 1 | 5/12/2011 | 0 | 0 |
1 | 8/10/2011 | 1 | 8/13/2011 | 3 | 3 |
1 | 8/13/2011 | 3 | 8/15/2011 | 7 | 3 |
1 | 8/15/2011 | 7 | 8/24/2011 | 0 | 3 |
Very interesting question, quite a few mind-twisted turns, the following code seems to work for your presented data:
data have;
input person intime :mmddyy10. locationID outtime :mmddyy10. locationID2 ;
format intime outtime mmddyy10.;
cards;
1 2/17/2009 1 2/25/2009 0
1 3/14/2009 1 3/26/2009 0
1 5/21/2009 1 5/25/2009 0
1 6/6/2009 1 6/10/2009 0
1 6/19/2009 1 6/22/2009 4
1 6/22/2009 4 7/3/2009 0
1 7/11/2009 1 7/12/2009 6
1 7/12/2009 7 7/17/2009 0
1 8/5/2009 1 8/10/2009 0
1 5/6/2011 1 5/12/2011 0
1 8/10/2011 1 8/13/2011 3
1 8/13/2011 3 8/15/2011 7
1 8/15/2011 7 8/24/2011 0
;
data want;
set have;by person;
set have(firstobs=2 rename=(intime=_intime locationid=_locationid)
keep=locationid intime)
have(obs=1 drop=_all_);
retain group 0;
if first.person then _group=0;
if last.person then call missing(of _:);
_a=(_intime-outtime <= 1 and _locationid - locationid2 <=1);
_b=(intime-lag(outtime) <=1 and locationid-lag(locationid2) <=1);
if _a and not _b then do;_group+1; group=_group;end;
if not _a and not _b then group=0;
drop _:;
run;
Haikuo
Very interesting question, quite a few mind-twisted turns, the following code seems to work for your presented data:
data have;
input person intime :mmddyy10. locationID outtime :mmddyy10. locationID2 ;
format intime outtime mmddyy10.;
cards;
1 2/17/2009 1 2/25/2009 0
1 3/14/2009 1 3/26/2009 0
1 5/21/2009 1 5/25/2009 0
1 6/6/2009 1 6/10/2009 0
1 6/19/2009 1 6/22/2009 4
1 6/22/2009 4 7/3/2009 0
1 7/11/2009 1 7/12/2009 6
1 7/12/2009 7 7/17/2009 0
1 8/5/2009 1 8/10/2009 0
1 5/6/2011 1 5/12/2011 0
1 8/10/2011 1 8/13/2011 3
1 8/13/2011 3 8/15/2011 7
1 8/15/2011 7 8/24/2011 0
;
data want;
set have;by person;
set have(firstobs=2 rename=(intime=_intime locationid=_locationid)
keep=locationid intime)
have(obs=1 drop=_all_);
retain group 0;
if first.person then _group=0;
if last.person then call missing(of _:);
_a=(_intime-outtime <= 1 and _locationid - locationid2 <=1);
_b=(intime-lag(outtime) <=1 and locationid-lag(locationid2) <=1);
if _a and not _b then do;_group+1; group=_group;end;
if not _a and not _b then group=0;
drop _:;
run;
Haikuo
Hi Hai Kuo,
Thank you so much, it works perfectly!
I really appreciate your help.
Have a nice day.
Huijuan
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.