BookmarkSubscribeRSS Feed
Manhort
Obsidian | Level 7

Hi everyone,

 

I have data here with 3 ids. I need to create a new variable I'm calling 'clusterID' that goes through the original ids and creates new clusters based on the dates. Date groupings that are 60 days apart should be made a new cluster and counted off. In this example there will be a total of 6 clusterIDs. Person 100 has 1, person 161 has 2, and person 222 has 3. The data is messy and dates are repeated (there are other variables I can't just abandon).

 

 

DATA have;
input id date mmddyy10.;
format date mmddyy10.;
CARDS;
100 09/08/2016
100 09/08/2016
100 09/12/2016
161 06/14/2017
161 06/14/2017
161 06/19/2017
161 06/19/2017
161 08/14/2019
161 08/14/2019
161 08/22/2019
222 01/12/2016
222 05/10/2016
222 05/10/2016
222 05/10/2016
222 05/10/2016
222 03/03/2017
;
RUN;

DATA want;
input id date mmddyy10. clusterid;
format date mmddyy10.;
CARDS;
100 09/08/2016 1
100 09/08/2016 1
100 09/12/2016 1
161 06/14/2017 2
161 06/14/2017 2
161 06/19/2017 2
161 06/19/2017 2
161 08/14/2019 3
161 08/14/2019 3
161 08/22/2019 3
222 01/12/2016 4
222 05/10/2016 5
222 05/10/2016 5
222 05/10/2016 5
222 05/10/2016 5
222 03/03/2017 6
;
RUN;

 

 

What I've Tried:

Previously, someone helped me in another situation and gave me code that will separate the dates by 60days and keep just the first date, this is shown in the 'oldwant' table below. I'm unsure if this helps, but it does give start dates for clusters.

 

 

data oldwant (drop=_:);
  do until (_nxt_id^=id or date+60<_nxt_date);
    merge have
          have (firstobs=2 keep=id date rename=(id=_nxt_id date=_nxt_date));
    if _init_date=. then _init_date=date;
  end;
  date=_init_date;
run;

 

Manhort_0-1649195111862.png

 

 

Any help is appreciated thanks.

3 REPLIES 3
ChrisNZ
Tourmaline | Level 20

Like this?

data WANT;
  set HAVE;
  by ID DATE;
  if ID=lag(ID) and DATE > lag(DATE)+60 then GROUP+1; 
  else if first.ID then GROUP+1;
run;
 
ID DATE GROUP
100 08SEP2016 1
100 08SEP2016 1
100 12SEP2016 1
161 14JUN2017 2
161 14JUN2017 2
161 19JUN2017 2
161 19JUN2017 2
161 14AUG2019 3
161 14AUG2019 3
161 22AUG2019 3
222 12JAN2016 4
222 10MAY2016 5
222 10MAY2016 5
222 10MAY2016 5
222 10MAY2016 5
222 03MAR2017 6

 

Ksharp
Super User
DATA have;
input id date mmddyy10.;
format date mmddyy10.;
CARDS;
100 09/08/2016
100 09/08/2016
100 09/12/2016
161 06/14/2017
161 06/14/2017
161 06/19/2017
161 06/19/2017
161 08/14/2019
161 08/14/2019
161 08/22/2019
222 01/12/2016
222 05/10/2016
222 05/10/2016
222 05/10/2016
222 05/10/2016
222 03/03/2017
;
RUN;

data want;
 set have;
 by id;
 retain start;
 if first.id or date-start>60 then do;start=date;clusterid+1;end;
 drop start;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1216 views
  • 3 likes
  • 4 in conversation