Hello all,
I have a large, long dataset with dated events by ID number. I'd like to create an enumeration variable for each event called Event_Number. However, I'd like the enumeration variable to group dates that are within 180 days of each other.
I have struggled all day to try to figure out a way to do this with no luck. Any help?
In the sample data (code below) you can see that ID 01 has two events within 180 days, but they are currently numbered as two separate events. I'd like both events for ID 01 to be enumerated as the same event ("1").
The same issue occurs again for ID 22 for events 5 and 6. I'd like both events 5 and 6 to be counted as one event ("5").
I read a post on creating sample data but I'm not sure if I've done that correctly:
Create table events(ID varchar(6), date float, event_number float);
Insert into events(ID, date, event_number) Values('01', 05/11/1960, 1);
Insert into events(ID, date, event_number) Values('01', 06/30/1960, 2);
Insert into events(ID, date, event_number) Values('04', 07/31/1995, 1);
Insert into events(ID, date, event_number) Values('22', 11/26/1969, 1);
Insert into events(ID, date, event_number) Values('22', 07/06/1981, 2);
Insert into events(ID, date, event_number) Values('22', 07/14/1986, 3);
Insert into events(ID, date, event_number) Values('22', 03/25/1992, 4);
Insert into events(ID, date, event_number) Values('22', 03/25/1992, 4);
Insert into events(ID, date, event_number) Values('22', 09/02/1993, 5);
Insert into events(ID, date, event_number) Values('22', 09/15/1993, 6);
Thanks for your help!
data have;
infile cards dlm="',";
input ID $ date :mmddyy12. event_number;
format date mmddyy10.;
cards;
'01', 05/11/1960, 1
'01', 06/30/1960, 2
'04', 07/31/1995, 1
'22', 11/26/1969, 1
'22', 07/06/1981, 2
'22', 07/14/1986, 3
'22', 03/25/1992, 4
'22', 09/02/1993, 5
'22', 09/15/1993, 6
;
data want;
set have;
by id;
retain temp ;
if first.id then do;want=1;temp=date;end;
if date-temp>180 then do;want+1;temp=date;end;
drop temp;
run;
data have;
infile cards dlm="',";
input ID $ date :mmddyy12. event_number;
format date mmddyy10.;
cards;
'01', 05/11/1960, 1
'01', 06/30/1960, 2
'04', 07/31/1995, 1
'22', 11/26/1969, 1
'22', 07/06/1981, 2
'22', 07/14/1986, 3
'22', 03/25/1992, 4
'22', 09/02/1993, 5
'22', 09/15/1993, 6
;
data want;
set have;
by id;
retain temp ;
if first.id then do;want=1;temp=date;end;
if date-temp>180 then do;want+1;temp=date;end;
drop temp;
run;
Brilliant, Ksharp!
However, I don't understand why the "want+1" works correctly if the "want" variable is not retained. I think (incorrectly?) that "want" will only have a value of 1 for first.ID and otherwise will be set to missing. So how does "want" equal 2 when the formula (in my mind) is "missing (.) + 1"? It seems like the value of "want" is being retained from the prior record but this variable has not been listed in the "retain" statement.
Can anyone help me understand?
@samp945 wrote:
Brilliant, Ksharp!
However, I don't understand why the "want+1" works correctly if the "want" variable is not retained. I think (incorrectly?) that "want" will only have a value of 1 for first.ID and otherwise will be set to missing. So how does "want" equal 2 when the formula (in my mind) is "missing (.) + 1"? It seems like the value of "want" is being retained from the prior record but this variable has not been listed in the "retain" statement.
Can anyone help me understand?
Your question is very observant. The statement
WANT+1;
is a SAS "sum statement".
It is effectively a replacement for
RETAIN WANT 0:
WANT=SUM(WANT,1);
See Sum Statement
You will see frequent use of sum statements in many responses in this community.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Lock in the best rate now before the price increases on April 1.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.