BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
samp945
Obsidian | Level 7

Hello all,

 

I have a large, long dataset with dated  events by ID number.  I'd like to create an enumeration variable for each event called Event_Number. However, I'd like the enumeration variable to group dates that are within 180 days of each other.

 

I have struggled all day to try to figure out a way to do this with no luck. Any help?

 

In the sample data (code below) you can see that ID 01 has two events within 180 days, but they are currently numbered as two separate events. I'd like both events for ID 01 to be enumerated as the same event ("1").

The same issue occurs again for ID 22 for events 5 and 6. I'd like both events 5 and 6 to be counted as one event ("5").

 

I read a post on creating sample data but I'm not sure if I've done that correctly:

Create table events(ID varchar(6), date float, event_number float);
Insert into events(ID, date, event_number) Values('01', 05/11/1960, 1);
Insert into events(ID, date, event_number) Values('01', 06/30/1960, 2);
Insert into events(ID, date, event_number) Values('04', 07/31/1995, 1);
Insert into events(ID, date, event_number) Values('22', 11/26/1969, 1);
Insert into events(ID, date, event_number) Values('22', 07/06/1981, 2);
Insert into events(ID, date, event_number) Values('22', 07/14/1986, 3);
Insert into events(ID, date, event_number) Values('22', 03/25/1992, 4);
Insert into events(ID, date, event_number) Values('22', 03/25/1992, 4);
Insert into events(ID, date, event_number) Values('22', 09/02/1993, 5);
Insert into events(ID, date, event_number) Values('22', 09/15/1993, 6);

Thanks for your help!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
infile cards dlm="',";
input ID $ date :mmddyy12. event_number;
format date mmddyy10.;
cards;
'01', 05/11/1960, 1
'01', 06/30/1960, 2
'04', 07/31/1995, 1
'22', 11/26/1969, 1
'22', 07/06/1981, 2
'22', 07/14/1986, 3
'22', 03/25/1992, 4
'22', 09/02/1993, 5
'22', 09/15/1993, 6
;

data want;
 set have;
 by id;
 retain temp ;
 if first.id then do;want=1;temp=date;end;
 if date-temp>180 then do;want+1;temp=date;end;
drop temp;
run;

View solution in original post

4 REPLIES 4
Ksharp
Super User
data have;
infile cards dlm="',";
input ID $ date :mmddyy12. event_number;
format date mmddyy10.;
cards;
'01', 05/11/1960, 1
'01', 06/30/1960, 2
'04', 07/31/1995, 1
'22', 11/26/1969, 1
'22', 07/06/1981, 2
'22', 07/14/1986, 3
'22', 03/25/1992, 4
'22', 09/02/1993, 5
'22', 09/15/1993, 6
;

data want;
 set have;
 by id;
 retain temp ;
 if first.id then do;want=1;temp=date;end;
 if date-temp>180 then do;want+1;temp=date;end;
drop temp;
run;
samp945
Obsidian | Level 7

Brilliant, Ksharp!

 

However, I don't understand why the "want+1" works correctly if the "want" variable is not retained. I think (incorrectly?) that "want" will only have a value of 1 for first.ID and otherwise will be set to missing. So how does "want" equal 2 when the formula (in my mind) is "missing (.) + 1"? It seems like the value of "want" is being retained from the prior record but this variable has not been listed in the "retain" statement.

 

Can anyone help me understand?

mkeintz
PROC Star

@samp945 wrote:

Brilliant, Ksharp!

 

However, I don't understand why the "want+1" works correctly if the "want" variable is not retained. I think (incorrectly?) that "want" will only have a value of 1 for first.ID and otherwise will be set to missing. So how does "want" equal 2 when the formula (in my mind) is "missing (.) + 1"? It seems like the value of "want" is being retained from the prior record but this variable has not been listed in the "retain" statement.

 

Can anyone help me understand?

 


Your question is very observant.  The statement   

 

WANT+1;

is a SAS "sum statement".

 

It is effectively a replacement for

RETAIN WANT 0:
WANT=SUM(WANT,1);

See Sum Statement 

 

You will see frequent use of sum statements in many responses in this community.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
Yes. variable WANT is retained due to SUM Statement "want+1".
As mkeintz pointed out , "want+1" already included statement "retain want 0" .

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 494 views
  • 3 likes
  • 3 in conversation