BookmarkSubscribeRSS Feed
greg6363
Obsidian | Level 7

I am creating summary analysis for records with a particular event denoted as 'X1'.  I want create one data set with records that have consecutive X1 events as well as a second data set with records that have a X1 occurrence but in non-consecutive instances which would serve as a total record set that captures all X1 events.  I created three counter fields: the first for the record iteration, the second counter starts and stops with each X1 event and the third counter counts each X1 event but reverts to zero when no X1 event is present.  When I attempt to group the consecutive and non-consecutive events separately, the output makes no sense since the totals for the first instance of X1 (which should be the same whether they are consecutive or non-consecutive) are different between the two data sets.  Here is the code for the three counter fields:

 

/*Creating a counter field to track the number of records for each individual account*/
data Step2;
set Step1;
by accountnumber;
if first.accountnumber
then acct_int = 1;
else acct_int + 1; /*iterative statement*/
run;

 

/*Creating a second counter field that starts and stops with each X1 event - Consecutive*/
data Step3 (drop=_n:);
set Step2;
by accountnumber;
if first.accountnumber then _n=0;
_n+event_type='X1';
event_int_1=ifn(event_type='X1',_n,0);
run;

 

/*Creating a third counter field that resets to zero for each non-consecutive X1*/
data Step4 (rename=(_n=event_int_2));
set Step3;
by accountnumber;
if first.accountnumber then _n=0;
if event_type='Non-X1' then _n=0;
else _n +1;
run;

 

The output will have this layout:

 

accountnumber  event_type    acct_int  event_int_1  event_int_2

         001                Non-X1           1              0                   0

         001                     X1              2              1                   1

         001                Non-X1           3               1                  0

         001                     X1              4               2                  1

         001                     X1              5               3                  2

         002                     X1              1               1                  1

         002                     X1              2               2                  2

         002                Non-X1           3               2                  0

         002                     X1              4               3                  1

         003                Non-X1           1               0                  0     

         004                     X1              1               1                  1 

         005                     X1              1               1                  1

         005                Non-X1           2                1                 0

         006                Non-X1           1                0                 0

         006                Non-X1           2                0                 0

         006                    X1               3                1                 1

 

When I use the GROUP BY statement using the event_1 and event_2 variables in PROC SQL, the totals don't make sense.  Am I properly identifying the consecutive and non-consecutive records with my counters?  Any assistance would be greatly appreciated in this issue.  Thanks.

2 REPLIES 2
mkeintz
PROC Star

You can do this simply in a single data step, using what sas calls SUM statements, where the statement

   x+y;

means

  1. Sum the values of X and Y, and store the result in X.  (i.e. X=sum(x,y));
  2. Retain the resulting value for the next iteration of the data step.

For example, (untested):

data want;
  set have;
  by accountnumber ;
  if first.accountnumber then call missing (acct_int,event_int_1,event_int_2);

  acct_int+1;

  event_int_1+(event_type='X');

  if event_type^='X' then event_int_2=0;
  else event_int_2+1;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
greg6363
Obsidian | Level 7

I rewrote my code and created a data step to include a sum statement:

 

DATA master;
INPUT accountnumber event_type $ acct_int event_int_1 event_int_2 ;
DATALINES ;
1001 Non-X1 1 0 0
1001 X1 2 1 1
1001 Non-X1 3 1 0
1001 X1 4 2 1
1001 X1 5 3 2
1002 X1 1 1 1
1002 X1 2 2 2
1002 Non-X1 3 2 0
1002 X1 4 3 1
1003 Non-X1 1 0 0
1004 X1 1 1 1
1005 X1 1 1 1
1005 Non-X1 2 1 0
1006 Non-X1 1 0 0
1006 Non-X1 2 0 0
1006 X1 3 1 1; RUN;

data step1;
set master;
X=sum(event_int_1,event_int_2);
run;

 

data step2;
set step1;
by accountnumber ;
if first.accountnumber then call missing (acct_int,event_int_1,event_int_2);

acct_int+1;

event_int_1+(event_type='X');

if event_type^='X' then event_int_2=0;
else event_int_2+1;
run;

 

Unfortunately, the 2 value for the 2nd iteration in the acct_int field has disappeared when using the acct_int+1 statement.  I don't want that field value to change.  Here is my output when producing the step2 data set:

 

accountnumber  event_type  acct_int  event_int_1  event_int_2  X

1001 Non-X1 1 0 0 0 
1001 X1 3 1 0 2 
1001 Non-X1 4 1 0 0 
1001 X1 5 2 0 3  
1001 X1 6 3 0 5 
1002 X1 1 0 0 2  
1002 X1 3 2 0 4 
1002 Non-X1 4 2 0 2
1002 X1 5 3 0 4 
1003 Non-X1 1 0 0 0  
1004 X1 1 0 0 2 
1005 X1 1 0 0 2  
1005 Non-X1 3 1 0 1 
1006 Non-X1 1 0 0 0 
1006 Non-X1 3 0 0 0 
1006 X1 4 1 0 2 

 

Let me know what I'm doing wrong.  Thanks.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 449 views
  • 0 likes
  • 2 in conversation