Okay, this is a fairly long one. I appreciate your help in advance. I usually have no problems with retain, but this is the first time I've tried it with more than one by-variable. I have a dataset of incidents, and am trying to identify repeated incidents among individuals. I have several variables which are used to identify unique individuals, so I need to retain information across these variables (uniqueid, ptdob, sex) in order to calculate the variable diff (which is the time difference between incidents). I create a variable base which is the first incident, and casenum is the counter for all incidents for that person. data new; set old; format base mmddyy8.; by uniqueid ptdob sex admitdte; retain base old_uniqueid old_ptdob old_sex casenum; if (first.uniqueid=1 and first.ptdob=1 and first.admitdte=1 and first.sex=1) then do; old_uniqueid=uniqueid; old_ptdob=ptdob; old_sex=sex; base=admitdte; casenum=0; end; if (uniqueid=old_uniqueid and ptdob=old_ptdob and sex=old_sex) then do; diff=admitdte-base; base=admitdte; casenum=casenum+1; end; run; proc print; var uniqueid ptdob sex old_uniqueid ptdob old_sex base admitdte diff casenum; run; Some of the printed output: Obs UNIQUEID PTDOB SEX old_uniqueid old_ptdob old_sex base ADMITDTE diff casenum Diff and casenum okay here, where there are multiple incidents for one individual: 16 000CDC785C4353DB15AC 19531105 2 000CDC785C4353DB15AC 19531105 2 02/04/05 02/04/05 0 1 17 000CDC785C4353DB15AC 19531105 2 000CDC785C4353DB15AC 19531105 2 03/02/06 03/02/06 391 2 18 000CDC785C4353DB15AC 19531105 2 000CDC785C4353DB15AC 19531105 2 04/02/06 04/02/06 31 3 19 000CDC785C4353DB15AC 19531105 2 000CDC785C4353DB15AC 19531105 2 07/21/06 07/21/06 110 4 And here, where there is only one incident per person (3 different people here) 22 00122E6ADA905AB34ABC 19760807 2 00122E6ADA905AB34ABC 19760807 2 09/18/05 09/18/05 0 1 23 0013ADECDDB3712DB941 19660205 2 0013ADECDDB3712DB941 19660205 2 08/10/06 08/10/06 0 1 24 0013C9B92BCCEE96FED7 19301005 2 0013C9B92BCCEE96FED7 19301005 2 01/22/06 01/22/06 0 1 But not here, where individuals share one of the 4 variables (uniqueid) in the by-variable but not others... 74 0044634429982447D498 19750606 1 0044634429982447D498 19700128 1 09/25/06 05/13/05 . 1 75 0044634429982447D498 19970326 2 0044634429982447D498 19700128 1 09/25/06 02/22/05 . 1 76 0044634429982447D498 19990117 2 0044634429982447D498 19700128 1 09/25/06 11/18/05 . 1 77 0044634429982447D498 19990117 2 0044634429982447D498 19700128 1 09/25/06 12/18/06 . 1 78 0044634429982447D498 20010523 2 0044634429982447D498 19700128 1 09/25/06 09/05/06 . 1 79 0044634429982447D498 20020319 1 0044634429982447D498 19700128 1 09/25/06 02/14/05 . 1 I think I understand why diff would be . in the first iteration - since it is calculated from the retained value of base. But since they are different people, I don't get why the processing is ocurring as if they were the same person. My problem is that I need to cover ALL of the following scenarios: 1) the first instance of the identifier (first.uniqueid, first.ptdob, first.admitdte, first.sex). Taken care of with the code above. 2) if first and last instance of all 4 identifying variables (i.e., only one incident per person). Also taken care of above, could add in some specific code if needed like: if (first.uniqueid and first.ptdob and first.admitdte and first.sex and last.uniqueid and last.ptdob and last.admitdte and last.sex) then do; 3) if first instance of only one of the identifying variables, but not the first instance of the others >>e.g., if first.uniqueid=1 and first.ptdob=0, etc. ??? Scenario 3 is my problem. I've tried all sorts of combinations of first. and last. processing with 1 and 0, and am still not getting the right values for diff and casenum. Thanks! RJ if (first.uniqueid and first.ptdob and first.admitdte and first.sex and last.uniqueid and last.ptdob and last.admitdte and last.sex) then do;
... View more