BookmarkSubscribeRSS Feed
zmalone91
Calcite | Level 5

I am using the lag function to identify when the same MemberID has a new AdmitID (002) and creating "Ignore" flag on that line. This part works for the first instance of the new AdmitID but I need all lines that are AdmitID = 002 to also say ignore as I am conditionally summing the data later. I tried adding a second if statement but have had no luck copying down the "ignore" where AdmitID = 002. Any help or suggestions for a better approach are appreciated/

 

data want; set have;
format maternity $char8.;

if MemberID = lag(MemberID)
and AdmitID ^= lag(AdmitID)
and dt_incurred >= lag(dt_adm)
and dt_incurred <= lag(dt_dischg) then Maternity = "Ignore";

 

if lag(Maternity) = "Ignore" and AdmitID=lag(AdmitID) then Maternity=lag(Maternity);
run;

 

This is what I am getting:

MemberIDAdmitIDMaternity
AAA001 
AAA001 
AAA001 
AAA002Ignore
AAA002 
AAA002 

 

but this is what I am after:

 

MemberIDAdmitIDMaternity
AAA001 
AAA001 
AAA001 
AAA002Ignore
AAA002Ignore
AAA002Ignore

 

5 REPLIES 5
ChrisNZ
Tourmaline | Level 20

Like this?

data WANT;
  set HAVE;
  format MATERNITY $char8.;
  if MEMBERID     = lag(MEMBERID)
   & ADMITID     ^= lag(ADMITID)
 /*  & DT_INCURRED >= lag(DT_ADM)
   & DT_INCURRED <= lag(DT_DISCHG) */then do;
     IGNORE+1;
     MATERNITY = "Ignore";
   end;  
   if MEMBERID ne lag(MEMBERID) then IGNORE=0; 
   if IGNORE and ADMITID = lag(ADMITID)  then Maternity='Ignore';
run;

 

Alexandr
Fluorite | Level 6
Don't use LAG inside condition: ...then Maternity=lag(Maternity);...
LAG return value from previos LAG calling, not from previos observation.
Kurt_Bremser
Super User

"Conditional lag" immediately rings a very loud alarm bell.

The LAG() functions build a FIFO queue where a new element is added ONLY when the function is actually called, not when a data step iteration is executed or an observation is read or written.

The proper method for conditionally getting lagged values is to always (unconditionally) store the result of the function in a temporary variable (that you drop later), and use this variable in your conditional branch:

l_Maternity = lag(Maternity);
if l_Maternity = "Ignore" and AdmitID = lag(AdmitID) then Maternity = l_Maternity;
drop l_Maternity;

Using the function in a condition is not critical, as SAS does not optimize the conditions. In the above example, the lag(AdmitID) will be executed even if l_Maternity is not equal to "Ignore".

 

For the sake of efficiency (read: execution speed), it is always better to execute the function only once for each needed variable in a data step and use the temporary variables. Function calls are costly in terms of CPU cycles.

s_lassen
Meteorite | Level 14

You should not use LAG for this. Retain the Maternity variable instead, much easier:

data want; 
  set have;
  by MemberID AdmitID notsorted;
  format maternity $char8.;
  retain maternity;
  if not first.MemberID and first.AdmitID
    and dt_incurred >= lag(dt_adm)
    and dt_incurred <= lag(dt_dischg) then Maternity = "Ignore";
  else if first.AdmitID then Maternity=' ';
run;

Instead of checking the ID variables with LAG, just use BY and FIRST. I put in a NOTSORTED option on the by statement in case the data is actually sorted by something else. 

 

The code looks a bit ineffective, as we check twice for the same condition (first.AdmitID), but as others have noted, it is important to make sure that the LAG function is called for every single observation.

 

 

 

 

Tom
Super User Tom
Super User

You don't need to use LAG.  Use BY group processing and RETAIN instead.

data have;
  input MemberID $ AdmitID $ ;
cards;
AAA 001  
AAA 001  
AAA 001  
AAA 002
AAA 002  
AAA 002
;

data want;
  set have ;
  by memberid admitid;
  length Maternity $8 ;
  if first.memberid then Maternity=' ';
  else if first.admitid then Maternity='Ignore';
  retain Maternity;
run;
       Member    Admit
Obs      ID       ID      Maternity

 1      AAA       001
 2      AAA       001
 3      AAA       001
 4      AAA       002      Ignore
 5      AAA       002      Ignore
 6      AAA       002      Ignore

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1474 views
  • 2 likes
  • 6 in conversation