RETAIN statement help

New Contributor
Posts: 2

RETAIN statement help

Hi everyone,

 

This is my first time posting a question to the SAS community, so my apologies ahead of time if it's unclear. I'm having some difficulty with the RETAIN statement in my SAS program.

 

My dataset includes thousands of emergency department records of people bitten by animals, and some of the patients have returned for multiple rechecks for the same bite incident. I only want to count the first time a patient is seen for a recheck visit. It gets complicated because I can't simply look for the first time a recheck is recorded for a given patient - sometimes a patient comes back for two sets of rechecks (maybe they get bit by multiple animals over the course of many years).

 

Currently bitevisit is coded as 1 or 3 for the initial incident visit and 2 for recheck visit. I figured that I would create a new variable called rechecksamevisit to separate recheck visits.

 

I’m trying to assign a value to a new variable “rechecksamevisit” based on the value of another variable “bitevisit” in the PREVIOUS observation. If the previous observation shows a value of 2 for “bitevisit,” then I’ll know that it’s a subsequent recheck visit and will be disregarded later on. However, when I look at my new dataset, it seems to be setting the value of “rechecksamevisit” based on the value of “bitevisit” in the SAME observation. I’ve tried the coding a couple different ways, but no luck. Help!!!

 

data ed.bite_testrecheck;
      set ed.bite_byvisit;
      retain holdbitevisit;
      if bitevisit=2 then rechecksamevisit=1;
      if bitevisit=1 or bitevisit=3 then rechecksamevisit=0;
      holdbitevisit=bitevisit;
      output;
run;
 

 



All Replies
Grand Advisor
Posts: 17,346

Re: RETAIN statement help

Can you include some sample data to work with, and the corresponding output?

Esteemed Advisor
Posts: 6,661

Re: RETAIN statement help

You do not use holdbitevisit for anything, so it is useless.

To get values from the previous observation, you can also use the lag() function.

Super Contributor
Posts: 251

Re: RETAIN statement help

This should do something close to what you want. In your code you were never using holdbitevisit.

 

I've refactored your if statements into a select block, just to make things a wee bit clearer.

 

Of course, not having your dataset to check the code, I've written this off the top of my head - there's a reasonable chance that I've stuffed something up! I hope not, but you never know…

data ed.bite_testrecheck;
      set ed.bite_byvisit;
      retain holdbitevisit;
      /* holdbitevisit still holds bitevisit from the previous observation here */
      select (holdbitevisit);
           when (2) 
                  rechecksamevisit = 1;
           when (1, 3) 
                  rechecksamevisit = 0;
           otherwise;
           end;
      /* Save the current value for the next observation */
      holdbitevisit = bitevisit;
      drop holdbitevisit;   /* Leave this drop out at first, for debugging purposes */
run;

One thing to look out for is that if your data are grouped, you may end up giving rechecksamevisit a value on a change of group when you don't want to. If this is the case, you'll need a by statement, and reset holdbitevisit on change of group. It's easy to do; sing out if you want help with that.

 

Additionally, if bitevisit can contain a value outside 1-3, you'll need to put something in the otherwise statement. Leaving it empty like this prevents the error SAS raises when none of the when clauses matches.
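
For the change-of-group case mentioned above, a minimal sketch might look like the following. It assumes a patient identifier called id (a hypothetical name, since the real variable isn't shown) and that ed.bite_byvisit is already sorted by id in visit order. Like the code above, it is untested against the real data.

```sas
/* Sketch only: "id" is an assumed patient identifier, and the input
   is assumed to be sorted by id with visits in chronological order. */
data ed.bite_testrecheck;
      set ed.bite_byvisit;
      by id;
      retain holdbitevisit;
      /* On the first record of a patient, forget the previous patient's visit type */
      if first.id then holdbitevisit = .;
      select (holdbitevisit);
           when (2)
                  rechecksamevisit = 1;
           when (1, 3)
                  rechecksamevisit = 0;
           otherwise;   /* missing (first record of a patient) or unexpected value */
           end;
      /* Save the current value for the next observation */
      holdbitevisit = bitevisit;
      drop holdbitevisit;
run;
```

With the reset in place, the first record of each patient always falls through to otherwise, so rechecksamevisit stays missing there rather than picking up the previous patient's last visit.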

Super User
Posts: 789

Re: RETAIN statement help


I believe you are saying that all the records for a given incident are consecutive.   I.e. a revisit for incident 1 will never follow an initial visit for incident 2.  If so, then there is no need for RETAIN ... use the lag function.

 

data ed.bite_testrecheck;
      set ed.bite_byvisit;

      if lag(bitevisit) in (1,3) and bitevisit=2 then rechecksamevisit=0;

      else if bitevisit=2 then rechecksamevisit=1;

run;

 

rechecksamevisit will be

  • missing for non-"2" records,
  • 0 for the first 2 immediately following a 1 or 3, and
  • 1 for all other 2's

 Edit to address @LaurieF's comment about a change of id: modify the IF statement to

 

     if lag(id)=id and lag(bitevisit) in (1,3) and bitevisit=2 then rechecksamevisit=0;

 

 I don't understand the comment about dropped observations. Would appreciate some clarification.
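
Because LAG only updates its queue when it actually executes, a cautious variant of the IF statement above is to call LAG once per observation, unconditionally, and test the saved values afterwards. This is only a sketch: id is assumed to be the patient identifier, and prev_id and prev_bitevisit are hypothetical helper names.

```sas
/* Sketch only: id is an assumed patient identifier;
   prev_id and prev_bitevisit are illustrative helper variables. */
data ed.bite_testrecheck;
      set ed.bite_byvisit;
      /* Call LAG on every observation so each queue always holds the previous record */
      prev_id = lag(id);
      prev_bitevisit = lag(bitevisit);
      if bitevisit = 2 then do;
            /* First recheck after an initial visit (1 or 3) of the same patient */
            if prev_id = id and prev_bitevisit in (1, 3) then rechecksamevisit = 0;
            /* Any other recheck */
            else rechecksamevisit = 1;
      end;
      drop prev_id prev_bitevisit;
run;
```

The result should match the list above: missing for non-"2" records, 0 for the first recheck, and 1 for the rest.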

 

Super Contributor
Posts: 251

Re: RETAIN statement help

I concur with using the lag function - up to a point. If this dataset is patient-based, on the first observation for a new patient the lag function will return the previous patient's data: hence my comment about using a by statement and initialising the retained variable on the first instance.

 

Additionally, if the code becomes more complicated so that an observation is dropped before the lag function is executed, the logic will become faulty.

Super Contributor
Posts: 251

Re: RETAIN statement help

The first point would be better addressed with by group. And the reason for this is the inherent problem with lag and dif, and why I rarely use them (because I keep on stuffing it up and spending ages debugging): if you do a (logical/physical) delete on the second observation in a group and use the lag function on the third observation, the lagged value will refer to the first observation. This is because lag works on a queue of the values it was given when it last executed, not on the actual previous observation.

 

This is why I use retain, because I know exactly how it works, and I can control it precisely.

Solution
‎12-19-2016 12:48 PM
Super User
Posts: 789

Re: RETAIN statement help

We're getting a little off topic here, but I think that's just too absolutist to leave without a response.  Understanding LAG as a queue management function as opposed to a simplistic "lookback" retrieval provides the SAS user an exceedingly powerful tool.  Avoiding lag would generate an awful lot of RETAIN programming for some otherwise simple problems.

 

Consider a monthly dataset from which you want year-over-year comparisons of X.  It's quite easy to do with LAG:

    year_over_year=  X/lag12(x);

 

For by values (say by-variable STOCK_TICKER), it's

     year_over_year=ifn(lag12(stock_ticker)=stock_ticker,x/lag12(x),.);
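
As a complete DATA step, the by-variable version might be sketched as below. The dataset names work.monthly and work.yoy and the helper variables are hypothetical, and the sketch assumes one record per ticker per month, sorted by stock_ticker and then by month with no gaps. It also guards against a zero or missing denominator, which the one-liner does not.

```sas
/* Sketch only: work.monthly, work.yoy, prev_ticker and prev_x are
   illustrative names. Assumes one row per ticker per month, sorted by
   stock_ticker and month, with no missing months. */
data work.yoy;
      set work.monthly;
      /* Both LAG12 calls run on every observation, keeping the queues in step */
      prev_ticker = lag12(stock_ticker);
      prev_x = lag12(x);
      /* Compare only within the same ticker, and only with a usable denominator */
      if prev_ticker = stock_ticker and prev_x not in (., 0)
            then year_over_year = x / prev_x;
      else year_over_year = .;
      drop prev_ticker prev_x;
run;
```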

 

And as to the effect of deleting observations on the results of LAG, it depends (as it should) on whether the lag function precedes or succeeds the delete statement.

 

In short, I use lag "because I know exactly how it works, and I can control it precisely".
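
To make the point about statement order concrete, here is a small made-up example (the dataset and variable names are purely illustrative). The only difference between the two steps is whether LAG runs before or after the delete.

```sas
/* Made-up data: three observations in one group */
data work.demo;
      input grp $ x;
      datalines;
A 1
A 2
A 3
;
run;

/* LAG after DELETE: LAG never executes for x=2, so that value never enters the queue */
data work.lag_after_delete;
      set work.demo;
      if x = 2 then delete;
      prev_x = lag(x);
run;
/* Output: x=1 prev_x=.   x=3 prev_x=1 */

/* LAG before DELETE: LAG executes on every observation, including the one deleted */
data work.lag_before_delete;
      set work.demo;
      prev_x = lag(x);
      if x = 2 then delete;
run;
/* Output: x=1 prev_x=.   x=3 prev_x=2 */
```

In the first step the lagged value skips back to the first observation, which is the behaviour described earlier in the thread; in the second it is the value from the physically previous record.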

 

New Contributor
Posts: 2

Re: RETAIN statement help

Thanks everyone for your helpful comments! I went ahead and solved my problem with the lag function. It was great having so many responses from the SAS community. Thanks again!

☑ This topic is SOLVED.
