ID | STATUS | LAG STATUS |
1 | active | - |
1 | active | active |
1 | inactive | active |
2 | active | - |
2 | inactive | active |
3 | inactive | - |
3 | active | inactive |
3 | inactive | active |
3 | inactive | inactive |
4 | inactive | - |
For each unique ID, where the last status is inactive I want the previous status that's not inactive returned. The lag function works when there are only two records for a given ID (as seen in ID2) or when there are multiple records for a given ID where the previous status before the inactive is something other than inactive (as seen in ID1). What I would like to do is return the value before an inactive status that is a status other than inactive. So for ID3, I would like the active status returned since that was the previous status before an inactive status.
Since you need to look back an undetermined number of rows then perhaps you want RETAIN instead of Lag.
Please consider:
data have; input ID STATUS :$10.; datalines; 1 status1 1 status2 1 inactive 2 status1 2 inactive 3 inactive 3 status1 3 inactive 3 inactive 4 inactive ; data want; set have; by id; length alt $ 10; retain alt; if first.id then call missing(alt); if status ne 'inactive' then alt=status; run;
I made some different values of "status" to show which "active" you get in the result when more than one occur.
You only have active and inactive in this table. Do you have other status? If so, you may want to expand your data. Otherwise, you can just check if each ID has more than one status and return active because that's what your problem simplifies to.
It helps if you make it clear which variables you're starting with and showing what you want as the output. If the above is the input, what do you want as output.
ID | STATUS | LAG STATUS | Desired | |
1 | active | - | ||
1 | active | active | ||
1 | inactive | active | active | |
2 | active | - | ||
2 | inactive | active | active | |
3 | inactive | - | ||
3 | active | inactive | ||
3 | inactive | active | ||
3 | inactive | inactive | active | |
4 | inactive | - | ||
4 | active | inactive | ||
4 | urgent | active | ||
4 | inactive | urgent | ||
4 | inactive | inactive | urgent |
I am starting out with the ID and Status fields. I created a new variable called lag status to capture the previous value for each ID. I am trying to figure out which function I could use to capture the previous value of the status field when it isn't equal to the current value of the field. There can be other value types other than inactive. The desired column shows which values I want returned. So for ID3, I want active returned since it was the previous value before inactive. For ID4, I want urgent returned since it's the previous value before the inactive.
Do you need to keep the detailed records, or do you want to reduce to just one observation with "max" value? If you keep the detailed records do you want to calculated "max" value to appear on every observation for the group (with the same value)? Or only appear on the last observation, like in your example?
No, I want to return the previous status that's not inactive. So I was asked to provide a list of all patients with an inactive status. I am then asked, of those patients what was the status before their status was changed to inactive. The retain function that @ballardw recommend worked when I tweaked it a little. I am just trying to articulate what's going on behind the scenes so I can share. But I guess I'll go ahead and mark it solved.
Hi and welcome to the SAS Communities 🙂
Can you show us what your desired result looks like from your posted sample data (and thank you for posting that)
What if no previous status exist other than 'Inactive'?
Since you need to look back an undetermined number of rows then perhaps you want RETAIN instead of Lag.
Please consider:
data have; input ID STATUS :$10.; datalines; 1 status1 1 status2 1 inactive 2 status1 2 inactive 3 inactive 3 status1 3 inactive 3 inactive 4 inactive ; data want; set have; by id; length alt $ 10; retain alt; if first.id then call missing(alt); if status ne 'inactive' then alt=status; run;
I made some different values of "status" to show which "active" you get in the result when more than one occur.
Thank you! I am getting closer. In using this I notice that it erases the current status. Is there a way to get it not to do this?
This is the code:
data want;
set have;
retain status;
if last.id then call missing(status);
if status ne 'inactive' then alt=lag(status);
if _n_=1 then alt=’ ‘;
run;
This is what I see VS desired:
ID | STATUS | ALT |
|
|
| ID | STATUS | ALT |
1 | active |
|
|
| 1 | active |
| |
1 | active |
|
|
| 1 | active |
| |
1 | active |
|
|
| 1 | inactive | active | |
2 | active |
|
|
| 2 | active |
| |
2 | active |
|
|
| 2 | inactive | active | |
3 | active |
|
|
| 3 | active |
| |
3 | inactive |
|
|
| 3 | inactive |
| |
3 | active | active |
|
|
| 3 | active | active |
3 | active |
|
|
| 3 | inactive | active |
Thank you! I just created a new variable for my retain statement, that way I can keep each of the status field values.
@ballardw I have read the retain function but I am not quite sure I understand what is going on behind the scenes and was hoping you could assist me. The code below is doing exactly what I need it to do but I am not sure what is going on behind the scenes at lines 5 & 6 below. The code below returns the status right before an inactive status (which are documented as 4099, 4999) for each ID.
1) data want;
2) set have;
3) by ID STATUS_DATE STATUS_CD;
4) retain STATUS_CD;
5) if last.ID then call missing(STATUS_CD);
6) if STATUS_CD not in (4099,4999) then STAT_LAG=lag(STATUS_CD);
7) run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.