Hello experts,
I would greatly appreciate your help with this issue I am having. I need to keep the first occurrence of code change that happens in my dataset. I have created a sample of the data I am working with below. For ID 001 we want the SEQ 3 for it. This is the first time they were changed to their current code which you can see they are still on with their recent "review" from observation 1 and date of 05/04/2014. For 002 we would need to keep SEQ 2 for them.
I have tried to sort and duplicate out the observation not in the first group of 'ST' but the issue I have is if in their history they were on that code before the older dates get mixed in and those are two separate occurrences. I have no data point to separate the occurrence other than a different status happening in between, such as we see with 'GP' on observation 4. I also tried to count the number of ST's in the group and stop when it reaches GP but I had no such look due to needing a by statement and the sort has the issues mentioned above.
In short, I need a piece of code that can identify the first occurrence of their most recent Code, here it is ST but I also have SP and DD. For ID 001 observation line 3 is the one needed to keep and for 002 observation line 7 is the one to keep.
Thanks for any help I can receive.
P.s. Do note the format of the date is ddmmyy.
data want;
input ID SEQ Code $ date :ddmmyy10.;
format date ddmmyy10.;
datalines;
001 1 ST 05/04/2014
001 2 ST 05/01/2014
001 3 ST 04/01/2014
001 4 GP 02/05/2014
001 5 ST 02/01/2014
002 1 ST 05/01/2014
002 2 ST 03/02/2014
002 3 GP 02/01/2014
002 4 GP 02/01/2014
;
run;
I don't see your mentioned sample data?
Apologies. The insert code button was not working. I updated my post by just pasting my sample data at the bottom.
So your desired result here is a data set with 2 observations, correct?
Correct I should end with observation 3 and 7. Mind you the full dataset is over 500,000 individuals so doing this manually was not an option.
Thanks!
If you use the NOTSORTED option (to accomodate "CODE" value that might either ascend or descend, then
data have;
input ID SEQ Code $ date :ddmmyy10.;
format date ddmmyy10.;
datalines;
001 1 ST 05/04/2014
001 2 ST 05/01/2014
001 3 ST 04/01/2014
001 4 GP 02/05/2014
001 5 ST 02/01/2014
002 1 ST 05/01/2014
002 2 ST 03/02/2014
002 3 GP 02/01/2014
002 4 GP 02/01/2014
run;
data want (drop=_:) ;
set have;
by id code notsorted;
if first.id=1 then _code_group=0;
_code_group + (last.code=1);
if last.code and _code_group=1;
run;
Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.
Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.