BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
v2murthy
Fluorite | Level 6

Hi team - hoping to get some help with the below issue. Essentially, I want to create a new variable that keeps the most recent value based on another variable, based on date. See below:

 

This is the data I have:

 

PATIENT_ID     LAB_RESULT       DATE

1                         A                           1/1/22

1                         B                           2/1/22

2                         C                           1/1/22

2                         D                           2/1/22

2                         E                           3/1/22

3                         F                           1/1/22

 

This is the data I want:

 

PATIENT_ID     LAB_RESULT       DATE         LAB_RECENT

1                         A                           1/1/22        

1                         B                           2/1/22            B

2                         C                          1/1/22

2                         D                           2/1/22

2                         E                           3/1/22            E

3                         F                           1/1/22            F

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Try this

 

data have;
input PATIENT_ID LAB_RESULT $ DATE:ddmmyy10.;
format date ddmmyy10.;
datalines;
1 A 1/1/22
1 B 2/1/22
2 C 1/1/22
2 D 2/1/22
2 E 3/1/22
3 F 1/1/22
;

data want;
   set have;
   by PATIENT_ID;
   if last.PATIENT_ID then LAB_RECENT = LAB_RESULT;
run;

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

Is this data set always sorted by PATIENT_ID and DATE? If yes, then:

 

data want;
    set have;
    by patient_id;
    if last.patient_id then lab_recent=lab_result;
run;
--
Paige Miller
v2murthy
Fluorite | Level 6

Thank you. I have a wrench to add, though. What if the whole thing is dependent on a flag variable being checked. For example, below, the 3/1/22 lab result for Patient_ID 2 would not be considered in the new variable. 

 

This is the new have/want situation:

 

Have:

Patient_IDLab_ResultLab_FlagDate
1AY1/1/2022
1BY2/1/2022
2CY1/1/2022
2DY2/1/2022
2E 3/1/2022
3FY1/1/2022

 

Want:

Patient_IDLab_ResultLab_FlagDateLab_Recent
1AY1/1/2022 
1BY2/1/2022B
2CY1/1/2022 
2DY2/1/2022D
2E 3/1/2022 
3FY1/1/2022F
PeterClemmensen
Tourmaline | Level 20

Makes it a bit more complicated, but try this

 

data have;
input PATIENT_ID LAB_RESULT $ Lab_Flag $ DATE:ddmmyy10.;
format DATE ddmmyy10.;
infile datalines dlm = '|' dsd;
datalines;
1|A|Y|1/1/22
1|B|Y|2/1/22
2|C|Y|1/1/22
2|D|Y|2/1/22
2|E| |3/1/22
3|F|Y|1/1/22
;

data want(drop = i n);
   do i = 1 by 1 until (last.PATIENT_ID);
      set have;
      by PATIENT_ID;
      if Lab_Flag = 'Y' then n = i;
   end;

   do i = 1 to i;
      set have;
      LAB_RECENT = ifc(i = n, LAB_RESULT, '');
      output;
   end;
run;

 

Result

 

PATIENT_ID  LAB_RESULT  Lab_Flag  DATE        LAB_RECENT
1           A           Y         01/01/2022  
1           B           Y         02/01/2022  B
2           C           Y         01/01/2022  
2           D           Y         02/01/2022  D
2           E                     03/01/2022  
3           F           Y         01/01/2022  F
PaigeMiller
Diamond | Level 26

Simple, don't include rows in the analysis if they don't have Y

 

data want;
    set have(where=(lab_flag='Y'));
    by patient_id;
    if last.patient_id then lab_recent=lab_result;
run;
--
Paige Miller
v2murthy
Fluorite | Level 6

Thanks for the quick responses! 

 

Wouldn't the new table in this solution include only rows where Lab_flag = "Y"? I want all the original rows in the resulting table. 

PaigeMiller
Diamond | Level 26

@v2murthy wrote:

Thanks for the quick responses! 

 

Wouldn't the new table in this solution include only rows where Lab_flag = "Y"? I want all the original rows in the resulting table. 


Yes, that's correct, but one more step gives you what you want if you merge the original data with the result from my code.

--
Paige Miller
v2murthy
Fluorite | Level 6

Thank you. This has helped immensely. 

 

What about if I have two conditions in the where clause? what would be the correct syntax?

 

For example, instead of 

data want ;

set have (where = (Flag = "Y")) ;

run ;

 

I have two conditions.. I am trying to do this:

 

data want ;

set have (where = (Flag = "Y" and days >=30)) ;

run ; 

 

This doesn't seem to work. 

 

PeterClemmensen
Tourmaline | Level 20

Try this

 

data have;
input PATIENT_ID LAB_RESULT $ DATE:ddmmyy10.;
format date ddmmyy10.;
datalines;
1 A 1/1/22
1 B 2/1/22
2 C 1/1/22
2 D 2/1/22
2 E 3/1/22
3 F 1/1/22
;

data want;
   set have;
   by PATIENT_ID;
   if last.PATIENT_ID then LAB_RECENT = LAB_RESULT;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2237 views
  • 6 likes
  • 3 in conversation