BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mphqueens
Fluorite | Level 6

Hello! I am working on a dataset that contains patient ID numbers, biological sex (M, F, or U for unidentified), and a test date. I am looking to change all gender entries to the one on the most recent test date that corresponds with the patient ID. Some patients have been previously entered as U but have more recent entries that are M or F. However, in some cases there are both M and F entries associated with the ID but I want to assume the most recent test date is the correct gender. How can I code this? 

 

I would like to have the most possible Sex entries as M and F, with the most recent test date indicating the assumed truth. Some will have the most recent test date as U, therefore I would like to use the second most recent in those cases. 

 

An example: 

 

Data addsex;
Input ID TESTDATE SEX;
datalines; 
1 2010/01/01 M

1 2012/02/07 F 

2 2009/05/07 F

2 2012/06/09 U 

2 2007/12/12 F

3 2008/01/11 U 

3 2010/08/09 M 

4 2009/09/09 M

4 2012/09/03 M 

;

 

Thank you! 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

This approach assumes that your data set is already in sorted order by ID and TESTDATE:

 

data want;

do until (last.id);

   set have;

   by id;

   if sex in ('M', 'F') then most_recent = sex;

end;

do until (last.id);

   set have;

   by id;

   if most_recent in ('M', 'F') then sex = most_recent;

   output;

end;

drop most_recent;

run;

 

It might be overkill, but the program is careful to leave SEX unchanged when an ID has no values that are either 'M' or 'F'.

View solution in original post

2 REPLIES 2
Astounding
PROC Star

This approach assumes that your data set is already in sorted order by ID and TESTDATE:

 

data want;

do until (last.id);

   set have;

   by id;

   if sex in ('M', 'F') then most_recent = sex;

end;

do until (last.id);

   set have;

   by id;

   if most_recent in ('M', 'F') then sex = most_recent;

   output;

end;

drop most_recent;

run;

 

It might be overkill, but the program is careful to leave SEX unchanged when an ID has no values that are either 'M' or 'F'.

mphqueens
Fluorite | Level 6

This is perfect. Thank you!!! 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 1092 views
  • 0 likes
  • 2 in conversation