SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Modifying observation based on ID and test date

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 15
Accepted Solution

Modifying observation based on ID and test date

Hello! I am working on a dataset that contains patient ID numbers, biological sex (M, F, or U for unidentified), and a test date. I am looking to change all gender entries to the one on the most recent test date that corresponds with the patient ID. Some patients have been previously entered as U but have more recent entries that are M or F. However, in some cases there are both M and F entries associated with the ID but I want to assume the most recent test date is the correct gender. How can I code this? 

 

I would like to have the most possible Sex entries as M and F, with the most recent test date indicating the assumed truth. Some will have the most recent test date as U, therefore I would like to use the second most recent in those cases. 

 

An example: 

 

Data addsex;
Input ID TESTDATE SEX;
datalines; 
1 2010/01/01 M

1 2012/02/07 F 

2 2009/05/07 F

2 2012/06/09 U 

2 2007/12/12 F

3 2008/01/11 U 

3 2010/08/09 M 

4 2009/09/09 M

4 2012/09/03 M 

;

 

Thank you! 


Accepted Solutions
Solution
‎05-23-2017 05:42 PM
Super User
Posts: 5,516

Re: Modifying observation based on ID and test date

Posted in reply to mphqueens

This approach assumes that your data set is already in sorted order by ID and TESTDATE:

 

data want;

do until (last.id);

   set have;

   by id;

   if sex in ('M', 'F') then most_recent = sex;

end;

do until (last.id);

   set have;

   by id;

   if most_recent in ('M', 'F') then sex = most_recent;

   output;

end;

drop most_recent;

run;

 

It might be overkill, but the program is careful to leave SEX unchanged when an ID has no values that are either 'M' or 'F'.

View solution in original post


All Replies
Solution
‎05-23-2017 05:42 PM
Super User
Posts: 5,516

Re: Modifying observation based on ID and test date

Posted in reply to mphqueens

This approach assumes that your data set is already in sorted order by ID and TESTDATE:

 

data want;

do until (last.id);

   set have;

   by id;

   if sex in ('M', 'F') then most_recent = sex;

end;

do until (last.id);

   set have;

   by id;

   if most_recent in ('M', 'F') then sex = most_recent;

   output;

end;

drop most_recent;

run;

 

It might be overkill, but the program is careful to leave SEX unchanged when an ID has no values that are either 'M' or 'F'.

Occasional Contributor
Posts: 15

Re: Modifying observation based on ID and test date

Posted in reply to Astounding

This is perfect. Thank you!!! 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 140 views
  • 0 likes
  • 2 in conversation