BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mphqueens
Fluorite | Level 6

I am looking for a solution to my current problem. 


SAS 9.4

I have a variable with patient IDs and another variable of gender. The patient IDs are entered multiple times but the corresponding genders are not always entered or are "unidentified" (U). I would like to find a way to change all the genders to the same if they fall under the same patient ID. 

 

Ex. 

 

Patient ID    Gender

104                 M

104                 U

104                 M

212                 F

212                 F

212                 U

212                 F

1102              M

1102               . 

 

Thank you! 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

To implement the policy : keep consistent gender assignment:

 

data have;
input Patient_ID    Gender $;
datalines;
104                 M
104                 U
104                 M
212                 F
212                 F
212                 U
212                 F
1102              M
1102               . 
999 M
999 F
999 .
999 U
;

proc sql;
create table gender as
select 
	patient_id, 
	case
		when count(distinct Gender) = 1 then max(Gender)
		else ""
		end as gender
from have
where Gender in ("M", "F")
group by patient_id;
select * from gender;
quit;
PG

View solution in original post

12 REPLIES 12
ballardw
Super User

Do you have any that look like this ?

104                 M

104                 U

104                 F

 

If so what rule would you like for selecting between M and F?

Astounding
PROC Star

These days, it's possible that all three values were valid at the time they were recorded.

mphqueens
Fluorite | Level 6

Not that I have found! It seems that they are consistent with either M or F with some U and blanks. 

 

mphqueens
Fluorite | Level 6

However, if I found any with M and F and U, I would assume the first recorded gender would be the correct one. 

Reeza
Super User

Do you have a date that goes along with the data? How can you identify which is the first/last besides order which isn't a good way.

 

Given the world today, it may be more appropriate to use the last gender, rather than first. 

mphqueens
Fluorite | Level 6
Yes, the data is ordered by date.
mphqueens
Fluorite | Level 6

I am looking for a solution to my current problem. 


SAS 9.4

I have a variable with patient IDs and another variable of gender. The patient IDs are entered multiple times but the corresponding genders are not always entered or are "unidentified" (U). I would like to find a way to change all the genders to the same if they fall under the same patient ID. 

 

Ex. 

 

Patient ID    Gender

104                 M

104                 U

104                 M

212                 F

212                 F

212                 U

212                 F

1102              M

1102               . 

 

Thank you! 

PGStats
Opal | Level 21

To implement the policy : keep consistent gender assignment:

 

data have;
input Patient_ID    Gender $;
datalines;
104                 M
104                 U
104                 M
212                 F
212                 F
212                 U
212                 F
1102              M
1102               . 
999 M
999 F
999 .
999 U
;

proc sql;
create table gender as
select 
	patient_id, 
	case
		when count(distinct Gender) = 1 then max(Gender)
		else ""
		end as gender
from have
where Gender in ("M", "F")
group by patient_id;
select * from gender;
quit;
PG
mphqueens
Fluorite | Level 6

This worked perfectly...thank you! Is there a way to merge this table back to my original data set?

Reeza
Super User

Here's a quick SQL query that will set all to the M/F observed assuming only one value is observed. If you have both a M/F for a single patient_id then that will break this code. You'll need to determine how to handle those cases if they occur in your data.

 

data have;
infile cards truncover;
input Patient_ID $   Gender $;
cards;
104                 M
104                 U
104                 M
212                 F
212                 F
212                 U
212                 F
1102                M
1102               
;
run; 

proc sql;
create table want as
select a.patient_id, b.gender
from have as a
left join (select distinct patient_id, gender from have where not missing(gender) and gender ne "U") as b
on a.patient_id=b.patient_id
order by 1, 2;
quit;
Astounding
PROC Star

If you're more comfortable with a DATA step approach, here's how it could be done:

 

data want;

length first_gender $ 1;

do until (last.patient_id);

   set have;

   by patient_id;

   if first_gender=' ' and gender in ('F', 'M') then first_gender = gender;

end;

if first_gender=' ' then first_gender=gender;

do until (last.patient_id);

   set have;

   by patient_id;

   gender = first_gender;

   output;

end;

drop first_gender;

run;

Ksharp
Super User
data have;
infile cards truncover;
input Patient_ID $   Gender $;
cards;
104                 M
104                 U
104                 M
212                 F
212                 F
212                 U
212                 F
1102                M
1102               
;
run; 
proc sort data=have(keep=Patient_ID Gender where=(Gender in ('F' 'M'))) out=temp nodupkey;
by Patient_ID Gender ;
run;
proc sort data=have;by Patient_ID ;run;
data want;
 merge have(drop=Gender) temp;
 by Patient_ID ;
run;

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 2722 views
  • 5 likes
  • 6 in conversation