BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mphqueens
Fluorite | Level 6

I am looking for a solution to my current problem. 


SAS 9.4

I have a variable with patient IDs and another variable of gender. The patient IDs are entered multiple times but the corresponding genders are not always entered or are "unidentified" (U). I would like to find a way to change all the genders to the same if they fall under the same patient ID. 

 

Ex. 

 

Patient ID    Gender

104                 M

104                 U

104                 M

212                 F

212                 F

212                 U

212                 F

1102              M

1102               . 

 

Thank you! 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

To implement the policy : keep consistent gender assignment:

 

data have;
input Patient_ID    Gender $;
datalines;
104                 M
104                 U
104                 M
212                 F
212                 F
212                 U
212                 F
1102              M
1102               . 
999 M
999 F
999 .
999 U
;

proc sql;
create table gender as
select 
	patient_id, 
	case
		when count(distinct Gender) = 1 then max(Gender)
		else ""
		end as gender
from have
where Gender in ("M", "F")
group by patient_id;
select * from gender;
quit;
PG

View solution in original post

12 REPLIES 12
ballardw
Super User

Do you have any that look like this ?

104                 M

104                 U

104                 F

 

If so what rule would you like for selecting between M and F?

Astounding
PROC Star

These days, it's possible that all three values were valid at the time they were recorded.

mphqueens
Fluorite | Level 6

Not that I have found! It seems that they are consistent with either M or F with some U and blanks. 

 

mphqueens
Fluorite | Level 6

However, if I found any with M and F and U, I would assume the first recorded gender would be the correct one. 

Reeza
Super User

Do you have a date that goes along with the data? How can you identify which is the first/last besides order which isn't a good way.

 

Given the world today, it may be more appropriate to use the last gender, rather than first. 

mphqueens
Fluorite | Level 6
Yes, the data is ordered by date.
mphqueens
Fluorite | Level 6

I am looking for a solution to my current problem. 


SAS 9.4

I have a variable with patient IDs and another variable of gender. The patient IDs are entered multiple times but the corresponding genders are not always entered or are "unidentified" (U). I would like to find a way to change all the genders to the same if they fall under the same patient ID. 

 

Ex. 

 

Patient ID    Gender

104                 M

104                 U

104                 M

212                 F

212                 F

212                 U

212                 F

1102              M

1102               . 

 

Thank you! 

PGStats
Opal | Level 21

To implement the policy : keep consistent gender assignment:

 

data have;
input Patient_ID    Gender $;
datalines;
104                 M
104                 U
104                 M
212                 F
212                 F
212                 U
212                 F
1102              M
1102               . 
999 M
999 F
999 .
999 U
;

proc sql;
create table gender as
select 
	patient_id, 
	case
		when count(distinct Gender) = 1 then max(Gender)
		else ""
		end as gender
from have
where Gender in ("M", "F")
group by patient_id;
select * from gender;
quit;
PG
mphqueens
Fluorite | Level 6

This worked perfectly...thank you! Is there a way to merge this table back to my original data set?

Reeza
Super User

Here's a quick SQL query that will set all to the M/F observed assuming only one value is observed. If you have both a M/F for a single patient_id then that will break this code. You'll need to determine how to handle those cases if they occur in your data.

 

data have;
infile cards truncover;
input Patient_ID $   Gender $;
cards;
104                 M
104                 U
104                 M
212                 F
212                 F
212                 U
212                 F
1102                M
1102               
;
run; 

proc sql;
create table want as
select a.patient_id, b.gender
from have as a
left join (select distinct patient_id, gender from have where not missing(gender) and gender ne "U") as b
on a.patient_id=b.patient_id
order by 1, 2;
quit;
Astounding
PROC Star

If you're more comfortable with a DATA step approach, here's how it could be done:

 

data want;

length first_gender $ 1;

do until (last.patient_id);

   set have;

   by patient_id;

   if first_gender=' ' and gender in ('F', 'M') then first_gender = gender;

end;

if first_gender=' ' then first_gender=gender;

do until (last.patient_id);

   set have;

   by patient_id;

   gender = first_gender;

   output;

end;

drop first_gender;

run;

Ksharp
Super User
data have;
infile cards truncover;
input Patient_ID $   Gender $;
cards;
104                 M
104                 U
104                 M
212                 F
212                 F
212                 U
212                 F
1102                M
1102               
;
run; 
proc sort data=have(keep=Patient_ID Gender where=(Gender in ('F' 'M'))) out=temp nodupkey;
by Patient_ID Gender ;
run;
proc sort data=have;by Patient_ID ;run;
data want;
 merge have(drop=Gender) temp;
 by Patient_ID ;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 1822 views
  • 5 likes
  • 6 in conversation