BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Kels123
Quartz | Level 8

Hi,

 

I merged two databases based on date of death to give myself a more complete mortality database (I was worried that some deaths were only recorded in one database but not the other). Now I want to subset this database so that if BOTH variables are missing for the same observation, that observation will be dropped. This is the same as saying that I don't have a recorded data of death in either mortality database, so I will assume that the person has not yet died and does not belong in my mortality database (should be dropped).

 

I tried the following code, but I am not sure it translates to "drop the observation if both death_dt and dod are missing, otherwise keep":

 

DATA CLEAN.DEATH_278SUBJECTS; *n=674*;

SET CLEAN.DEATH_BOTH;

WHERE DEATH_DT ne . or dod ne .;

run;

 

I also did some exploring to see if I could figure out how the coding was working in terms of reducing the sample size:

 

DATA CLEAN.DEATH_278SUBJECTS; *n=674*;

SET CLEAN.DEATH_BOTH;

WHERE DEATH_DT ne .;

run;

DATA CLEAN.DEATH_278SUBJECTS; *n=587*;

SET CLEAN.DEATH_BOTH;

WHERE dod ne .;

run;

 

Please help! Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
Kels123
Quartz | Level 8

I think the "where death_dt ne . AND dod ne .;" statement ends up excluding all observations with EITHER a missing death_dt or missing dod, which is not what I am looking to do. But this made me realize I was overthinking it. There may be a shorter way, but this code seems to work well by creating a new variable to work with.

 

DATA CLEAN.DEATH_BOTH;

SET CLEAN.DEATH_BOTH;

if death_dt = . and dod = . then death_missing = 1;

else death_missing = 0;

run;

 

DATA CLEAN.DEATH_NOTMISSING;

SET CLEAN.DEATH_BOTH;

where death_missing = 0 ;

run;

View solution in original post

2 REPLIES 2
GreggB
Pyrite | Level 9

change WHERE DEATH_DT ne . or dod ne .;

to WHERE DEATH_DT ne .  AND dod ne .;

Kels123
Quartz | Level 8

I think the "where death_dt ne . AND dod ne .;" statement ends up excluding all observations with EITHER a missing death_dt or missing dod, which is not what I am looking to do. But this made me realize I was overthinking it. There may be a shorter way, but this code seems to work well by creating a new variable to work with.

 

DATA CLEAN.DEATH_BOTH;

SET CLEAN.DEATH_BOTH;

if death_dt = . and dod = . then death_missing = 1;

else death_missing = 0;

run;

 

DATA CLEAN.DEATH_NOTMISSING;

SET CLEAN.DEATH_BOTH;

where death_missing = 0 ;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 862 views
  • 0 likes
  • 2 in conversation