BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Kels123
Quartz | Level 8

Hi,

 

I merged two databases based on date of death to give myself a more complete mortality database (I was worried that some deaths were only recorded in one database but not the other). Now I want to subset this database so that if BOTH variables are missing for the same observation, that observation will be dropped. This is the same as saying that I don't have a recorded data of death in either mortality database, so I will assume that the person has not yet died and does not belong in my mortality database (should be dropped).

 

I tried the following code, but I am not sure it translates to "drop the observation if both death_dt and dod are missing, otherwise keep":

 

DATA CLEAN.DEATH_278SUBJECTS; *n=674*;

SET CLEAN.DEATH_BOTH;

WHERE DEATH_DT ne . or dod ne .;

run;

 

I also did some exploring to see if I could figure out how the coding was working in terms of reducing the sample size:

 

DATA CLEAN.DEATH_278SUBJECTS; *n=674*;

SET CLEAN.DEATH_BOTH;

WHERE DEATH_DT ne .;

run;

DATA CLEAN.DEATH_278SUBJECTS; *n=587*;

SET CLEAN.DEATH_BOTH;

WHERE dod ne .;

run;

 

Please help! Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
Kels123
Quartz | Level 8

I think the "where death_dt ne . AND dod ne .;" statement ends up excluding all observations with EITHER a missing death_dt or missing dod, which is not what I am looking to do. But this made me realize I was overthinking it. There may be a shorter way, but this code seems to work well by creating a new variable to work with.

 

DATA CLEAN.DEATH_BOTH;

SET CLEAN.DEATH_BOTH;

if death_dt = . and dod = . then death_missing = 1;

else death_missing = 0;

run;

 

DATA CLEAN.DEATH_NOTMISSING;

SET CLEAN.DEATH_BOTH;

where death_missing = 0 ;

run;

View solution in original post

2 REPLIES 2
GreggB
Pyrite | Level 9

change WHERE DEATH_DT ne . or dod ne .;

to WHERE DEATH_DT ne .  AND dod ne .;

Kels123
Quartz | Level 8

I think the "where death_dt ne . AND dod ne .;" statement ends up excluding all observations with EITHER a missing death_dt or missing dod, which is not what I am looking to do. But this made me realize I was overthinking it. There may be a shorter way, but this code seems to work well by creating a new variable to work with.

 

DATA CLEAN.DEATH_BOTH;

SET CLEAN.DEATH_BOTH;

if death_dt = . and dod = . then death_missing = 1;

else death_missing = 0;

run;

 

DATA CLEAN.DEATH_NOTMISSING;

SET CLEAN.DEATH_BOTH;

where death_missing = 0 ;

run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1603 views
  • 0 likes
  • 2 in conversation