Hi,
I merged two databases based on date of death to give myself a more complete mortality database (I was worried that some deaths were only recorded in one database but not the other). Now I want to subset this database so that if BOTH variables are missing for the same observation, that observation will be dropped. This is the same as saying that I don't have a recorded data of death in either mortality database, so I will assume that the person has not yet died and does not belong in my mortality database (should be dropped).
I tried the following code, but I am not sure it translates to "drop the observation if both death_dt and dod are missing, otherwise keep":
DATA CLEAN.DEATH_278SUBJECTS; *n=674*;
SET CLEAN.DEATH_BOTH;
WHERE DEATH_DT ne . or dod ne .;
run;
I also did some exploring to see if I could figure out how the coding was working in terms of reducing the sample size:
DATA CLEAN.DEATH_278SUBJECTS; *n=674*;
SET CLEAN.DEATH_BOTH;
WHERE DEATH_DT ne .;
run;
DATA CLEAN.DEATH_278SUBJECTS; *n=587*;
SET CLEAN.DEATH_BOTH;
WHERE dod ne .;
run;
Please help! Thank you.
I think the "where death_dt ne . AND dod ne .;" statement ends up excluding all observations with EITHER a missing death_dt or missing dod, which is not what I am looking to do. But this made me realize I was overthinking it. There may be a shorter way, but this code seems to work well by creating a new variable to work with.
DATA CLEAN.DEATH_BOTH;
SET CLEAN.DEATH_BOTH;
if death_dt = . and dod = . then death_missing = 1;
else death_missing = 0;
run;
DATA CLEAN.DEATH_NOTMISSING;
SET CLEAN.DEATH_BOTH;
where death_missing = 0 ;
run;
change WHERE DEATH_DT ne . or dod ne .;
to WHERE DEATH_DT ne . AND dod ne .;
I think the "where death_dt ne . AND dod ne .;" statement ends up excluding all observations with EITHER a missing death_dt or missing dod, which is not what I am looking to do. But this made me realize I was overthinking it. There may be a shorter way, but this code seems to work well by creating a new variable to work with.
DATA CLEAN.DEATH_BOTH;
SET CLEAN.DEATH_BOTH;
if death_dt = . and dod = . then death_missing = 1;
else death_missing = 0;
run;
DATA CLEAN.DEATH_NOTMISSING;
SET CLEAN.DEATH_BOTH;
where death_missing = 0 ;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.