Hi,
I merged two databases based on date of death to give myself a more complete mortality database (I was worried that some deaths were only recorded in one database but not the other). Now I want to subset this database so that if BOTH variables are missing for the same observation, that observation will be dropped. This is the same as saying that I don't have a recorded data of death in either mortality database, so I will assume that the person has not yet died and does not belong in my mortality database (should be dropped).
I tried the following code, but I am not sure it translates to "drop the observation if both death_dt and dod are missing, otherwise keep":
DATA CLEAN.DEATH_278SUBJECTS; *n=674*;
SET CLEAN.DEATH_BOTH;
WHERE DEATH_DT ne . or dod ne .;
run;
I also did some exploring to see if I could figure out how the coding was working in terms of reducing the sample size:
DATA CLEAN.DEATH_278SUBJECTS; *n=674*;
SET CLEAN.DEATH_BOTH;
WHERE DEATH_DT ne .;
run;
DATA CLEAN.DEATH_278SUBJECTS; *n=587*;
SET CLEAN.DEATH_BOTH;
WHERE dod ne .;
run;
Please help! Thank you.
I think the "where death_dt ne . AND dod ne .;" statement ends up excluding all observations with EITHER a missing death_dt or missing dod, which is not what I am looking to do. But this made me realize I was overthinking it. There may be a shorter way, but this code seems to work well by creating a new variable to work with.
DATA CLEAN.DEATH_BOTH;
SET CLEAN.DEATH_BOTH;
if death_dt = . and dod = . then death_missing = 1;
else death_missing = 0;
run;
DATA CLEAN.DEATH_NOTMISSING;
SET CLEAN.DEATH_BOTH;
where death_missing = 0 ;
run;
change WHERE DEATH_DT ne . or dod ne .;
to WHERE DEATH_DT ne . AND dod ne .;
I think the "where death_dt ne . AND dod ne .;" statement ends up excluding all observations with EITHER a missing death_dt or missing dod, which is not what I am looking to do. But this made me realize I was overthinking it. There may be a shorter way, but this code seems to work well by creating a new variable to work with.
DATA CLEAN.DEATH_BOTH;
SET CLEAN.DEATH_BOTH;
if death_dt = . and dod = . then death_missing = 1;
else death_missing = 0;
run;
DATA CLEAN.DEATH_NOTMISSING;
SET CLEAN.DEATH_BOTH;
where death_missing = 0 ;
run;
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.