Hi,
I merged two databases based on date of death to give myself a more complete mortality database (I was worried that some deaths were only recorded in one database but not the other). Now I want to subset this database so that if BOTH variables are missing for the same observation, that observation will be dropped. This is the same as saying that I don't have a recorded data of death in either mortality database, so I will assume that the person has not yet died and does not belong in my mortality database (should be dropped).
I tried the following code, but I am not sure it translates to "drop the observation if both death_dt and dod are missing, otherwise keep":
DATA CLEAN.DEATH_278SUBJECTS; *n=674*;
SET CLEAN.DEATH_BOTH;
WHERE DEATH_DT ne . or dod ne .;
run;
I also did some exploring to see if I could figure out how the coding was working in terms of reducing the sample size:
DATA CLEAN.DEATH_278SUBJECTS; *n=674*;
SET CLEAN.DEATH_BOTH;
WHERE DEATH_DT ne .;
run;
DATA CLEAN.DEATH_278SUBJECTS; *n=587*;
SET CLEAN.DEATH_BOTH;
WHERE dod ne .;
run;
Please help! Thank you.
I think the "where death_dt ne . AND dod ne .;" statement ends up excluding all observations with EITHER a missing death_dt or missing dod, which is not what I am looking to do. But this made me realize I was overthinking it. There may be a shorter way, but this code seems to work well by creating a new variable to work with.
DATA CLEAN.DEATH_BOTH;
SET CLEAN.DEATH_BOTH;
if death_dt = . and dod = . then death_missing = 1;
else death_missing = 0;
run;
DATA CLEAN.DEATH_NOTMISSING;
SET CLEAN.DEATH_BOTH;
where death_missing = 0 ;
run;
change WHERE DEATH_DT ne . or dod ne .;
to WHERE DEATH_DT ne . AND dod ne .;
I think the "where death_dt ne . AND dod ne .;" statement ends up excluding all observations with EITHER a missing death_dt or missing dod, which is not what I am looking to do. But this made me realize I was overthinking it. There may be a shorter way, but this code seems to work well by creating a new variable to work with.
DATA CLEAN.DEATH_BOTH;
SET CLEAN.DEATH_BOTH;
if death_dt = . and dod = . then death_missing = 1;
else death_missing = 0;
run;
DATA CLEAN.DEATH_NOTMISSING;
SET CLEAN.DEATH_BOTH;
where death_missing = 0 ;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.