Hi,
I merged two databases based on date of death to give myself a more complete mortality database (I was worried that some deaths were only recorded in one database but not the other). Now I want to subset this database so that if BOTH variables are missing for the same observation, that observation will be dropped. This is the same as saying that I don't have a recorded data of death in either mortality database, so I will assume that the person has not yet died and does not belong in my mortality database (should be dropped).
I tried the following code, but I am not sure it translates to "drop the observation if both death_dt and dod are missing, otherwise keep":
DATA CLEAN.DEATH_278SUBJECTS; *n=674*;
SET CLEAN.DEATH_BOTH;
WHERE DEATH_DT ne . or dod ne .;
run;
I also did some exploring to see if I could figure out how the coding was working in terms of reducing the sample size:
DATA CLEAN.DEATH_278SUBJECTS; *n=674*;
SET CLEAN.DEATH_BOTH;
WHERE DEATH_DT ne .;
run;
DATA CLEAN.DEATH_278SUBJECTS; *n=587*;
SET CLEAN.DEATH_BOTH;
WHERE dod ne .;
run;
Please help! Thank you.
I think the "where death_dt ne . AND dod ne .;" statement ends up excluding all observations with EITHER a missing death_dt or missing dod, which is not what I am looking to do. But this made me realize I was overthinking it. There may be a shorter way, but this code seems to work well by creating a new variable to work with.
DATA CLEAN.DEATH_BOTH;
SET CLEAN.DEATH_BOTH;
if death_dt = . and dod = . then death_missing = 1;
else death_missing = 0;
run;
DATA CLEAN.DEATH_NOTMISSING;
SET CLEAN.DEATH_BOTH;
where death_missing = 0 ;
run;
change WHERE DEATH_DT ne . or dod ne .;
to WHERE DEATH_DT ne . AND dod ne .;
I think the "where death_dt ne . AND dod ne .;" statement ends up excluding all observations with EITHER a missing death_dt or missing dod, which is not what I am looking to do. But this made me realize I was overthinking it. There may be a shorter way, but this code seems to work well by creating a new variable to work with.
DATA CLEAN.DEATH_BOTH;
SET CLEAN.DEATH_BOTH;
if death_dt = . and dod = . then death_missing = 1;
else death_missing = 0;
run;
DATA CLEAN.DEATH_NOTMISSING;
SET CLEAN.DEATH_BOTH;
where death_missing = 0 ;
run;
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.