Hi,
I have 121 subjects in total and they have questionnaire data in 3 time points (baseline, 6 month and 12 month). I found subject ID = 2031 has two baseline data, and I want to delete the one has age, race, education level missing value. Could you please help me with the code? Thank you!
Obs | SubjectID | VISIT | Age | Race | HighestGrade | InterviewerID | TodaysDate | Quality_Of_Life |
1 | 2031 | Baseline | . | . | . | 28 | 1-Nov-12 | 8 |
2 | 2031 | Baseline | 55 | 4 | 7 | 28 | 1-Nov-12 | 9 |
data want;
set have;
by SubjectID;
if first.subjectid and last subjectid then output; else
if age=. or race=. or highestgrade=. then delete;
run;
try this
data have ; input id visit $ val1 val2 val3; datalines; 1 Baseline . . . 1 Baseline 2 3 4 1 faseline . . . 2 Baseline . . . ; proc sql; delete from have a where exists (select * from have b where a.id =b.id and a.visit='Baseline' group by id having count(id)> 1) and a.val1 = . and a.val2 =. and a.val3 = .;
Schmuel's approach would work, but might require some changes (depending on what the data looks like on the 6 month and 12 month visits). Assuming you have sorted your data set by SubjectID Visit DESCENDING Age:
data want;
set have;
by subjectID Visit descending Age;
if first.visit;
run;
If you are concerned about more than one duplicate, we might have to revisit how to subset properly.
Assuming, for each subject id, there is always exactly one record with non-missing age,race,highestgrade, then a single WHERE statement will work, no matter how many record is one or more of those variables missing:
data want;
set have;
where n(age,race,highestgrade)=3;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.