If the patient took the exam (exam="Yes") and there are duplicate records with the same visit date, I would like to keep only one of those records, but my code seems to delete more records than that. How can I correct it? I think this code keeps only the first date for each ID, even though each ID has multiple dates. I'm not sure how to write the correct code.
proc sort data=data;
by ID DATE exam;
run;
data data1;
set data;
by ID DATE;
if exam="No" then output;
else if exam="Yes" then do;
if first.DATE then output;
end;
run;
It is usually helpful to provide an example of the data you have and the desired result.
This is a guess as to what you may want:
data data1;
set data;
by ID DATE exam;
if exam="No" then output;
else if exam="Yes" AND FIRST.EXAM then output;
run;
Because you do not describe any rules for exam='No', the above will output ALL of the exam='No' records for a given date.
If you want at most one No and one Yes per ID and DATE, then perhaps:
data data1;
set data;
by ID DATE exam;
if FIRST.EXAM;
run;
You don't describe very well what role DATE actually plays in this, or exactly what the requirement is. If you only want one date per ID, you really need to provide example data and the expected result, so that we have a chance of following the (hopefully expanded) description of what DATE's role in the output data should be.
Assuming EXAM only takes the values "No" or "Yes", then
proc sort data=data;
by ID DATE exam;
run;
data data1;
set data;
by id date exam;
if exam='No' or first.exam=1;
run;
This gives you all the No's and just the first Yes for each ID/DATE.
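Since no sample data was posted, here is a small self-contained sketch of the approach above. The data set names have and want and all of the values are invented for illustration, and it assumes DATE is a SAS date and exam only takes the values "No" or "Yes". It also suggests why the original step dropped too many rows: after sorting by ID DATE exam, any "No" row sorts ahead of the "Yes" rows on the same date, so first.DATE is never true for a "Yes" row on such a date.

```sas
/* Hypothetical sample data: every value here is made up for illustration */
data have;
  input ID DATE :yymmdd10. exam $;
  format DATE yymmdd10.;
  datalines;
1 2024-01-05 Yes
1 2024-01-05 Yes
1 2024-01-05 No
1 2024-02-10 Yes
2 2024-01-05 No
2 2024-01-05 No
;
run;

/* exam must be in the sort order so that FIRST.exam is meaningful */
proc sort data=have;
  by ID DATE exam;
run;

data want;
  set have;
  by ID DATE exam;
  /* keep every "No" row; keep only the first "Yes" row within each ID/DATE */
  if exam='No' or first.exam;
run;

proc print data=want;
run;
```

With this sample input, the only row removed should be the second "Yes" for ID 1 on 2024-01-05. If the real requirement is different (for example, one row per ID/DATE regardless of exam), the subsetting IF would need to change accordingly.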