I have a dataset of test scores. My variables are (test taker’s) ID_no, test date, test score, and test type (online or in person).
Some people took the test online and in person, and therefore, are in the dataset more than once.
I’d like to make a new dataset with only those whose ID_no shows up more than once. And then with that dataset, I’d like to clean it to only include the most recent test date for each ID_no.
I'm not super familiar with SQL and want to develop this skill. What kind of procedure would I need to use to achieve this?
Assuming that the people who took the test online and in person did not do this both on the same day, this should work
proc sql;
create table want as select * from have
group by ID_no
having test_date=max(test_date);
quit;
Assuming that the people who took the test online and in person did not do this both on the same day, this should work
proc sql;
create table want as select * from have
group by ID_no
having test_date=max(test_date);
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.