i have a dataset that has two ids: id1 and id2. i want to find out if there are any duplicated records for the combined id1 and id2. i created a code like this but it don't work. are there any easy way to do this in sas proc sql?
data combine;
set dataset;
combinedid=catx(id1, id2);
run;
PROC SQL;
SELECT id1, id2,
FREQ(combinedid) AS dupe
FROM dataset
GROUP BY combinedid
HAVING dupe GE 2;
quit;
Run a count and use HAVING:
proc sql;
select id1, id2, count(*) as count
from have
group by id1, id2
having calculated count > 1;
quit;
Or run a proc freq or proc summary and filter out the observations with a count > 1.
Thank you all. Both proc sql and proc freq works with my case:
proc sql;
select id1, id2, count(*) as count
from have
group by id1, id2
having calculated count > 1;
quit;
PROC FREQ data=have;
TABLES id1*id2 / noprint out=duplist;
RUN;
PROC PRINT data=duplist;
WHERE count ge 2;
RUN;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.