Hi @Primavera You are not alone. We all have had those moments too. It's fun to make mistakes. Do not hesitate to throw open a thread if you need help. That's what keeps the community alive. I ask questions all the time:)
Below an approach creating 3 tables.
- One table with duplicate records
- One table will all IDs with duplicate records
- One table with all IDs with no duplicate records
data have;
input ID (Epi_start Epi_end) (:date9.) Epi_Number;
format Epi_start Epi_end date9.;
cards;
1 21-Aug-10 25-Aug-10 1
2 27-Aug-10 10-Sep-10 1
3 13-Sep-10 18-Sep-10 1
3 01-Oct-10 11-Oct-10 2
3 14-Oct-10 19-Oct-10 3
3 15-Dec-12 18-Dec-10 1
4 18-Oct-10 22-Oct-10 1
4 18-Oct-10 22-Oct-10 2
5 19-Oct-10 29-Oct-10 1
5 13-Dec-10 01-Jan-11 2
5 13-Dec-10 01-Jan-11 3
5 16-Feb-11 28-Feb-11 4
6 01-Dec-10 09-Dec-10 1
7 11-Dec-10 17-Dec-10 1
7 15-Oct-11 22-Oct-11 1
7 01-Dec-11 10-Dec-11 2
7 20-Jun-12 29-Jun-12 1
7 29-Jul-12 08-Aug-12 2
7 07-Sep-12 16-Sep-12 3
8 19-Dec-10 25-Dec-10 1
8 19-Dec-10 25-Dec-10 2
8 19-Dec-10 25-Dec-10 3
8 17-Jan-13 25-Jan-13 1
9 01-Mar-11 10-Mar-11 1
9 25-Mar-11 30-Mar-11 2
;
proc sort data=have out=dups NOUNIQUEKEY;
by id Epi_start Epi_end;
run;
data noDupIDs DupIDs;
merge have dups(in=ind keep=id);
by id;
if ind then output DupIDs;
else output noDupIDs;
run;
proc print data=DupIDs;
run;
Very nice. Thanks. I did not know about nouniquekey! Can you please kindly explain the line:
merge have dups(in=ind keep=id);
Thank you.
@Primavera wrote:
Very nice. Thanks. I did not know about nouniquekey! Can you please kindly explain the line:
merge have dups(in=ind keep=id);
Thank you.
The dups table contains records with duplicate keys {id, Epi_start, Epi_end}. By merging the dups table back with the source table over ID only (by id;) we can determine which ID's have duplicates and which one not (so also the rows which are not duplicates themselves).
We only need the ID from table dups for this merge, we need some indicator per row if a specific row matches over id or not - that's what the IN data set option gives us. It allows us to define a variable (ind) which becomes 1 if a row matches and 0 if it doesn't.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: