Solved: How to find all the records of certain ID with a certain characteristi... - Page 3

novinosrin · Posted 09-22-2020 03:58 PM

Hi @Primavera You are not alone. We all have had those moments too. It's fun to make mistakes. Do not hesitate to throw open a thread if you need help. That's what keeps the community alive. I ask questions all the time:)

Reeza · Posted 01-22-2020 06:07 PM

You don't to create the HAVE table, that's your input data set of 300,000 records. And 300K records is trivial for SAS....very, very trivial 🙂

Patrick · Posted 01-22-2020 06:01 PM

Below an approach creating 3 tables.

- One table with duplicate records

- One table will all IDs with duplicate records

- One table with all IDs with no duplicate records

data have;
  input ID  (Epi_start  Epi_end) (:date9.)  Epi_Number;
  format Epi_start  Epi_end date9.;
  cards;
1 21-Aug-10 25-Aug-10 1
2 27-Aug-10 10-Sep-10 1
3 13-Sep-10 18-Sep-10 1
3 01-Oct-10 11-Oct-10 2
3 14-Oct-10 19-Oct-10 3
3 15-Dec-12 18-Dec-10 1
4 18-Oct-10 22-Oct-10 1
4 18-Oct-10 22-Oct-10 2
5 19-Oct-10 29-Oct-10 1
5 13-Dec-10 01-Jan-11 2
5 13-Dec-10 01-Jan-11 3
5 16-Feb-11 28-Feb-11 4
6 01-Dec-10 09-Dec-10 1
7 11-Dec-10 17-Dec-10 1
7 15-Oct-11 22-Oct-11 1
7 01-Dec-11 10-Dec-11 2
7 20-Jun-12 29-Jun-12 1
7 29-Jul-12 08-Aug-12 2
7 07-Sep-12 16-Sep-12 3
8 19-Dec-10 25-Dec-10 1
8 19-Dec-10 25-Dec-10 2
8 19-Dec-10 25-Dec-10 3
8 17-Jan-13 25-Jan-13 1
9 01-Mar-11 10-Mar-11 1
9 25-Mar-11 30-Mar-11 2
;

proc sort data=have out=dups NOUNIQUEKEY;
  by id Epi_start  Epi_end;
run;

data noDupIDs DupIDs;
  merge have dups(in=ind keep=id);
  by id;
  if ind then output DupIDs;
  else output noDupIDs;
run;

proc print data=DupIDs;
run;

Primavera · Posted 01-23-2020 03:52 PM

Very nice. Thanks. I did not know about nouniquekey! Can you please kindly explain the line:

merge have dups(in=ind keep=id);

Thank you.

Patrick · Posted 01-23-2020 06:22 PM

@Primavera wrote:

Very nice. Thanks. I did not know about nouniquekey! Can you please kindly explain the line:
merge have dups(in=ind keep=id);
Thank you.

The dups table contains records with duplicate keys {id, Epi_start, Epi_end}. By merging the dups table back with the source table over ID only (by id;) we can determine which ID's have duplicates and which one not (so also the rows which are not duplicates themselves).

We only need the ID from table dups for this merge, we need some indicator per row if a specific row matches over id or not - that's what the IN data set option gives us. It allows us to define a variable (ind) which becomes 1 if a row matches and 0 if it doesn't.

Re: How to find all the records of certain ID with a certain characteristics

Re: How to find all the records of certain ID with a certain characteristics

Re: How to find all the records of certain ID with a certain characteristics

Re: How to find all the records of certain ID with a certain characteristics

Re: How to find all the records of certain ID with a certain characteristics

Registration is open

Re: How to find all the records of certain ID with a certain characteristics

Re: How to find all the records of certain ID with a certain characteristics

Re: How to find all the records of certain ID with a certain characteristics

Re: How to find all the records of certain ID with a certain characteristics

Re: How to find all the records of certain ID with a certain characteristics

Registration is open

SAS Training: Just a Click Away