BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pinkyc
Calcite | Level 5


Input data named pat_detail

Pat_ID     Dis_Date     Linker               Pat_Age     Zip_code

A11          03/28/12     A11 03/28/12     39               000

A11          03/28/12     A11 03/28/12     39               000    

A12          04/19/13     A12 04/19/13     43               111

A13          05/20/14     A13 05/20/14     44               222

I have a table with the following variables and several more observations but some of them are duplicates as shown in example for Pat_ID A11.  I did this to check for the Pat_ID's that had duplicates and if so, how many.

proc sql;

create table duplicate as

select distinct PAT_ID, dis_date, linker, count(*) as count

from pat_detail

group by recip_id, discharge_date

Having count>=2;

quit;

duplicate table data:

Pat_ID    Dis_Date     Linker          Count

A11       03/28/12     A11 03/28/12     2

How do I delete the observations in Pat_detail that are duplicates based on checking Pat_ID, dis_date and linker?

1 ACCEPTED SOLUTION

Accepted Solutions
MaikH_Schutze
Quartz | Level 8

Hi,

The NODUPKEY option in PROC SORT should do the trick. For example:

proc sort data=pat_detail out=pat_detail_new nodupkey;

by pat_ID dis_date linker;

run;

The OUT= option creates a new data set with only the unique combinations of the BY variables.

Hope this helps,

M.

View solution in original post

5 REPLIES 5
stat_sas
Ammonite | Level 13

proc sql;

select distinct * from pat_detail;

quit;

MaikH_Schutze
Quartz | Level 8

Hi,

The NODUPKEY option in PROC SORT should do the trick. For example:

proc sort data=pat_detail out=pat_detail_new nodupkey;

by pat_ID dis_date linker;

run;

The OUT= option creates a new data set with only the unique combinations of the BY variables.

Hope this helps,

M.

Patrick
Opal | Level 21

What if you have for some data quality issues duplicate keys for "Pat_ID, dis_date and linker" but then also different values in these records for "Pat_Age"? To be on the safe side I would only remove duplicate records (all variables having the same value).

data have;

  infile datalines truncover;

  input Pat_ID $ Dis_Date:mmddyy. Linker $ Pat_Age:mmddyy. Zip_code $;

  format Dis_Date date9.;

  datalines;

A11 03/28/12 A11 03/28/12 39 000

A11 03/28/12 A11 03/28/12 39 000

A11 03/28/12 A11 03/28/12 49 000

A12 04/19/13 A12 04/19/13 43 111

A13 05/20/14 A13 05/20/14 44 222

;

run;

/* remove duplicates */

proc sort data=have out=want dupout=duplicates nodupkey;

  by _all_;

run;

/* DQ issues */

proc sql;

  create table DQ_issue as

    select * from want

    group by Pat_ID, dis_date, linker

    having count(*)>=2

  ;

quit;

Loko
Barite | Level 11

Hello,

Depending on your interest you may choose the appropriate calculation for the variables not included in the key.

data have;

input Pat_ID   $  Dis_Date $     Linker    & $12.     Pat_Age     Zip_code ;

datalines;

A11          03/28/12     A11 03/28/12     39               000

A11          03/28/12     A11 03/28/12     43               000   

A11          03/28/12     A11 03/28/12     44               001

A12          04/19/13     A12 04/19/13     43               111

A13          05/20/14     A13 05/20/14     44               222

;

proc sql;

select pat_ID, Dis_Date,  Linker, max(Pat_Age) , count(Zip_code)

from have

group by pat_ID, Dis_Date,  Linker

;

quit;

Tanmay
Calcite | Level 5

proc sort data=pat_detail out=pat_detail_new nodupkey;

by pat_ID dis_date linker;

run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 12679 views
  • 3 likes
  • 6 in conversation