Removing duplicates

rosejolly · Posted 11-03-2021 12:53 AM

Hi All,

I want to merge 2 dataset using proc sql. Before merging I want to remove duplicates in one dataset and then merge . Can this be done under single proc statement in proc sql.

SASKiwi · Posted 11-03-2021 01:16 AM

What do you mean by duplicates? Rows where all columns' values are the same or where certain key columns have the same values? SQL isn't very good at removing duplicate rows / key columns and there are better methods to do this.

rosejolly · Posted 11-03-2021 06:27 AM

Duplicates here I meant is certain key columns have the same values. Thanks for the reply

Kurt_Bremser · Posted 11-03-2021 07:34 AM

The blueprint is

proc sql;
create table want as
  select
    t1.*,
    t2.var,
    t2.varx
  from have1 t1 left join (
    select distinct id, var, varx
    from have2
  ) t2
  on t1.id = t2.id
;
quit;

To really be of help here, we need to see examples for your data and what you expect as a resulting table.

Post your incoming datasets as data steps with datalines, so we do not have to make guesses about your data.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

rosejolly · Posted 11-03-2021 07:49 AM

Thanks a lot, I think this will be helping out

Kurt_Bremser · Posted 11-03-2021 08:43 AM

Keep in mind that, as always, the task defines the tool. Depending on the layout, size and contents of your data, other means than SQL might be more efficient, or it might even be that SQL can't do what you want (SQL only knows DISTINCT, but PROC SORT makes a difference between NODUPREC and NODUPKEY).

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Removing duplicates

Re: Removing duplicates

Re: Removing duplicates

Re: Removing duplicates

Re: Removing duplicates

Re: Removing duplicates

Removing duplicates

Re: Removing duplicates

Re: Removing duplicates

Re: Removing duplicates

Re: Removing duplicates

Re: Removing duplicates

Registration is open

SAS Training: Just a Click Away