BookmarkSubscribeRSS Feed
rosejolly
Fluorite | Level 6

Hi All,

I want to merge 2 dataset using proc sql. Before merging I want to remove duplicates in one dataset and then merge . Can this be done under single proc statement in proc sql.

5 REPLIES 5
SASKiwi
PROC Star

What do you mean by duplicates? Rows where all columns' values are the same or where certain key columns have the same values? SQL isn't very good at removing duplicate rows / key columns and there are better methods to do this.

rosejolly
Fluorite | Level 6

Duplicates here I meant is certain key columns have the same values. Thanks for the reply

Kurt_Bremser
Super User

The blueprint is

proc sql;
create table want as
  select
    t1.*,
    t2.var,
    t2.varx
  from have1 t1 left join (
    select distinct id, var, varx
    from have2
  ) t2
  on t1.id = t2.id
;
quit;

To really be of help here, we need to see examples for your data and what you expect as a resulting table.

Post your incoming datasets as data steps with datalines, so we do not have to make guesses about your data.

rosejolly
Fluorite | Level 6

Thanks a lot, I think this will be helping out

Kurt_Bremser
Super User

Keep in mind that, as always, the task defines the tool. Depending on the layout, size and contents of your data, other means than SQL might be more efficient, or it might even be that SQL can't do what you want (SQL only knows DISTINCT, but PROC SORT makes a difference between NODUPREC and NODUPKEY).

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1234 views
  • 1 like
  • 3 in conversation