BookmarkSubscribeRSS Feed
nirupama1
Fluorite | Level 6
data ex1;
input ID Score;
cards;
1 25
2 28
3 35
4 45
;
run;
data ex2;
input ID Score;
cards;
1 95
2 97
;
run;


data comb;
merge ex1 ex2;
by ID;
run;

 

I read that if there are common variables in both datasets,the value is overwitten by the value in the right dataset.

So, I expect output as

 

1 95

2 97

3 35

4 45

 

But I am getting output as

 

1 25

2 28

3 35

4 45

 

Can someone explain why?

 

Thanks,

Nirupama

 

6 REPLIES 6
novinosrin
Tourmaline | Level 20

The below is what i got when i ran your code

ID=1 Score=95

ID=2 Score=97

ID=3 Score=35

ID=4 Score=45

Reeza
Super User

It does work as intended, so here's a couple of possibilities:

 

1. You're looking at the wrong data set - ex1 not comb

2. Code didn't run and this is an old data set - check your log

3. Your data isn't as shown for some reason or the code was mistyped. 

 

Either way, if you can't figure it out I suggest posting the log from your run. 

 


@nirupama1 wrote:
data ex1;
input ID Score;
cards;
1 25
2 28
3 35
4 45
;
run;
data ex2;
input ID Score;
cards;
1 95
2 97
;
run;


data comb;
merge ex1 ex2;
by ID;
run;

 

I read that if there are common variables in both datasets,the value is overwitten by the value in the right dataset.

So, I expect output as

 

1 95

2 97

3 35

4 45

 

But I am getting output as

 

1 25

2 28

3 35

4 45

 

Can someone explain why?

 

Thanks,

Nirupama

 


 

singhsahab
Lapis Lazuli | Level 10

Hi Nirupama,

 

as per your code we are getting output as you expected .

 

ID
 
Score
 
 
195  
296  
335  
445 

 

If you wanted to stop overwrite , you should use proc SQL like below. 

 

proc SQL;
select coalesce(a.id,b.id) as id , coalesce(a.Score,b.Score) as score from ex1 as a left join ex2 as b on
a.id = b.id;
quit;

 

This will return output as :

 

id score
125
228
335
445

 

Thanks,

 

novinosrin
Tourmaline | Level 20

While I'm not so creative like you are to offer so fancy alternatives, I;m not sure the creativitiy requires heavy usage of coalesce and beyond assignments unless there are missing score values but not wanting to overwrite in itself is defeated by coalesce even though it's missing as missing can be valid. So,. It can be as simple as just select a.* 

 


proc SQL;
create table want as 
select a.* 
from ex1 as a left join ex2 as b 
on a.id = b.id;
quit;

 

 


@singhsahab wrote:

 

If you wanted to stop overwrite , you should use proc SQL like below. 

 

proc SQL;
select coalesce(a.id,b.id) as id , coalesce(a.Score,b.Score) as score from ex1 as a left join ex2 as b on
a.id = b.id;
quit;

 

This will return output as :

 

id score
1 25
2 28
3 35
4 45

 

Thanks,

 


 

hashman
Ammonite | Level 13

@nirupama1:

Your code actually works as expected.

However, provided that the ID key-values in EX1 are unique (as is the case with your sample), the tool SAS intends for this kind of job is not MERGE but UPDATE:

data comb ;       
  update ex1 ex2 ;
  by ID ;         
run ;             

One of the practical rules Ian Whitlock has formulated for MERGE is that, in order to avoid unexpected results of overwriting, only the BY variables in the data sets being merged should be the same.

 

Paul D.

 

kiranv_
Rhodochrosite | Level 12

you can use sql update too 

proc sql;
update ex1 a
set score = (select  b.score from ex2 b
                        
                          where a.id = b.id)
where id in (select id from ex2);

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1713 views
  • 0 likes
  • 6 in conversation