DATA Step, Macro, Functions and more

Merging Issue

Reply
New User
Posts: 1

Merging Issue

So, I'm familiar with merges in SAS, and haven't had issues before, but I noticed an issue today that has never been an issue before.
For some reason the actual merging of observations is working properly in more complex data sets, however it only lists the variable values from one of the data sets (e.g. it doesn't overwrite missing values).

For instance, I wrote up this simple program:

    data dataset1;
    input id var1 var2 var3 var4;
    cards;
    1 . . 2 2
    2 . . 2 2
    3 . . 2 2
    4 . . 2 2
    5 . . 2 2
    6 . . 2 2
    7 . . 2 2
    8 . . 2 2
    9 . 2 . 2
    10 1 . . .
    ;
    data dataset2;
    input id var1 var2 var3 var4;
    cards;
    1  2 2 . .
    2  2 2 . .
    3  2 2 . .
    4  2 2 . .
    5  2 2 . .
    6  2 2 . .
    7  2 2 . .
    8  2 2 . .
    10 . 1 . .
    ;
    
    
    data dataset3;
    merge dataset1 dataset2;
    by id;
    run;

This should yield the following:

    id var1 var2 var3 var4
    1   2    2    2    2
    2   2    2    2    2
    3   2    2    2    2
    4   2    2    2    2
    5   2    2    2    2
    6   2    2    2    2
    7   2    2    2    2
    8   2    2    2    2
    9   .    2    .    2
    10  1    1    .    .
   

but instead, I get this:
 

    id var1 var2 var3 var4
        1   2    2    .    .
        2   2    2    .    .
        3   2    2    .    .
        4   2    2    .    .
        5   2    2    .    .
        6   2    2    .    .
        7   2    2    .    .
        8   2    2    .    .
        9   .    2    .    2
        10  .    1    .    .

So, it's as if the merge is merging the observations and then just displaying the second data set's values.

I've tried to figure out the issue (I have a feeling it's something very basic I've just looked over), but I've no idea what's happening, since I've never come across the issue before.

Anyone know what's going wrong?

Thanks for any help.

Super User
Posts: 5,497

Re: Merging Issue

Posted in reply to Username11

Sorry to say, but MERGE is working the way it is supposed to.  Whatever value it reads in replaces what was there before.  Even if a missing value is read in, it replaces what was there before.

 

Since you have only one observation per ID in both data sets, try experimenting with it this way:

 

data dataset3;

update dataset1 dataset2;

by id;

run;

 

UPDATE has certain restrictions as to when it can be used, but it ignores missing values.

Contributor
Posts: 43

Re: Merging Issue

Posted in reply to Username11

Obs 1-8 in both datasets are duplicate records,  you will always get quirky results.  You need unique records on your by statement to get a clean merge.  The left most datasets will update the matching variables when the by statement records match, and create new rows when they don't.

 

 

Super User
Posts: 7,762

Re: Merging Issue

Posted in reply to Username11

Since you have a 1:1 relationship, and identically named variables in both datasets, the dataset named last in the merge statement will be the one that fills the PDV before the output.

To achieve what you want, you might also consider using proc sql and the SQL coalesce function.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 52

Re: Merging Issue

Posted in reply to Username11

In your code above, SAS is working as expected. As both the merging dataset has common variables and you have not put in any condition for merging(Alias) then SAS would overwrite the values already in PDS(came from the left dataset).


Cheers from India!

Manjeet
Ask a Question
Discussion stats
  • 4 replies
  • 225 views
  • 0 likes
  • 5 in conversation