DATA Step, Macro, Functions and more

How does one-to-many merge behave?

Reply
PROC Star
Posts: 1,760

How does one-to-many merge behave?

I always thought that when a variable is present in both tables, the latest (by order of appearance in the merge statement) table's value overwrites the other values in a merge statement. Not so it seems:



[pre]
proc sort data=SASHELP.CLASS out=TEST; * Build TEST table ;
by AGE; * WEIGHT is only populated for ;
run; * the last record of each AGE group ;
data TEST;
set TEST;
by AGE;
if not last.AGE then WEIGHT=.;
run;




data RESULT; * Merge one: one-to-many merge fails ;
merge TEST
TEST(keep=AGE WEIGHT where=(WEIGHT ne .));
by AGE;
run; * Second table values written to first group record only;




data RESULT; * Merge two: one-to-many merge succeeds ;
merge TEST(drop=WEIGHT)
TEST(keep=AGE WEIGHT where=(WEIGHT ne .));
by AGE;
run; * Second table values written to all records;
[/pre]

Can anyone explain this?
Frequent Contributor
Posts: 102

Re: How does one-to-many merge behave?

This is because the value that is kept is the last one read into the data vector. That is only partially reflected by the order of the merged datasets. Merge is not a cartesian join. If you want that use PROC SQL.

Take age 12 in your data for example,

First the James record is read from the first dataset with a missing wieght.
The second dataset has a matching value of Age, so it is read in with a wieght of 128.
The row is written.
Next the Jane record is read from the first table with its missing value for wieght, this overwrites the value of 128.
There is not another record for Age 12 in the second table, so nothing is read.
The row is written.
Next the John record is read from the first table with its missing value for wieght.
There is not another record for Age 12 in the second table, so nothing is read.
The row is written.
And so on.

Your solution of adding the drop statement prevents the first table from overwritting the value wieght, so the value from the second table is retain for all of the rows.

Clear as Mud?

Message was edited by: Curtis Mack Message was edited by: Curtis Mack
PROC Star
Posts: 1,760

Re: How does one-to-many merge behave?

Posted in reply to CurtisMack
Thanks Curtis.

Rows are read only once (if applicable) and retained (and potentially over-written), not read over and over as SQL does.

It makes perfect sense now.

A cartesian product is probably more useful (as well as more resource-intensive) than this behaviour, but as long as we are aware of what takes place ...
Super Contributor
Posts: 273

Re: How does one-to-many merge behave?

Chris,

Sas institute produce an interresting and short text about this question

see rule 1 on page 1 in

The fundamentals of merge

support.sas.com/techsup/technote/ts705.pdf

HTH
Andre
PROC Star
Posts: 1,760

Re: How does one-to-many merge behave?

Occasional Contributor
Posts: 16

Re: How does one-to-many merge behave?

Great document Andre .

Thanx for Sharing It :-)
Ask a Question
Discussion stats
  • 5 replies
  • 187 views
  • 0 likes
  • 4 in conversation