DATA Step, Macro, Functions and more

DATA STEP merge on composite key?

Accepted Solution Solved
Reply
Contributor
Posts: 71
Accepted Solution

DATA STEP merge on composite key?

How do I merge datasets on more than one key?

 

Something equivalent to:

 

PROC SQL;
  SELECT *
  FROM table_a JOIN table_b ON (table_a.col1 = table_b.col1 AND table_a.col2 = table_b.col2);
QUIT;

Accepted Solutions
Solution
‎04-04-2018 12:18 PM
Super User
Posts: 6,537

Re: DATA STEP merge on composite key?

Posted in reply to tomcmacdonald

The equivalent would be along these lines.

 

The data sets must be in order first.  If not, sort them:

 

proc sort data=table_a;

by col1 col2;

run;

proc sort data=table_b;

by col1 col2;

run;

 

Then MERGE:

 

data want;

merge table_a (in=found_in_table_a)

table_b (in=found_in_table_b);

by col1 col2;

run;

 

You can also add to the DATA step, utilizing the temporary flags (in this case named FOUND_IN_TABLE_A and FOUND_IN_TABLE_B).  Those flags could subset the records to get the equivalent of left, right, or inner joins.

 

One thing that is different in a DATA step vs. SQL:  the variables COL1 and COL2 should have same attributes in both incoming data sets.  Results are not guaranteed, for example, if COL1 has a different length in TABLE_A vs. in TABLE_B.

View solution in original post


All Replies
Solution
‎04-04-2018 12:18 PM
Super User
Posts: 6,537

Re: DATA STEP merge on composite key?

Posted in reply to tomcmacdonald

The equivalent would be along these lines.

 

The data sets must be in order first.  If not, sort them:

 

proc sort data=table_a;

by col1 col2;

run;

proc sort data=table_b;

by col1 col2;

run;

 

Then MERGE:

 

data want;

merge table_a (in=found_in_table_a)

table_b (in=found_in_table_b);

by col1 col2;

run;

 

You can also add to the DATA step, utilizing the temporary flags (in this case named FOUND_IN_TABLE_A and FOUND_IN_TABLE_B).  Those flags could subset the records to get the equivalent of left, right, or inner joins.

 

One thing that is different in a DATA step vs. SQL:  the variables COL1 and COL2 should have same attributes in both incoming data sets.  Results are not guaranteed, for example, if COL1 has a different length in TABLE_A vs. in TABLE_B.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 1 reply
  • 80 views
  • 0 likes
  • 2 in conversation