10-31-2013 08:55 AM
I am trying to figure out if there is way to get same results as operating inner join on three tables in SQL by means of match-merge in DATA step?
As long as data set is sorted on BY variable, and subsetting IF condition in used on correct variable, it is straightforward to operate inner join for two data sets in DATA step. However is it possible to add third data set as well? What effect will it have to introduce two variables in BY statement of MERGE?
10-31-2013 09:13 AM
"As long as data set is sorted on BY variable, and subsetting IF condition in used on correct variable, it is straightforward to operate inner join for two data sets in DATA step"
Data Step merge is running on completely different mechanism From Proc SQL. Resultwise is only comparable if it is "one to one" or "one to many" join or merge, datastep merge will give you something unexpected if it is 'many to many' merge.
Back to your question: "is it possible to add third data set as well?", the answer is 'YES', it is possible to add more than 3 tables as long as they share the 'by' variable, which again proc sql join does not have this restriction.
Finally, is there anything comparable to Proc SQL join in Data Step regime? Yes, check docs on Hash() method.
10-31-2013 09:33 AM
If you are going to be doing a lot of PROC SQL programming, then I would recommend taking a look at Kirk Lafler's book:
He just came out with the second edition of this classic text.
Best of luck in all your SAS endeavors!
(aka Michael A. Raithel)
10-31-2013 10:06 AM
To expand on Hai.kuo's response:
Using the DATA step MERGE:
1. You can use two or more variables in the BY statement, and all of the variables must match to merge the data together.
2. You can merge three datasets together, but you should make sure that at least two of them have only one row per BY variables, ie. one-to-one match. Make sure to check your saslog. SAS will give a warning if "more than one data set have multiples of the BY variable". When this happens, the results are not at all what you would expect or want!
3. Definitely learn to use PROC SQL to do more complex merges (JOINs) of data. See MMMMIIIIKKKKEEEE's refernce to Kirk Lafler's book. Make some simple test files with multiple rows of your BY variables and learn what the results look like, and compare to what the DATA MERGE does.