Hello team,
How can we do Left outer join in SAS without using sql?
Can someone help me with this?
Regards,
blueblue
Sure. Just do what we did before there was Proc SQL; use a Merge.
Here, I'm going to do essentially a left join. I'm joining in phone_number, but I want to keep the individual even if I can't get a phone number for them. Regrettably, I couldn't get a phone number for Joe Biden and Jeff Bezos for this example. 🙂
DATA Have;
LENGTH ID $4 Name $64 Address $64 City $64;
INFILE DATALINES MISSOVER DSD DLM='09'x;
INPUT ID $ Name $ Address $ City $;
DATALINES;
2418 Barbara Ann 1515 Android Lane Milwaulkee
2724 Walt Disney 48 IBM Way Anaheim
5429 John Q Public 24 Apple Str Des Moines
9527 Joe Biden 1600 Pennsylvania Avenue NW Washington D.C.
9610 Jeff Bezos 1 Blue Origin St Kent
;
RUN;
DATA Have2;
LENGTH ID $4 Phone_Number $12;
INFILE DATALINES MISSOVER DSD DLM='09'x;
INPUT ID $ Phone_Number $;
DATALINES;
2418 555-222-3333
2724 888-444-111
5429 999-876-5309
;
RUN;
PROC SORT DATA=Have;
BY ID;
RUN;
PROC SORT DATA=Have2;
BY ID;
RUN;
DATA Want;
MERGE Have (IN=Have)
Have2 (IN=Have2);
BY ID;
RUN;
Results below. Note how individuals without a phone number are still present in the results.
Jim
Sure. Just do what we did before there was Proc SQL; use a Merge.
Here, I'm going to do essentially a left join. I'm joining in phone_number, but I want to keep the individual even if I can't get a phone number for them. Regrettably, I couldn't get a phone number for Joe Biden and Jeff Bezos for this example. 🙂
DATA Have;
LENGTH ID $4 Name $64 Address $64 City $64;
INFILE DATALINES MISSOVER DSD DLM='09'x;
INPUT ID $ Name $ Address $ City $;
DATALINES;
2418 Barbara Ann 1515 Android Lane Milwaulkee
2724 Walt Disney 48 IBM Way Anaheim
5429 John Q Public 24 Apple Str Des Moines
9527 Joe Biden 1600 Pennsylvania Avenue NW Washington D.C.
9610 Jeff Bezos 1 Blue Origin St Kent
;
RUN;
DATA Have2;
LENGTH ID $4 Phone_Number $12;
INFILE DATALINES MISSOVER DSD DLM='09'x;
INPUT ID $ Phone_Number $;
DATALINES;
2418 555-222-3333
2724 888-444-111
5429 999-876-5309
;
RUN;
PROC SORT DATA=Have;
BY ID;
RUN;
PROC SORT DATA=Have2;
BY ID;
RUN;
DATA Want;
MERGE Have (IN=Have)
Have2 (IN=Have2);
BY ID;
RUN;
Results below. Note how individuals without a phone number are still present in the results.
Jim
This is a great input.
Thanks for this response:
Does this
MERGE Have (IN=Have)
Have2 (IN=Have2);
Mean all the rows in the output, regardless whether there is a match or not.
What is the syntax if you want to bring all the cases?
Regards,
blueblue
Please do study the documentation of the MERGE statement.
In short, MERGE reads observations side-by-side. HOW it does that is controlled by an (optional) BY statement; subsequent conditions can be used to keep observations selectively.
Hello,
I have studied the documentation. It was explained not very easy to grasp. I found a white paper in SAS community which helped me to understand the topic. Aslo, there is a forum member named Jimbourbar explained it very well.
so I have the concept now and tomorrow, I have to test the syntax.
Thanks for all.
blueblue
@GN0001 wrote:
Does this
MERGE Have (IN=Have) Have2 (IN=Have2);
Mean all the rows in the output, regardless whether there is a match or not.
Yes. Since I'm not using any conditions to limit the records in the output Data set, all rows are kept.
Jim
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.