Hi all, I want to know how can join 2 tables to one tables by using left join. I have my patient cohort and want to join 2 other datasets to get more variables. The number of observations will remain the same just the variable number will increase.
My code looks like this.. but it is incomplete. Is there a way I can use multiple left joins ?
Is there a way to combine both the dataset ?
PS. b.code in both the steps have different values for code, one is coming from dataset1 and the other is coming from dataset2.
Thanks
/*LEFT JOIN TO GET CODES FROM DATASET1 */
proc sql ;
create table _01_esrd_dial_codes as
select distinct a.*, b.code
from _01_pt as a
left join DATASET1 as b
on a.enrolid = b.enrolid ;
quit ;
/*LEFT JOIN TO GET CODES FROM DATASET2 */
proc sql ;
create table _01_JOIN2 as
select distinct a.*, b.code
from _01_pt as a
left join DATASET2 as b
on a.enrolid = b.enrolid ;
quit ;
Something like:
proc sql ; create table _01_esrd_dial_codes as select distinct a.*, b.code,c.code as c_code from _01_pt as a left join dataset1 as b on a.enrolid = b.enrolid left join dataset2 as c on a.enrolid=c.enrolid; quit;
You can keep adding left or right of full joins on as much as you like up to about 20 something joins.
Something like:
proc sql ; create table _01_esrd_dial_codes as select distinct a.*, b.code,c.code as c_code from _01_pt as a left join dataset1 as b on a.enrolid = b.enrolid left join dataset2 as c on a.enrolid=c.enrolid; quit;
You can keep adding left or right of full joins on as much as you like up to about 20 something joins.
This works but i also want to stack b.code and c_code. Is that possible in the same step as the left join
Yes. I would do it like this, just clearer for me:
proc sql ; create table _01_esrd_dial_codes as select distinct a.*,
b.code from _01_pt as a left join (select enrolid,
code
from dataset1 union all
select enrolid,
code
from dataset2) b
on a.enrolid=b.enrolid; quit;
Note that this assumes the same type in both dataset1 and 2.
If your datasets are large, then there is a good case for not using SQL at all. SAS implementation of SQL (from what I recal) is that it is all done be loading and writing into intermediary tables, so when data gets larger then it takes longer to process.
The question raised here was "how do I do xyz", not should I be doing "xyz". IN case of big datasets I would set both sets of data together with a category defining which dataset of orign, then merge the two sets of data using a datastep merge, this would be the most efficient method to get the final result.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.