Help using Base SAS procedures

Proc SQL - inner join and left join in one step

Reply
Frequent Contributor
Posts: 101

Proc SQL - inner join and left join in one step

Hi,

Can I have inner join and left join in one step in PROC SQL, or I've to do in 2 steps or do with subquery?

Here is an example.

1. I want ID in both tables A and B (so just ID=1 and 2), so it's inner join (in data step it's in A and B)

2. I then want to link the cases to table C to get their breakfast category. Since it's likely the category in C is not complete, I'd need to use left join to keep records (i.e. mushrrom) from step 1 (in data step it's in A).

data a; input id x1;

datalines;

1 11

2 22

;

data b; input id x2 breakfast $;

datalines;

1 1 apple

2 1 carrot

2 1 mushroom

3 1 pepper

;

data c; input breakfast $ category $;

datalines;

apple  fruit

orange fruit

carrot  vegetable

pepper vegetable

;

Data wanted:

ID  breakfast    category

1    apple           fruit

2    carrot          vegetable

2    mushroom    .

SAS code in 2 steps:

proc sql; create table d as

    select  a.id, b.breakfast   

    from a as a, b as b

    where a.id=b.id;

    select  d.id, d.breakfast, c.category   

    from d as d left join c as c

    on d.breakfast=c.breakfast;

quit;

proc print data=want noobs; run;

A related question is can I do inner join (in A and B) and in A but not C in one step? I read that EXCEPT can do the job, but when I copied the example code, it didn't seem right. For the fruit example above, the data set should look like

  ID  breakfast    category

   2   mushroom  .

Many thanks in advance.

Super User
Posts: 5,260

Re: Proc SQL - inner join and left join in one step

It seems that you confuse left join with except logic, which do you want?

And yes, you can nestle joins in the same SQL step, but I recommend that you you explicit join syntax (i.e. a inner join b on...) to avoid unexpected results.

Data never sleeps
Frequent Contributor
Posts: 127

Re: Proc SQL - inner join and left join in one step

Hi,

Would the following piece of code do what you are expecting ?

proc sql;

create table d as

    select  a.id

    ,          b.breakfast

    ,          c.category

    from a

    inner join b

         on a.id = b.id

    left join c

         on b.breakfast = c.breakfast;

quit;

data e;

     set d (where=(missing(category)));

run;

Kind regards,

Florent

Occasional Contributor
Posts: 14

Re: Proc SQL - inner join and left join in one step

proc sql; create table res as ( select x.id,x.breakfast,c.category from (select a.id,  b.breakfast from a join b on a.id=b.id) x left join c on x.breakfast=c.breakfast ); run;

Frequent Contributor
Posts: 101

Re: Proc SQL - inner join and left join in one step

Thanks NagendraKumarK for the subquery code. And thanks especially Florent for the code, which is exactly what I want, for my first question.

Just a question though, if it possible to do If A and B on ID, and If A but not C on breakfast in one code? I see Florent's code in two steps. Possibly in one step.

LinusH, sorry my questions weren't phrased clearly. It's two part questions. First how do I do inner join and left join in one code (I've never seen sample code as Florent's, so it really helped). Second, how do I do inner join (If A and B) and If A but not C in one code. Florent's two-step code is great. But I'm wondering it it can be done in one step.

Frequent Contributor
Posts: 127

Re: Proc SQL - inner join and left join in one step

The first step (table d) contains everything which is in A and B, and potentially in C. When there is no match with the table C then the content of the category field will be empty.

If you want to have table e created at the same time as table d (so in one step), then you should use make a dastep using the merge statement and play with the inA, inB, inC variable you can create at that time. The disadvantage of making a datastep instead of a proc sql is that it requires you firstly sort tables a, b and c in the same order... so in the end you would have 3 sort statements + 1 datastep. There is no way to create two datasets at a time with a "create table" statement of the proc sql.

I hope it answers to your questions.

Super User
Posts: 5,260

Re: Proc SQL - inner join and left join in one step

Second question, just combine an inner join with a sub-query (not in() ).

Data never sleeps
Frequent Contributor
Posts: 101

Re: Proc SQL - inner join and left join in one step

Thanks for the replies. I've a related question still about doing various merges, hoping to find NO NEED to do sub-queries. The example above is a simpler one as it is always using A to inner join with B, and to left join with C. But the example below doesn't always start with A. I get the feeling that sub-queres are the only way to go in SQL, or use data step. Just want to confirm.

data a; input patient_id hospital_id los;

datalines;

1 100 1

1 102 2

2 100 4

3 101 5

3 103 10

;

data b; input patient_id hospital_id wtcases;

datalines;

1 100 1.5

1 102 2.2

2 100 1.1

3 103 4.1

;

data c; input hospital_id postcode $;

datalines;

100 58823

101 45000

102 45001

;

data d; input postcode $ area;

datalines;

58823  51

45000  52

99999  99

;

*The code below actualy wouldn't run uless I take out what's after SELECT, and use just SELECT *. Then I see it's outputting all combinations and realized the FROM statement isn't doing what I thought SQL could do.

Proc sql; create table want as

    select a.id, a.los, b.hospital_id, b.wtcases, c.postcode, d.area

    from a inner join b on a.patient_id=b.patient_id and a.hospital_id=b.hospital_id,

         b left join c on b.hospital_id=c.hospital_id,

         c left join d on c.postcode=d.postcode;

quit;

proc print data=a noobs; run;

proc print data=b noobs; run;

proc print data=c noobs; run;

proc print data=d noobs; run;

proc print data=want; run;.

Frequent Contributor
Posts: 127

Re: Proc SQL - inner join and left join in one step

When using the "inner join"/"left join"/"right join" syntax, you cannot put a comma after the join condition(s). Furthermore, you don't have to repeat the table names which have already been used in a join. With your code, it gives:

Proc sql;

    create table want as

    select a.id, a.los, b.hospital_id, b.wtcases, c.postcode, d.area

    from a

    inner join b

           on a.patient_id = b.patient_id

         and a.hospital_id = b.hospital_id

    left join c

           on b.hospital_id = c.hospital_id

    left join d

           on c.postcode=d.postcode;

quit;

proc print data=a noobs; run;

proc print data=b noobs; run;

proc print data=c noobs; run;

proc print data=d noobs; run;

proc print data=want; run;

Frequent Contributor
Posts: 101

Re: Proc SQL - inner join and left join in one step

Thanks a million. SQL is indeed very powerful. The code worked for a large part. What it didn't work is the following. If you have answers for it, let me know. Otherwise I'll change to answered.

1. I need to change from

     SELECT a.id, a.los, b.hospital_id, b.wtcases, c.postcode, d.area

     to

     SELECT *

     otherwise it's giving an error message "ERROR: Column id could not be found in the table/view identified with the correlation name A" even though there is nothing wrong.

2. It's outputting the following cases, I thought inner join A and B by patient ID and Hospital ID would get rid of patient ID=3  all together. But it got rid of the one from hospital 101 and kept the one from hospital 103.

patient_idhospital_idloswtcasespostcodearea
3103104.1.
110222.245001.
210041.15882351
110011.55882351

Anyway thanks very much for the reply. I didn't know I could use SQL this way. Let me know if there are answers to the above questions. Otherwise I'll close this discussion.

Frequent Contributor
Posts: 127

Re: Proc SQL - inner join and left join in one step

Hi,


Concerning the first point, it is clear that the "ID" is not available in table A. It should be replaced by "patient_id".

For the second bullet point, I do not agree with the fact that the line concerning hospital 103 and patient 3 should be excluded from the table. If you look at tables A and B, you will find this patient-hospital combination in both of them. Thus this is logical to find it back in your final table when using inner joins.

I hope it helps.

Ask a Question
Discussion stats
  • 10 replies
  • 1856 views
  • 1 like
  • 4 in conversation