SAS Procedures

Solph · Posted 01-15-2013 02:24 AM

Hi,

Can I have inner join and left join in one step in PROC SQL, or I've to do in 2 steps or do with subquery?

Here is an example.

1. I want ID in both tables A and B (so just ID=1 and 2), so it's inner join (in data step it's in A and B)

2. I then want to link the cases to table C to get their breakfast category. Since it's likely the category in C is not complete, I'd need to use left join to keep records (i.e. mushrrom) from step 1 (in data step it's in A).

data a; input id x1;

datalines;

1 11

2 22

;

data b; input id x2 breakfast $;

datalines;

1 1 apple

2 1 carrot

2 1 mushroom

3 1 pepper

;

data c; input breakfast $ category $;

datalines;

apple fruit

orange fruit

carrot vegetable

pepper vegetable

;

Data wanted:

ID breakfast category

1 apple fruit

2 carrot vegetable

2 mushroom .

SAS code in 2 steps:

proc sql; create table d as

select a.id, b.breakfast

from a as a, b as b

where a.id=b.id;

select d.id, d.breakfast, c.category

from d as d left join c as c

on d.breakfast=c.breakfast;

quit;

proc print data=want noobs; run;

A related question is can I do inner join (in A and B) and in A but not C in one step? I read that EXCEPT can do the job, but when I copied the example code, it didn't seem right. For the fruit example above, the data set should look like

ID breakfast category

2 mushroom .

Many thanks in advance.

LinusH · Posted 01-15-2013 02:38 AM

It seems that you confuse left join with except logic, which do you want?

And yes, you can nestle joins in the same SQL step, but I recommend that you you explicit join syntax (i.e. a inner join b on...) to avoid unexpected results.

Data never sleeps

Florent · Posted 01-15-2013 07:59 AM

Hi,

Would the following piece of code do what you are expecting ?

proc sql;

create table d as

select a.id

, b.breakfast

, c.category

from a

inner join b

on a.id = b.id

left join c

on b.breakfast = c.breakfast;

quit;

data e;

set d (where=(missing(category)));

run;

Kind regards,

Florent

NagendraKumarK · Posted 01-15-2013 11:06 AM

proc sql; create table res as ( select x.id,x.breakfast,c.category from (select a.id, b.breakfast from a join b on a.id=b.id) x left join c on x.breakfast=c.breakfast ); run;

Solph · Posted 01-15-2013 01:29 PM

Thanks NagendraKumarK for the subquery code. And thanks especially Florent for the code, which is exactly what I want, for my first question.

Just a question though, if it possible to do If A and B on ID, and If A but not C on breakfast in one code? I see Florent's code in two steps. Possibly in one step.

LinusH, sorry my questions weren't phrased clearly. It's two part questions. First how do I do inner join and left join in one code (I've never seen sample code as Florent's, so it really helped). Second, how do I do inner join (If A and B) and If A but not C in one code. Florent's two-step code is great. But I'm wondering it it can be done in one step.

Florent · Posted 01-15-2013 05:47 PM

The first step (table d) contains everything which is in A and B, and potentially in C. When there is no match with the table C then the content of the category field will be empty.

If you want to have table e created at the same time as table d (so in one step), then you should use make a dastep using the merge statement and play with the inA, inB, inC variable you can create at that time. The disadvantage of making a datastep instead of a proc sql is that it requires you firstly sort tables a, b and c in the same order... so in the end you would have 3 sort statements + 1 datastep. There is no way to create two datasets at a time with a "create table" statement of the proc sql.

I hope it answers to your questions.

LinusH · Posted 01-16-2013 06:40 AM

Second question, just combine an inner join with a sub-query (not in() ).

Data never sleeps

Solph · Posted 01-16-2013 02:59 PM

Thanks for the replies. I've a related question still about doing various merges, hoping to find NO NEED to do sub-queries. The example above is a simpler one as it is always using A to inner join with B, and to left join with C. But the example below doesn't always start with A. I get the feeling that sub-queres are the only way to go in SQL, or use data step. Just want to confirm.

data a; input patient_id hospital_id los;

datalines;

1 100 1

1 102 2

2 100 4

3 101 5

3 103 10

;

data b; input patient_id hospital_id wtcases;

datalines;

1 100 1.5

1 102 2.2

2 100 1.1

3 103 4.1

;

data c; input hospital_id postcode $;

datalines;

100 58823

101 45000

102 45001

;

data d; input postcode $ area;

datalines;

58823 51

45000 52

99999 99

;

*The code below actualy wouldn't run uless I take out what's after SELECT, and use just SELECT *. Then I see it's outputting all combinations and realized the FROM statement isn't doing what I thought SQL could do.

Proc sql; create table want as

select a.id, a.los, b.hospital_id, b.wtcases, c.postcode, d.area

from a inner join b on a.patient_id=b.patient_id and a.hospital_id=b.hospital_id,

b left join c on b.hospital_id=c.hospital_id,

c left join d on c.postcode=d.postcode;

quit;

proc print data=a noobs; run;

proc print data=b noobs; run;

proc print data=c noobs; run;

proc print data=d noobs; run;

proc print data=want; run;.

Florent · Posted 01-16-2013 04:39 PM

When using the "inner join"/"left join"/"right join" syntax, you cannot put a comma after the join condition(s). Furthermore, you don't have to repeat the table names which have already been used in a join. With your code, it gives:

Proc sql;

create table want as

select a.id, a.los, b.hospital_id, b.wtcases, c.postcode, d.area

from a

inner join b

on a.patient_id = b.patient_id

and a.hospital_id = b.hospital_id

left join c

on b.hospital_id = c.hospital_id

left join d

on c.postcode=d.postcode;

quit;

proc print data=a noobs; run;

proc print data=b noobs; run;

proc print data=c noobs; run;

proc print data=d noobs; run;

proc print data=want; run;

Solph · Posted 01-17-2013 09:19 PM

Thanks a million. SQL is indeed very powerful. The code worked for a large part. What it didn't work is the following. If you have answers for it, let me know. Otherwise I'll change to answered.

1. I need to change from

SELECT a.id, a.los, b.hospital_id, b.wtcases, c.postcode, d.area

to

SELECT *

otherwise it's giving an error message "ERROR: Column id could not be found in the table/view identified with the correlation name A" even though there is nothing wrong.

2. It's outputting the following cases, I thought inner join A and B by patient ID and Hospital ID would get rid of patient ID=3 all together. But it got rid of the one from hospital 101 and kept the one from hospital 103.

patient_id	hospital_id	los	wtcases	postcode	area
3	103	10	4.1		.
1	102	2	2.2	45001	.
2	100	4	1.1	58823	51
1	100	1	1.5	58823	51

Anyway thanks very much for the reply. I didn't know I could use SQL this way. Let me know if there are answers to the above questions. Otherwise I'll close this discussion.

Florent · Posted 01-18-2013 07:51 AM

Hi,

Concerning the first point, it is clear that the "ID" is not available in table A. It should be replaced by "patient_id".

For the second bullet point, I do not agree with the fact that the line concerning hospital 103 and patient 3 should be excluded from the table. If you look at tables A and B, you will find this patient-hospital combination in both of them. Thus this is logical to find it back in your final table when using inner joins.

I hope it helps.

SAS Procedures

Proc SQL - inner join and left join in one step

Re: Proc SQL - inner join and left join in one step

Re: Proc SQL - inner join and left join in one step

Re: Proc SQL - inner join and left join in one step

Re: Proc SQL - inner join and left join in one step

Re: Proc SQL - inner join and left join in one step

Re: Proc SQL - inner join and left join in one step

Re: Proc SQL - inner join and left join in one step

Re: Proc SQL - inner join and left join in one step

Re: Proc SQL - inner join and left join in one step

Re: Proc SQL - inner join and left join in one step

CAS is Fast - Left Join

SAS PROC SQL LEFT JOIN

Convert left join to hash data step

Left Join/ Update Problem

Is there a difference between LEFT Outer JOIN and RIGHT Outer JOIN?

Follow Us

What is...

SAS Procedures

Our biggest data and AI event of the year.

SAS Training: Just a Click Away

Follow Us

What is...