Solved: When I perform a left/right outer join in PROC SQL, I seem to be losin...

fpb1 · Posted 10-18-2021 12:50 PM

Specifically, in data set C I am expecting the value of type in observation 7 to be X, not missing.

```

data a; 
input model type $ price;
datalines; 
1 W 84
2 R 143
3 R 113
4 W 74
5 W 76
6 C 93
;

data b;
input type $ discount;
datalines;
C .25
R .3
W .2
X .5
;
run;

PROC SQL;
CREATE TABLE C AS
SELECT *
FROM A RIGHT JOIN B
ON A.type=B.type;
QUIT;
run;

HB · Posted 10-18-2021 01:12 PM

Try running it as

PROC SQL;
CREATE TABLE C AS
SELECT b.*, a.*
FROM A RIGHT JOIN B
ON A.type=B.type;
QUIT;
run;

and see if that isn't more what you thought you would get.

View solution in original post

ballardw · Posted 10-18-2021 01:07 PM

@fpb1 wrote:

Specifically, in data set C I am expecting the value of type in observation 7 to be X, not missing.

```
data a; 
input model type $ price;
datalines; 
1 W 84
2 R 143
3 R 113
4 W 74
5 W 76
6 C 93
;

data b;
input type $ discount;
datalines;
C .25
R .3
W .2
X .5
;
run;

PROC SQL;
CREATE TABLE C AS
SELECT *
FROM A RIGHT JOIN B
ON A.type=B.type;
QUIT;
run;

Best would be to provide an example of what you want for the output. Note that depending on who runs the code row "7" could have different values entirely.

With your code example I get this in the log:

121  PROC SQL;
122  CREATE TABLE C AS
123  SELECT *
124  FROM A RIGHT JOIN B
125  ON A.type=B.type;
WARNING: Variable type already exists on file WORK.C.
NOTE: Table WORK.C created, with 7 rows and 4 columns.

which happens because BOTH data sets have a variable Type and you have not specified which data set you want the value from for the variable Type (for which rows). So typically first set with the variable provides the values, in this set A.

Also since you are requiring a match on type there is none for Type=X in data set B.

Perhaps you are looking for something more like:

PROC SQL;
CREATE TABLE C AS
SELECT coalescec(a.type,b.type) as type, price, discount
FROM A RIGHT JOIN B
ON A.type=B.type;
QUIT;

The coalesecec (and for numeric values, coalesce) returns the first value in the list. So in this case when a.type is missing it looks for b.type.

Serious concern at the validity of the join if you have multiple Type values without matches though.

fpb1 · Posted 10-18-2021 01:28 PM

Thank you. This is a situation I encounter every day, though: demographic information in one table, medical test results in others, not every person has had every medical test so there are lots of non-matches.

ballardw · Posted 10-18-2021 01:51 PM

@fpb1 wrote:

Thank you. This is a situation I encounter every day, though: demographic information in one table, medical test results in others, not every person has had every medical test so there are lots of non-matches.

If you do not have a personal identifier then you are asking for trouble. The proper matching variable for any medical information should be unique personal identifier.

If you don't have unique personal identifiers go talk to whoever is providing the data as they are setting up a situation with serious medical and financial risks (as in lawsuits).

HB · Posted 10-18-2021 01:12 PM

Try running it as

PROC SQL;
CREATE TABLE C AS
SELECT b.*, a.*
FROM A RIGHT JOIN B
ON A.type=B.type;
QUIT;
run;

and see if that isn't more what you thought you would get.

fpb1 · Posted 10-18-2021 01:25 PM

Yes, that works. I'm assuming that when I type

select *

or

select a.*, b.*

it is showing the first instance of each variable in the case of ambiguity? That makes sense.

When I perform a left/right outer join in PROC SQL, I seem to be losing information

Re: When I perform a left/right outer join in PROC SQL, I seem to be losing information

Re: When I perform a left/right outer join in PROC SQL, I seem to be losing information

Re: When I perform a left/right outer join in PROC SQL, I seem to be losing information

Re: When I perform a left/right outer join in PROC SQL, I seem to be losing information

Re: When I perform a left/right outer join in PROC SQL, I seem to be losing information

Re: When I perform a left/right outer join in PROC SQL, I seem to be losing information

SAS Innovate 2025: Call for Content

Classroom Training Available!