Hi,
Sorry the Subject may be confusing.
Let's say I have two tables:
Table One:
ID x
1 a
2 b
Table Two:
ID y
1 A
3 C
proc sql;
create table want as
select *
from TableOne as One, TableTwo as Two
where One.ID = Two.ID;
quit;
Desired output:
1 a A
The above code will work but will give a WARNING and use the ID from TableOne. The correct version would be changing the SELECT statement to:
select One.ID, x, y
However, the solution only works if there's 2 variables to list. What if there are hundreds? I know it can be done using a MERGE but is there a way to write it in PROC SQL?
Thanks
Please do not confuse data step MERGE with any of the joins. It does several things quite differently than any of the joins.
If a data step merge works for you then use that.
Your given join in Proc SQL is a Cartesian Join, meaning you would get every value of id that matches so that if you had Id=1 2 times in One and 3 times in Two you would get 6 output lines. Is that what you want?
I suggest that you go back to your example data and show example with multiple observations with the same id and then desired/expected output. Provide that example data in the form of data step code pasted into a text box such as:
Data One; input ID x $; datalines; 1 a 2 b ; data two; input ID y $; datalines; 1 A 3 C ;
The desired output should be shown in a similar fashion.
@ballardw wrote:
...
Your given join in Proc SQL is a Cartesian Join, meaning you would get every value of id that matches so that if you had Id=1 2 times in One and 3 times in Two you would get 6 output lines. Is that what you want?
@ballardw I believe you missed the where clause. This is an equijoin.
@Patrick wrote:
@ballardw wrote:
...
Your given join in Proc SQL is a Cartesian Join, meaning you would get every value of id that matches so that if you had Id=1 2 times in One and 3 times in Two you would get 6 output lines. Is that what you want?
@ballardw I believe you missed the where clause. This is an equijoin.
Don't believe I did. It is only an "equjoin" because neither set had the ID duplicated.
Data One; input ID x $; datalines; 1 a 1 b 2 b ; data two; input ID y $; datalines; 1 A 1 B 1 C 3 C ; proc sql; create table want as select * from One, Two where One.ID = Two.ID; quit;
Yields 6 rows of output. Which is why I asked for clarification as to what was actually intended for the output when ID values are duplicated.
@ballardw From the SAS docu Join Operations
You cannot create a DATASET with two variables named ID. That is why you only get one variable. Does not really have anything to do with PROC SQL or SQL syntax per se.
Do you really need a dataset? If you just want a REPORT then just drop the CREATE TABLE part of the query and it will execute. You could use ODS EXCEL to send the report to a file you could browse in a tabular fashion.
But unless you either change the data or the code It won't return anything because 'a' does NOT equal 'A'.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.