I have a quick question about correlated subqueries in PROC SQL. I am curious whether a specific functionality that exists in Microsoft's SQL Server also exists in PROC SQL.
Please see the queries below. Query #1, involving a simple join, runs fine in both PROC SQL and SQL Server's T-SQL. Query #2, though, which uses a correlated subquery within the SELECT clause, works great in SQL Server's T-SQL but not in PROC SQL. Would you know whether this functionality exists in some other way in PROC SQL? If it does, what might the syntax look like?
Thanks in advance for your help. And please: I know this is a non-standard thing to do in SQL and I know it is inefficient. I'm just curious whether the functionality exists in PROC SQL. There are situations where it is useful. Thanks again!
PROC SQL; *---- create some play data -------------------------------------*; CREATE TABLE table1 (ID INT, FirstName CHAR(12)); INSERT INTO table1 (ID, FirstName) VALUES (1, "Tom" ); INSERT INTO table1 (ID, FirstName) VALUES (2, "Richard" ); INSERT INTO table1 (ID, FirstName) VALUES (3, "Harry"); CREATE TABLE table2 (ID INT, HomeTown CHAR(12)); INSERT INTO table2 (ID, HomeTown) VALUES (2, "Boston"); INSERT INTO table2 (ID, HomeTown) VALUES (3, "Miami" ); *----------------------------------------------------------------*; *Query #1 -- this works fine in both PROC SQL and SQL Server's T-SQL; SELECT table1.ID, table1.FirstName, table2.HomeTown FROM table1 LEFT JOIN table2 ON (table1.ID = table2.ID); *----------------------------------------------------------------*; *Query #2 -- This works fine in SQL Server's T-SQL but apparently not in PROC SQL; SELECT table1.ID, table1.FirstName, HomeTown = (SELECT HomeTown FROM table2 WHERE table2.ID=table1.ID)
FROM table1; *----------------------------------------------------------------*;
QUIT;
It will run, it just doesn't do anything useful.
data table1;
input id FirstName $10.;
datalines;
1 Tom
2 Richard
3 Harry
;
run;
data table2;
input id HomeTown $10.;
datalines;
2 Boston
3 Miami
4 Nome
;
run;
proc sql;
SELECT table1.ID,
table1.FirstName,
(SELECT HomeTown FROM table2 WHERE table2.ID=table1.ID) as hometown
FROM table1, table2;
quit;
Provides
The SAS System 09:15 Monday, December 18, 2017 2
id FirstName hometown
1 Tom
1 Tom
1 Tom
2 Richard Boston
2 Richard Boston
2 Richard Boston
3 Harry Miami
3 Harry Miami
3 Harry Miami
It will run, it just doesn't do anything useful.
data table1;
input id FirstName $10.;
datalines;
1 Tom
2 Richard
3 Harry
;
run;
data table2;
input id HomeTown $10.;
datalines;
2 Boston
3 Miami
4 Nome
;
run;
proc sql;
SELECT table1.ID,
table1.FirstName,
(SELECT HomeTown FROM table2 WHERE table2.ID=table1.ID) as hometown
FROM table1, table2;
quit;
Provides
The SAS System 09:15 Monday, December 18, 2017 2
id FirstName hometown
1 Tom
1 Tom
1 Tom
2 Richard Boston
2 Richard Boston
2 Richard Boston
3 Harry Miami
3 Harry Miami
3 Harry Miami
Thanks so much! And what your code does is VERY useful -- there are many instances where a correlated subquery is much easier and straightforward than a ton of reflexive joins. I know this is a comically simple example, but rest assured you've saved me a LOT of work!
I just had to switch your code so that it does a LEFT join with the second table and it gave me exactly what I needed (see below)
Thanks again for your help! I figured SAS had this capability. The syntax is just a tiny bit different from that of SQL Server, and I had searched all over for it without success.
SELECT table1.ID,
table1.FirstName,
(SELECT HomeTown FROM table2 WHERE table2.ID=table1.ID) AS HomeTown
FROM table1 LEFT JOIN table2 on table1.id=table2.id;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.