BookmarkSubscribeRSS Feed
aabbccwyt
Obsidian | Level 7

Hi, I'm trying to use SQL to inner join 2 SAS files, here is what I have:

proc sql;
	select Rank.NOC, Rank.Country, Gold_2018, Silver_2018, Bronze_2018
	from ab.rank2018 inner join ab.rank2016
	on rank2018.NOC=rank2016.NOC
	on rank2018.Country=rank2016.Country;
quit;

My question is that because Country and NOC occur in both tables, I have the use the table prefix to indicate which column to use, which I choose olympics2016, however it doesn't work, it still gives me two table. Can someone please help me to fix it? Thank you.

7 REPLIES 7
SASKiwi
PROC Star

Your join criteria doesn't look right. The second ON should be AND.

proc sql;
	select Rank.NOC, Rank.Country, Gold_2018, Silver_2018, Bronze_2018
	from ab.rank2018 inner join ab.rank2016
	on rank2018.NOC=rank2016.NOC
	and rank2018.Country=rank2016.Country;
quit;
aabbccwyt
Obsidian | Level 7

Now it says: Unresolved reference to table/correlation name Rank.

I know it has to do something with the second ON but I'm not sure what is it. 

jpprovost
Quartz | Level 8

When you use RANK.tablename, I suppose RANK is a defined library?

Also, where comes the 2018 and 2016 in the name of the library RANK?

aabbccwyt
Obsidian | Level 7

They are sas7bdat files, here are my previous code if needed

proc print data=ab.rank2016;
proc print data=ab.rank2018;

ab is basically the path.

jpprovost
Quartz | Level 8

Maybe the "ab" part is missing ?

 

proc sql;
select  Rank.NOC, 
           Rank.Country, 
           Gold_2018, 
           Silver_2018, 
           Bronze_2018
	
from ab.rank2018 

inner join ab.rank2016 on 
   ab.rank2018.NOC=rank2016.NOC   and 
   ab.rank2018.Country=rank2016.Country;
quit;

 

Also, does the rank2016.xxxx have also the "ab" part in their part? 

PGStats
Opal | Level 21

Proper syntax would look like:

 

proc sql;
	select 
		a.NOC, 
		a.Country, 
		b.Gold_2018, 
		b.Silver_2018, 
		b.Bronze_2018
	from 
		ab.rank2016 as a inner join 
		ab.rank2018 as b 
		on a.NOC=b.NOC and a.Country=b.Country;
quit;
PG
ballardw
Super User

When SAS sees something like this in Proc SQL

on rank2018.NOC=rank2016.NOC

It expects Rank2018 and Rank2016 to be a single reference to a data set. But your datasets are in your library AB but the ON clause does not know that. So you provide an ALIAS n the From clause  to have a single reference such as:

from ab.rank2018 as rank2018 inner join ab.rank2016 as rank2016

so references such as rank2018.NOC can get back to the actual data source for the NOC.

The AS is optional in assigning the alias but I think it helps readability.

Most people tend to use something shorter for the alias like T1 (for table 1) or A to avoid any extra typing when referencing the variables. You can see that typing A.Noc is quicker than Rank2018.Noc.

 

To reference two variables with the same names in the source the data sets you would use AS to rename one or both:

 

Select a.noc, b.noc as NOC2016

 

for example would have an output variable named NOC that came from the aliased set A and another named NOC2016 that was the value of the aliased set B Noc variable. (Yet another reason for short aliases)

 

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 863 views
  • 3 likes
  • 5 in conversation