Selecting Data from More Than Two Tables
PAGE 66 of this SAS DOCUMENTATION PDF
http://support.sas.com/documentation/cdl/en/sqlproc/62086/PDF/default/sqlproc.pdf
Do you think the code is right at the where clause(in pink below)????
I believe it should have been just
us.Name = pc.Name and
pc.Code = c.State;
Could someone explain to me ??????
Thanks
proc sql outobs=10;
select us.Capital format=$15., us.Name ’State’ format=$15.,
pc.Code, c.Latitude, c.Longitude
from sql.unitedstates us, sql.postalcodes pc,
sql.uscitycoords c
where us.Capital = c.City and
us.Name = pc.Name and
pc.Code = c.State;
The data that you need could be located in more than two tables. For example, if you
want to show the coordinates of the capitals of the states in the United States, then you
need to join the UNITEDSTATES table, which contains the state capitals, with the
USCITYCOORDS table, which contains the coordinates of cities in the United States.
Because cities must be joined along with their states for an accurate join (similarly to
the previous example), you must join the tables on both the city and state columns of
the tables.
Joining the cities, by joining the UNITEDSTATES.Capital column to the
USCITYCOORDS.City column, is straightforward. However, in the UNITEDSTATES
table the Name column contains the full state name, while in USCITYCOORDS the
states are specified by their postal code. It is therefore impossible to directly join the
two tables on their state columns. To solve this problem, it is necessary to use the
POSTALCODES table, which contains both the state names and their postal codes, as
an intermediate table to make the correct relationship between UNITEDSTATES and
USCITYCOORDS. The correct solution joins the UNITEDSTATES.Name column to the
POSTALCODES.Name column (matching the full state names), and the
POSTALCODES.Code column to the USCITYCOORDS.State column (matching the
state postal codes).
title ’Coordinates of State Capitals’;
proc sql outobs=10;
select us.Capital format=$15., us.Name ’State’ format=$15.,
pc.Code, c.Latitude, c.Longitude
from sql.unitedstates us, sql.postalcodes pc,
sql.uscitycoords c
where us.Capital = c.City and
us.Name = pc.Name and
pc.Code = c.State;
I have placed the relevant parts of the documentation in a separate document.
You are asking why the constraint:
us.Capital = c.City
is necessary in the where statement.
I will turn this around and ask you to look at what you get if you do not include it.
You will get all rows where this is true:
us.Name = pc.Name and pc.Code = c.State;
We have not required that only the capital cities are used. As written we will take all the cities found in the C table. While a city in the C table may not be a state capital, it will still match with the US table by the state. For example the city of Anchorage is found in table C. This select statement will include a row for Anchorage with the state name, the appropriate Postal Code for that state, and its coordinates. It would have a blank for the capital, the first column in the report.
Too much text (my vacation starts tomorrow...), too little sample data.
Can't read anywhere that you run into any issues, have you executed the program, what where the results, and exactly what do you want explained?
Hi,
Thanks for the reply.
I just want to know if what he has in the documentation is right????
based on the columns he merged!!!
Thanks
I think this is right as we need to coordinates of the capitals of the states.
Hi,
In the last line of the explanation it says
"The correct solution joins the UNITEDSTATES.Name column to the
POSTALCODES.Name column (matching the full state names), and the
POSTALCODES.Code column to the USCITYCOORDS.State column (matching the
state postal codes)."
I still don't understand why he used
where us.Capital = c.City
In the table aliased C there is no CITY.........so how is he able to use that variable from C??????
Thanks
The documentation shows that column, output 3.12 - that being said, I'm not willing to dig through the entire documentation to find the files etc.
The key is the following section I believe, that is included in the documentation.
Joining the cities, by joining the UNITEDSTATES.Capital column to the
USCITYCOORDS.City column, is straightforward. However, in the UNITEDSTATES
table the Name column contains the full state name, while in USCITYCOORDS the
states are specified by their postal code. It is therefore impossible to directly join the
two tables on their state columns. To solve this problem, it is necessary to use the
POSTALCODES table, which contains both the state names and their postal codes, as
an intermediate table to make the correct relationship between UNITEDSTATES and
USCITYCOORDS. The correct solution joins the UNITEDSTATES.Name column to the
POSTALCODES.Name column (matching the full state names), and the
POSTALCODES.Code column to the USCITYCOORDS.State column (matching the
state postal codes).
Robert,
The coding appears to be appropriate. If you open up the Libraries using SAS Explorer and go to the MAPS, MAPSGFK, and MAPSSAS folderS, you will see all(?) the maps library tables. I do not see the tables named in the book, in the library, however someone could have invented them and written it to the 'sql' library that the code is executing. In my experience, the names in map tables are not "coordinated" and the value in a name in one table could be in a completely different name in another table. Perhaps that is what the author was trying to express.
Try it both ways and see what happens.
I have placed the relevant parts of the documentation in a separate document.
You are asking why the constraint:
us.Capital = c.City
is necessary in the where statement.
I will turn this around and ask you to look at what you get if you do not include it.
You will get all rows where this is true:
us.Name = pc.Name and pc.Code = c.State;
We have not required that only the capital cities are used. As written we will take all the cities found in the C table. While a city in the C table may not be a state capital, it will still match with the US table by the state. For example the city of Anchorage is found in table C. This select statement will include a row for Anchorage with the state name, the appropriate Postal Code for that state, and its coordinates. It would have a blank for the capital, the first column in the report.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.