Data visualization with SAS programming

SAS Zipcode dataset and zip to County Match

Contributor
Posts: 73

Since certain 5-digit ZIP codes can cross county lines, how does SAS decide which county to assign to a 5-digit ZIP in the ZIPCODE dataset? My understanding is that one needs 9-digit ZIPs to make the proper match. If that is the case, is there a 9-digit ZIP to State & County dataset that does this?
SAS Employee
Posts: 23

Re: SAS Zipcode dataset and zip to County Match

SAS does not decide where to locate the centroids in the SASHELP.ZIPCODE data set. That data is not created by SAS but is purchased from zipcodedownload.com.
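
To see which county the supplied data assigns to a particular ZIP, you can print that row of SASHELP.ZIPCODE. This is only a minimal sketch: 27513 is just an example value, and the variable names listed are the ones I would expect in a current copy of the data set, so check them against PROC CONTENTS on your own installation.

```sas
/* Show the single county stored for one 5-digit ZIP.                 */
/* Assumption: ZIP is numeric and the variables below exist in your   */
/* release of SASHELP.ZIPCODE.                                        */
proc print data=sashelp.zipcode (where=(zip=27513)) noobs;
  var zip city statecode county countynm x y;
run;
```

Each ZIP appears once, so a ZIP that crosses a county line still shows only one county here.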

By 9-digit ZIP I assume you mean ZIP+4. The only ZIP+4 data set currently supplied by SAS is the PROC GEOCODE lookup data used for ZIP+4 geocoding. It was generated from Census Bureau TIGER files and is in the downloads section of the SAS Maps Online site. However, the current data set does not contain county values.

When it is updated, we will add county values if possible. However, the latest release of TIGER/Line files does not contain ZIP+4 values. With the recent change from the older RT format files to shapefiles and dbf files, the ZIP+4 values were omitted. The variables are there, but all the values are missing.
Contributor
Posts: 73

Re: SAS Zipcode dataset and zip to County Match

I have been playing with the SAS zip + 4 file and I am now confused. My downloaded SAS zip + 4 file contains just over 16 million records. The US Postal Service zip + 4 file, which we have, contains around 43 million records. Is the SAS file complete? What is the difference?
SAS Employee
Posts: 23

Re: SAS Zipcode dataset and zip to County Match

The Census Bureau does not maintain that their ZIP-related products are complete or official. This NESUG paper by a Census Bureau analyst describes her detailed examination of TIGER/ZIP+4 data:
http://www.nesug.org/proceedings/nesug06/ap/ap10.pdf

In it the author states, "The Address Information System Products Technical Guide states that the complete ZIP+4 Product contains approximately 35 million records. Compared with the ZIP+4 product, the TIGER/ZIP+4 data product contains fewer records."

If you need a listing of ZIP+4 data that is more complete than what we can provide using the TIGER products, you may have to contact third-party data vendors. I am not sure if I can name specific firms here, but a web search should turn up several. There is also DataFlux, which specializes in US Postal Service data and address cleansing and validation. I am pretty sure I can mention them by name as they are a SAS subsidiary. :-)

I have discussed the ZIP+4 status with tech support at the Census Bureau. Their current MAF/TIGER Accuracy Improvement Project and the upcoming 2010 census should greatly improve their data quality. Until we can obtain and review the 2010 census data, what is on Maps Online is all we have.
SAS Employee
Posts: 980

Re: SAS Zipcode dataset and zip to County Match

To add a little to what Ed says...

Once you have a lat/long coordinate (be it a zipcode centroid, a zip+4 centroid, or better yet a coordinate obtained from street-level geocoding), you could then determine what county it's in by running "proc ginside" on that point against a map like maps.counties.
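
Here is a rough sketch of that idea, using a ZIP centroid from SASHELP.ZIPCODE as the point. The ZIP value and the data set names points and points_county are just placeholders. It also assumes that maps.counties stores X/Y in radians with the longitude sign reversed while SASHELP.ZIPCODE stores degrees, which is why the coordinates are converted first. Verify that against your own copy of the map before relying on the result.

```sas
/* Take one ZIP centroid as the point to test.                        */
/* Drop the existing STATE and COUNTY so GINSIDE can supply its own.  */
data points;
  set sashelp.zipcode (where=(zip=27513) drop=state county);
  /* Assumption: the map uses radians with reversed longitude,        */
  /* the ZIP data uses degrees. atan(1)/45 is pi/180.                 */
  x = atan(1)/45 * x * -1;
  y = atan(1)/45 * y;
run;

/* Find which county polygon contains each point. */
proc ginside data=points map=maps.counties out=points_county;
  id state county;
run;
```

The output data set carries the STATE and COUNTY codes of whichever polygon contains the point.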
Contributor
Posts: 73

Re: SAS Zipcode dataset and zip to County Match

So, I used PROC GINSIDE and it worked...but it only gives me the state and county codes. Now I need to get the state and county NAMES. How do I do that? Thanks, John
SAS Employee
Posts: 980

Re: SAS Zipcode dataset and zip to County Match

You could use something like this (the values I've hardcoded are for Wake County, NC)...


/* Sample input: FIPS codes for Wake County, NC */
data foo;
  state=37;
  county=183;
run;

/* State name is easy, using SAS functions */
data foo;
  set foo;
  state_name=fipnamel(state);
  state_abbrev=fipstate(state);
run;

/* County name is a little harder, doing a lookup. */
/* Writing to a new table avoids the SQL warning about a */
/* CREATE TABLE that recursively references its target. */
proc sql;
  create table foo_named as
  select foo.*, cntyname.countynm
  from foo left join maps.cntyname
  on foo.state=cntyname.state and foo.county=cntyname.county;
quit;
Contributor
Posts: 73

Re: SAS Zipcode dataset and zip to County Match

Thanks to all for your help. Everything you have told me is working so far. Let me push the envelope....any hope for getting Congressional District Codes into the situation?
SAS Employee
Posts: 23

Re: SAS Zipcode dataset and zip to County Match

You can download Congressional District relationship tables from: http://www.census.gov/geo/www/cd110th/tables110.html

That page contains tables by state or for the entire US. However, none of those tables are linked by ZIP code. The closest one would be the table linking Congressional Districts by ZCTA (ZIP code tabulation area). Be aware that ZCTAs are not complete, i.e. they do not contain every ZIP code in the US. Some ZCTAs will also include portions of more than one Congressional District.

There is also a table linking districts by county, but that is not a one-to-one relationship. Some counties will contain parts of multiple Congressional Districts. You'll see the same issue with the table linking districts by FIPS Place codes.

So, you can obtain a complete listing of US Congressional District codes, but it may be difficult to merge them into other data sets.
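
Before merging, it may help to find out which ZCTAs fall in more than one district. The sketch below uses a tiny made-up table: the data set name zcta_cd, its column names, and the ZCTA and district values are all placeholders, not the layout of the Census relationship files. Substitute the real table and columns once you have imported it.

```sas
/* Hypothetical ZCTA-to-district pairs, one row per pairing. */
/* The values are invented for illustration only.            */
data zcta_cd;
  input zcta $ cd $;
  datalines;
00001 01
00001 02
00002 01
;
run;

/* Count the districts linked to each ZCTA. */
/* A count above 1 marks a ZCTA that cannot be assigned to a single district. */
proc sql;
  create table zcta_cd_counts as
  select zcta, count(distinct cd) as n_districts
  from zcta_cd
  group by zcta;
quit;
```

ZCTAs with a count of 1 can be merged directly; the rest need a rule for choosing a district or have to be kept as multiple rows.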
Valued Guide
Posts: 765

Re: SAS Zipcode dataset and zip to County Match

hi ... there is a web site ...

http://www.melissadata.com/lookups/countyzip.asp

that will give you a table of ZIPs in a specified state+county but that requires a manual selection of state+county and all you get is an HTML table

if you look at ...

http://www.sascommunity.org/wiki/County_Validation_of_ZIP_Codes

you can use the SAS code that is contained in the posting to automate the reading of the STATE+COUNTY look up ... the program produces both a data set and a spreadsheet of ZIPS in any given state, plus it shows up to three counties for each ZIP and the percentage of the population of that ZIP in each county (all the information is taken from the web site)

you can use the data set to augment the information in SASHELP.ZIPCODE