Since certain 5-digit zipcodes can cross county lines, how does SAS decide which county to assign to a 5-digit ZIP in the ZIPCODE dataset? My understanding is that one needs 9-digit zips to make the proper match. If that is the case, is there a 9-digit zip to State & County dataset that does this?
SAS does not decide where to locate the centroids in the SASHELP.ZIPCODE data set. That data is not created by SAS but is purchased from zipcodedownload.com.
By 9-digit ZIP I assume you mean ZIP+4. The only ZIP+4 data set currently supplied by SAS is the PROC GEOCODE lookup data used for ZIP+4 geocoding. It was generated from Census Bureau TIGER files and is in the downloads section of the SAS Maps Online site. However, the current data set does not contain county values.
When it is updated, we will add county values if possible. However, the latest release of TIGER/Line files does not contain ZIP+4 values. With the recent change from the older RT format files to shapefiles and dbf files, the ZIP+4 values were omitted. The variables are there, but all the values are missing.
I have been playing with the SAS zip + 4 file and I am now confused. My downloaded SAS zip + 4 file contains just over 16 million records. The US Postal Service zip + 4 file, which we have, contains around 43 million records. Is the SAS file complete? What is the difference?
In it the author states, "The Address Information System Products Technical Guide states that the complete ZIP+4 Product contains approximately 35 million records. Compared with the ZIP+4 product, the TIGER/ZIP+4 data product contains fewer records."
If you need a listing of ZIP+4 data that is more complete than what we can provide using the TIGER products, you may have to contact third-party data vendors. I am not sure if I can name specific firms here but a web search should turn up several. There is also DataFlux, which specializes in US Postal Service data and address cleansing and validation. I am pretty sure I can mention them by name as they are a SAS subsidiary. :-)
I have discussed the ZIP+4 status with tech support at the Census Bureau. Their current MAF/TIGER Accuracy Improvement Project and the upcoming 2010 census should greatly improve their data quality. Until we can obtain and review the 2010 census data, what is on Maps Online is all we have.
Message was edited by: EdO@sas
Once you have a lat/long coordinate (be it a zipcode centroid, a zip+4 centroid, or better yet a coordinate obtained from street-level geocoding), you could then take that lat/long value and determine what county it's in by running "proc ginside" against that point, and a map like maps.counties.
Message was edited by: Robert Allison @ SAS
You could use something like this (the values I've hardcoded are for Wake County, NC)...
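A minimal sketch of that PROC GINSIDE lookup, assuming SAS/GRAPH is licensed and the MAPS library is available. The data set name `foo` and the coordinate (an approximate point in Raleigh, Wake County, NC) are illustrative assumptions, not values taken from the thread:

```sas
/* Hypothetical point: approximate coordinate in Raleigh (Wake County, NC) */
data foo;
  lat  = 35.82;
  long = -78.65;
  /* MAPS.COUNTIES stores unprojected radians with west longitude positive, */
  /* so convert degrees to radians and flip the sign of the longitude       */
  x = atan(1)/45 * long * -1;
  y = atan(1)/45 * lat;
run;

/* Tag the point with the STATE and COUNTY FIPS codes of the polygon that contains it */
proc ginside data=foo map=maps.counties out=foo;
  id state county;
run;
```

The output data set carries the numeric STATE and COUNTY FIPS codes from the map, which can then be translated to names.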
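/* State name is easy, using SAS functions */
data foo; set foo;
 /* assumption: FIPNAMEL is one such function (FIPS state code -> mixed-case state name) */
 statename=fipnamel(state);
run;
/* County name is a little harder, doing a lookup */
proc sql;
create table foo as
select foo.*, cntyname.countynm
from foo left join maps.cntyname
on foo.state=cntyname.state and foo.county=cntyname.county;
quit; run;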
That page contains tables by state or for the entire US. However, none of those tables are linked by ZIP code. The closest one would be the table linking Congressional Districts by ZCTA (ZIP code tabulation area). Be aware that ZCTAs are not complete, i.e. they do not contain every ZIP code in the US. Some ZCTAs will also include portions of more than one Congressional District.
There is also a table linking districts by county, but that is not a one-to-one relationship. Some counties will contain parts of multiple Congressional Districts. You'll see the same issue with the table linking districts by FIPS Place codes.
So, you can obtain a complete listing of US Congressional District codes, but it may be difficult to merge them into other data sets.
You can use the SAS code contained in the posting to automate reading the STATE+COUNTY lookup. The program produces both a data set and a spreadsheet of ZIPs in any given state, and it shows up to three counties for each ZIP along with the percentage of that ZIP's population in each county (all the information is taken from the web site).
You can use the data set to augment the information in SASHELP.ZIPCODE.