Transform Data by Zip Code into US Regions

P5C768 · Posted 06-06-2012 06:24 PM

I have a data set that is listed out by zip code as the mapping coordinate but I would like to group those zip codes into a small number of regions (say 5 or 10). Is there a way to do this using the zipcodes or do I need to develop a shape file to define the regions?

For example, my data is similar to this:

Zip	Region	Count
90001	South	50
90002	South	75
90003	South	60
90004	West	75
90005	East	100

Can I use the zip codes to define the Regions and then sum the counts to display on the map? Or do I need to create something similar to the coordinates in the maps.us table to define regions? Thank you in advance for any help provided.

Reeza · Posted 06-06-2012 07:00 PM

You can use the sashelp.zip file to merge the zips into county or states. If you want different regions you need to figure out a way to group them individually, thought I'd think county would be a good way to start (knowing very little about this b/c I'm Canadian).

Alpay · Posted 06-07-2012 03:54 AM

If you have zip code and regions defined in a file like sashelp.zipcode you can create a format and use it to summarize your counts.

Zafer

Example:

data zipregion;

input Zip Region $;

datalines;

90001 South

90002 South

90003 South

90004 West

90005 East

;

run;

data zipcount;

input Zip Count;

datalines;

90001 50

90002 75

90003 60

90004 75

90005 100

;

run;

/* If you have zip codes and regions defined in a data set */

data zipfmt;

set zipregion;

retain fmtname 'zipfmt';

Start = Zip;

Label = Region;

keep fmtname Start Label;

run;

proc format cntlin=zipfmt;

run;

proc summary data=zipcount nway missing;

class Zip;

var Count;

format Zip zipfmt.;

output out=zipcount_summary

sum=;

run;

proc print; run;

P5C768 · Posted 06-11-2012 01:57 PM

Sorry, I should have been more clear. Once you have the data summarized by region, how can I link the zip codes with a map file, say sashelp.zipcode and only display regions rather than zip codes. Basically I want to define my map parameter by zip code, but display region rather than zip code, using proc gmap.

P5C768 · Posted 06-11-2012 02:20 PM

What I am trying to do is similar to what is done in this example:

http://support.sas.com/kb/24/902.html

But instead of writing out all 40,000 zipcodes, could I just group them somehow so the regions are assigned by a region field that is related to a zip code.

art297 · Posted 06-11-2012 02:40 PM

Can't you just use the zipstate function (see http://support.sas.com/kb/24/803.html ) and then do the region assignments as was done in the example that you showed?

P5C768 · Posted 06-11-2012 03:13 PM

Some of my regions are not defined by states unfortunately. California is split between two regions for example. That is why I thought of defining my regions by zip code.

Astounding · Posted 06-11-2012 04:46 PM

Doesn't Art have the right idea here? What if you were to use the code in the sample program, but change from ID STATE to ID ZIP? Isn't that exactly what you need? (You might have to sum up your COUNT variable separately.)

Alpay · Posted 06-11-2012 05:01 PM

If I unserstand it right, you want to drill down from region to zip codes and color it on a map and remove the inerior boundaries for regions. Do these regions have discontiguous areas?

You will probably need shape files for zip codes (www.census.gov).

Alpay · Posted 06-12-2012 03:43 AM

For example:

Download the following file from www.census.gov web site (2010 TIGER/Line® Shapefiles: )

2012 TIGER/Line Shapefiles: ZIP Code Tabulation Areas for Michigan.

You can download the shape files for other states as well and define your zip code to region format.

proc format;

value $zipfmt

'48001'-'48999' = 'Region 1'

'49000'-'49999' = 'Region 2'

other = 'Region 3';

run;

proc mapimport datafile="D:\Temp\Zip\tl_2010_26_zcta510.shp"

out=mi;

run;

proc sql;

create table mi_1 as

select -1*x as x,

y,

ZCTA5CE10 as Zip,

Segment,

put(ZCTA5CE10,$zipfmt.) as Region

from mi;

run;

proc sort data=mi_1;

by Region Zip Segment;

run;

proc gremove data=mi_1 out=mi_2;

by Region;

id Zip Segment;

run;

proc gproject data=mi_2 out=mi_prj degrees;

id Region;

run;

goptions device=png htext=3 gunit=pct cback=verylightmoderateyellow;

title;

proc gmap data=mi_prj map=mi_prj;

id Region;

choro Region / discrete

coutline=Black

nolegend;

run;

GraphGuy · Posted 06-12-2012 08:41 AM

And here's a similar example to Alpay's, using a county map (which is shipped with SAS/Graph), rather than zip code maps - it plots the map at a couple of points along the way, so you can see how we're building up the final map:

data my_map; set maps.uscounty (where=(fipstate(state) in ('NC' 'VA')));
original_order=_n_;
statecode=fipstate(state);
run;

proc gmap data=my_map map=my_map;
id state county;
choro state / discrete;
run;

data new_map; set my_map;
if statecode='VA' then region='A';
if statecode='NC' then region='B';

/* let's say these 4 counties in NC are really in region 'A' (with Virginia) */
if statecode='NC' and county in (77 69 181 185) then region='A';
run;

proc gmap data=new_map map=maps.uscounty;
id state county;
choro region;
run;

proc sort data=new_map out=new_map;
by region original_order;
run;

proc gremove data=new_map out=new_map;
by region;
id state county;
run;

proc gmap data=new_map map=new_map;
id region;
choro region;
run;

P5C768 · Posted 06-13-2012 03:56 PM

The methods that alpay and Robert listed work, I guess I'm just struggling with how to convert the 40k plus zip codes into an array. Alpay's example assumes region include sequential zip codes, which mine do not. Here is an output of a proc means to show the range:

Analysis Variable : ZIP The 5-digit ZIP Code
Region	N	Minimum	Maximum
Region1	5802	7002	61378
Region2	9786	544	88595
Region3	154	99501	99950
Region4	5191	19930	74966
Region5	7386	1005	78962
Region6	4709	59001	99403
Region7	1649	24314	49289
Region8	1159	501	93599
Region9	5498	39426	96158

Robert's example assumes that you only want to change a handful amount of zip or counties, rather then setting the region by a range of zip codes. I ran a proc transpose to convert my data to be in columns with each row representing a region. I'm thinking maybe I can create an array of each row, though some regions have nearly 10k records and I'm not sure what the limit would be. That way I could use Robert's method to set my regions by the zipcodes in my data.

Reeza · Posted 06-13-2012 04:08 PM

If you want to use Alpay's code and need to recreate this portion there's way to do that without using ranges:

proc format;

value $zipfmt

'48001'-'48999' = 'Region 1'

'49000'-'49999' = 'Region 2'

other = 'Region 3';

run;

My guess is you currently have data in the form something like the following:

Region ZipCode

1 48001

2 28003

3 34564

1 48002

Does that sound correct? If so you can create a format then use Alpay's code a few days ago, June 7 rather than mess with an array.

P5C768 · Posted 06-13-2012 04:31 PM

Correct, but my regions don't contain continuous zip codes.

I can't do this:

proc format;

value $zipfmt

'07002'-'61378' = 'Region 1'

'00544'-'88595' = 'Region 2'

run;

because the regions would overlap. That is why I was thinking of an array.

Alpay · Posted 06-13-2012 04:42 PM

You can create a format from your data set using proc format and specifying your format data set (cntlin option).

For further explanation please see the documentation for proc format in help files.

I have not tested the following code but it gives an idea as a starting point.

data zipfile;

input Region ZipCode $;

datalines;

1 48001

2 28003

3 34564

1 48002

;

run;

data zipfmt;

set zipfile;

retain zipfmt '$zipfmt';

Start =Zipcode;

Label = Region;

keep fmtname Start Label;

run;

proc format cntlin=zipfmt; run;

Transform Data by Zip Code into US Regions

Re: Transform Data by Zip Code into US Regions

Re: Transform Data by Zip Code into US Regions

Re: Transform Data by Zip Code into US Regions

Re: Transform Data by Zip Code into US Regions

Re: Transform Data by Zip Code into US Regions

Re: Transform Data by Zip Code into US Regions

Re: Transform Data by Zip Code into US Regions

Re: Transform Data by Zip Code into US Regions

Re: Transform Data by Zip Code into US Regions

Re: Transform Data by Zip Code into US Regions

Re: Transform Data by Zip Code into US Regions

Re: Transform Data by Zip Code into US Regions

Re: Transform Data by Zip Code into US Regions

Re: Transform Data by Zip Code into US Regions

Registration is open

SAS Training: Just a Click Away