I see how this would work, but if I'm correct you would need data by zip code for this to work, correct? I have data by my region label and I am trying to use a separate data set that links zip codes to my regions.
So you don't have a mapping of zipcodes to regions?
If you do, then merge the information together.
I have a mapping of zip codes to regions, but I have 9 regions and 40k zip codes, so my choro variable will be repeated 40k times. My thought was to use the zip codes to create my regions on the map, then add the data by region, but I guess that won't work.
My other idea was to try and concatenate all the zip codes into a string so that the format option would work. However, one of my regions has nearly 10k zip codes, so at 5 digits a piece, it exceeds the maximum string length.
I guess this just won't work. It sound like I need to work on getting data by zip code, rather than region. Then maybe I can use the format option to create my regions and sum my data by region.
hi ... here is something similar to what you are trying to do
me ... take the counties in New York State and form new geographic areas (HSAs) based on a set of rules that assign counties to HSAs
you ... take zip codes in US and and form new geographic areas (regions) based on a set of rules that assign zip codes to regions
I assume that you have a map data set with zip code boundaries
you will have to create a new map data set with regions and you do that with PROC GREMOVE (see below) ... then you could create a choropleth map that is shaded by region based on you response data set
this code started with a county-based map from the SAS MAPS library (maps.counties) and ends up with an HSA-based map ... 8 HSAs are shown instead of the 62 New York State counties
this example is in a paper ... "The Basics of Map Creation with SAS/Graph" ... and you can get the paper plus all the examples at ...
http://www.albany.edu/~msz03/map_basics.zip
* rules for county to HSA conversion;
proc format;
value cou2hsa
3, 9, 13, 29, 37, 63, 73, 121 = '1'
15, 51, 55, 69, 97, 99, 101, 117, 123= '2'
11, 23, 43, 45, 49, 53, 65, 67, 75, 89, 109 = '3'
7, 17, 107 = '4'
1, 19, 21, 25, 31, 33, 35, 39, 41, 57, 77, 83, 91, 93, 95, 113, 115 = '5'
27, 71, 79, 87, 105, 111, 119 = '6'
5, 47, 61, 81, 85 = '7'
59, 103 = '8'
;
* assign new geographic designation (HSA) to each New York State county from maps.counties data set;
data hsatemp;
set maps.counties;
where state eq 36 and density lt 6;
hsa = put(county,cou2hsa.);
run;
* sort map data set in order by new geographic designation;
proc sort data=hsatemp;
by hsa;
run;
* eliminate county boundaries with HSA areas using PROC GREMOVE;
proc gremove data=hsatemp out=hsamap;
by hsa;
id county;
run;
* project the map;
proc gproject data=hsamap out=hsaproj;
id hsa;
run;
* create a dummy response data set ... one observation per new area;
proc sql;
create table hsa as
select distinct hsa, 1 as var
from hsamap;
quit;
* draw the map;
goptions reset=all ftext='calibri' htext=2 gunit=pct;
pattern v=ms c=blue;
proc gmap
data=hsa
map=hsaproj;
id hsa;
choro var / discrete coutline=white nolegend;
note ' HEALTH SERVICE AREAS'
j=l ' AREAS FORMED FROM COUNTIES'
j=l ' USING PROC GREMOVE';
run;
quit;
I changed the first part to:
proc format;
value cou2hsa
99501, 99502, 99503, 99504, 99505, 99506, 99507, 99508, 99509, 99510, 99511, 99513, 99514, 99515, 99516, 99517, 99518, 99519, 99520, 99521, 99522, 99523, 99524, 99529, 99530, 99540, 99556, 99557, 99566, 99567, 99568, 99572, 99573, 99574, 99575, 99577, 99586, 99587, 99588, 99599, 99603, 99605, 99610, 99611, 99615, 99619, 99627, 99629, 99631, 99635, 99639, 99640, 99645, 99647, 99652, 99653, 99654, 99656, 99663, 99664, 99667, 99668, 99669, 99672, 99674, 99675, 99676, 99677, 99682, 99683, 99686, 99687, 99688, 99689, 99691, 99693, 99694, 99695, 99697, 99701, 99702, 99703, 99704, 99705, 99706, 99707, 99708, 99709, 99710, 99711, 99712, 99714, 99716, 99720, 99724, 99725, 99729, 99730, 99731, 99732, 99733, 99737, 99738, 99740, 99743, 99744, 99755, 99756, 99757, 99758, 99760, 99764, 99767, 99768, 99774, 99775, 99776, 99777, 99779, 99780, 99781, 99790, 99801, 99802, 99803, 99811, 99820, 99821, 99824, 99825, 99826, 99827, 99829, 99830, 99832, 99833, 99835, 99836, 99840, 99841, 99850, 99901, 99903, 99918, 99919, 99921, 99922, 99923, 99925, 99926, 99927, 99928, 99929, 99950 = 'Region1'
other = 'Other Regions';
Run;
data hsatemp;
set sashelp.zipcode;
hsa = put(ZIP,cou2hsa.);
run;
That seems to work up until gproject, where I get this error:
ERROR: At least one value is out of range for polar radian coordinates. Data may already be projected.
ERROR: Expected range is (-3.141592654, 3.1415926536) for X and (-1.570796327, 1.5707963268) for Y.
ERROR: Actual range is (-176.658889, 171.062651) for X and (5.310246, 70.66976) for Y.
I think the problem comes from the fact that all these examples use shapefiles whereas the sashelp.zipcode uses center of zip
hi ... looks like your map data set is in degrees with negative longitude .. try these options .in GPROJECT..
proc gproject data=hsamap out=hsaproj degrees eastlong;
id hsa;
run;
hi ... so here's a zip code example that starts with converting a shapefile to a map data set
the attached SAS code produced the attached map and other than a couple of missing areas in the Adirondack mountains (rest of blank areas are water ... lakes, rivers), it looks OK (used V9.3 and I let SAS/Graph select the colors)
I got the shapefile from ... http://www.census.gov/cgi-bin/geo/shapefiles2010/main
if you are not familiar with importing shapefiles, take a look at ... From Shapefiles to Maps Via PROC MAPIMPORT and PROC GMAP
Hi ... I just noticed that you seem to be trying to use SASHELP.ZIPCODE as a map data set ...
data hsatemp;
set sashelp.zipcode;
hsa = put(ZIP,cou2hsa.);
run;
You cannot create a map with SASHELP.ZIPCODE. It's just a resource file. If you look at the example I posted that combines zips into area codes, you'll see that you'll have to get a zip-based map file to create map.
Interesting. Out of curiosity, why can't you use the SASHELP.ZIPCODE as a map data set? Is it because the use center of zip rather than a shape file like U.S. States? I'm looking into creating our own shape files for the different regions we want to map.
Hi ... map data sets contain x/y variables that define polygons (map areas ... zips, states, counties, countries, whatever). SASHELP.ZIPCODE contains x/y variables that just identify the centroid of a zip code. There's only one x/y coordinate per zip ... not enough to draw a polygon !!!
Since SAS/Graph will work with any set of x/y variables, it will actually draw a map of NYS using SASHELP.ZIPCODE. However, since there's only one point per zip in the data set, you cannot draw zip polygons, but you can have GMAP just "connect the centroids" of a state map ...
data nys;
set sashelp.zipcode;
where state eq 36;
keep state zip x y;
run;
proc gproject data=nys out=pnys eastlong degrees;
id state;
run;
proc gmap data=pnys map=pnys;
id state;
choro zip / nolegend;
run;
quit;
That SAS code produced the attached NYS_SASHELP_ZIPCODE map ... not what you want. If you go back to the example I posted, I said that ...
" I got the shapefile from ... http://www.census.gov/cgi-bin/geo/shapefiles2010/main "
If I use it with this SAS code ...
* ZCTA shapefile import for New York;
proc mapimport datafile='z:\tl_2010_36_zcta510.shp' out=nyzip;
select zcta5ce10;
rename zcta5ce10=zip;
run;
* ZCTA data set contains unprojected X/Y coordinates ... project them;
proc gproject data=nyzip out=pnyzip degrees eastlong;
id zip;
run;
* create a dummy response data set ... one observation per areacode;
proc sql;
create table dummy as select distinct zip, 1 as var from pnyzip;
quit;
goptions reset=all;
* make a map;
proc gmap map=pnyzip data=dummy;
id zip;
choro var / discrete nolegend;
run;
quit;
I get a real zip code map ... attached as NYS_ZIP ... it has some flaws, but those flaws are in the shapefile.
Hopefully that helps you understand that SASHELP.ZIPCODE cannot be used to create a map and that you can get free shapefiles and create maps with PROC GMAP
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.