Hi all,
I have successfully created an annotated state map from geocoded address info using the states map. However, when I try to use the county map data set, it appears that the geocoding output does not match what information in the county map dataset. Here is a small program to illustrate the difficulty. Code doesn't get past proc gproject. It seems that the length of x and y of the map do not match the length of x and y in the geocoded output. thanks! Phil
:
libname lookup "D:\Documents\desi2023\frlanalyses\geocodedata__2006__ZIP4_Geocode_Data\data";
data zips;
input c zip;
anno_flag=1l
cards;
1 65203
2 65211
3 65201
;
proc geocode /* Invoke geocoding procedure */
method=plus4 /* Specify geocoding method */
lookup=lookup.zip4 /* Lookup data from MapsOnline */
data=zips /* Input data set to geocode */
out=geocodedzips; /* Specify name of Output data set of locations */
run;
data mocounties;set maps.counties;if state=29;
data mocounties;set mocounties geocodedzips;run;
proc gproject data=mocounties out=MOCountiesP; id state ; run;
data my_map locations; set mocountiesP;
if anno_flag=1 then output locations;
else output my_map;
run;
data anno_locations; set locations;
if c=1 then color="red";
if c=2 then color="bib";
if c=3 then color="viyg";
if c=4 then color="bippk";
xsys='2'; ysys='2'; hsys='3'; when='a';
function='pie'; rotate=360; size=.5;
style='psolid'; output;
length html $300;
html='title='||quote(trim(left(citystate))||'0d'x||trim(left(put(zip,z5.))));
style='pempty'; color='deb'; output;
run;
pattern1 v=e;
proc gmap data=mymap map=mymap anno=anno_locations;
id county;
choro segment / coutline=black levels=1 nolegend
coutline=gray99;run;
I had to seriously modify the starting geocode data set because you didn't paste it into a text box.
This creates a map with those 3 ZIP codes appearing.
Note that some differences are because the MY_MAP created with the geocoded data is different name than the MYMAP from your first posted code.
data WORK.GEOCODEDZIPS(label='Geocoded 15May2024'); infile datalines dlm=',' dsd truncover; length y x M_OBS 8 _matched_ $ 16 anno_flag 8 state 5 ; input Y X M_OBS _MATCHED_:$16. anno_flag state c zip:5.; label Y="Geocoded Latitude" X="Geocoded Longitude" ; datalines; 38.928422641,-92.36843533,.,ZIP mean,1,29,1,65203 38.948231,-92.327335,11274177,ZIP,1,29,2,65211 38.933052625,-92.29140936,.,ZIP mean,1,29,3,65201 ; data mocounties; set MAPSGFK.US_COUNTIES; if statecode="MO"; run; data mocounties2; set mocounties work.geocodedzips (rename=(x=long y=lat)); run; proc gproject data=mocounties2 out=MOCountiesp latlong eastlong degree; id state ; run; data my_map locations; set MOCountiesp; if anno_flag=1 then output locations; else output my_map; run; data anno_locations; set locations; if c=1 then color="red"; if c=2 then color="bib"; if c=3 then color="viyg"; if c=4 then color="bippk"; xsys='2'; ysys='2'; hsys='3'; when='a'; function='pie'; rotate=360; size=.5; style='psolid';
output; length html $300; html='title='||'0d'x||put(zip,z5. -L); style='pempty'; color='deb';
output; run; proc gmap data=my_map map=my_map anno=anno_locations; id county; choro segment / coutline=black levels=1 nolegend coutline=gray99; run; quit;
Personally for the example Zip codes I would have used some from different counties. The scale of statewide map is such that it may be impossible to see one or more zip code centers.
Your prior code for annotate data set couldn't quite be applied as there wasn't anything adding in the city variable to use in the title.
If your MAPS.COUNTIES data set is the on provided by SAS then there is no geographic information other than the projected boundary points provided by the X,Y variable.
If you have the MAPSGFK.US_COUNTIES data set that has Lat, Long variables likely to have a better shot for getting geocoded data to place correctly.
That's a great help! thanks! the only problem now is that Gproject only outputs one data point instead of the three geocoded zipcodes in the small data set. I.e.:
libname lookup "D:\Documents\desi2023\frlanalyses\geocodedata__2006__ZIP4_Geocode_Data\data";
data zips;
anno_flag=1;
state=29;
input c zip;
cards;
1 65203
2 65211
3 65201
;
proc geocode /* Invoke geocoding procedure */
method=plus4 /* Specify geocoding method */
lookup=lookup.zip4 /* Lookup data from MapsOnline */
data=zips /* Input data set to geocode */
out=geocodedzips; /* Specify name of Output data set of locations */
run;
proc contents data=MAPSGFK.US_COUNTIES;run;
data mocounties;set MAPSGFK.US_COUNTIES;if statecode="MO";run;
data mocounties;set mocounties geocodedzips;run;
proc means data=mocounties;var state;run;
proc gproject data=mocounties out=MOCountiesp latlong eastlong degree; id state ; run;
data my_map locations; set MOCountiesp;
if anno_flag=1 then output locations;
else output my_map;
run;
NOTE: There were 11766 observations read from the data set WORK.MOCOUNTIESP.
NOTE: The data set WORK.MY_MAP has 11765 observations and 18 variables.
NOTE: The data set WORK.LOCATIONS has 1 observations and 18 variables.
Can you provide the result of your GeocodedZips data set from Proc Geocode in the form of a working data step? We don't have your lookup data set to see what the result might be.
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against. Note that the code box is important as the main message window will reformat pasted text and can result in code that won't run properly.
thanks all. OK, I *think* I did this correctly:
data WORK.GEOCODEDZIPS(label='Geocoded 15May2024');
infile datalines dsd truncover;
input Y:32. X:32. M_OBS:32. _MATCHED_:$16. anno_flag:32. state:32. c:32. zip:32.;
label Y="Geocoded Latitude" X="Geocoded Longitude";
datalines;
38.928422641 -92.36843533 . ZIP mean 1 29 1 65203
38.948231 -92.327335 11274177 ZIP 1 29 2 65211
38.933052625 -92.29140936 . ZIP mean 1 29 3 65201
;;;;
or, alternatively, if you want to see the proc print. I notice it's got X and Y, but not longitude and latitude. :
38.9284 | -92.3684 | . | ZIP mean | 1 | 29 | 1 | 65203 |
38.9482 | -92.3273 | 11274177 | ZIP | 1 | 29 | 2 | 65211 |
38.9331 | -92.2914 | . | ZIP mean | 1 | 29 | 3 | 65201 |
the Gmap call is:
proc gproject data=mocounties out=MOCountiesp latlong eastlong degree; id state ; run;
so maybe it's confused given that geocode didn't produce longitude and latitude.
rerunning the proc gproject removing with latlong option, however, doesn't fix the problem. thanks for all the help! Phil
I would suggest renaming the variables to match the lat and long in the map data set.
BEFORE combining with the data and doing the projection.
Look at the selected county data a check on the sign of the long, may need to insure that matches with the values in your map set.
I did that- same result, alas- thanks for responding, though!
I had to seriously modify the starting geocode data set because you didn't paste it into a text box.
This creates a map with those 3 ZIP codes appearing.
Note that some differences are because the MY_MAP created with the geocoded data is different name than the MYMAP from your first posted code.
data WORK.GEOCODEDZIPS(label='Geocoded 15May2024'); infile datalines dlm=',' dsd truncover; length y x M_OBS 8 _matched_ $ 16 anno_flag 8 state 5 ; input Y X M_OBS _MATCHED_:$16. anno_flag state c zip:5.; label Y="Geocoded Latitude" X="Geocoded Longitude" ; datalines; 38.928422641,-92.36843533,.,ZIP mean,1,29,1,65203 38.948231,-92.327335,11274177,ZIP,1,29,2,65211 38.933052625,-92.29140936,.,ZIP mean,1,29,3,65201 ; data mocounties; set MAPSGFK.US_COUNTIES; if statecode="MO"; run; data mocounties2; set mocounties work.geocodedzips (rename=(x=long y=lat)); run; proc gproject data=mocounties2 out=MOCountiesp latlong eastlong degree; id state ; run; data my_map locations; set MOCountiesp; if anno_flag=1 then output locations; else output my_map; run; data anno_locations; set locations; if c=1 then color="red"; if c=2 then color="bib"; if c=3 then color="viyg"; if c=4 then color="bippk"; xsys='2'; ysys='2'; hsys='3'; when='a'; function='pie'; rotate=360; size=.5; style='psolid';
output; length html $300; html='title='||'0d'x||put(zip,z5. -L); style='pempty'; color='deb';
output; run; proc gmap data=my_map map=my_map anno=anno_locations; id county; choro segment / coutline=black levels=1 nolegend coutline=gray99; run; quit;
Personally for the example Zip codes I would have used some from different counties. The scale of statewide map is such that it may be impossible to see one or more zip code centers.
Your prior code for annotate data set couldn't quite be applied as there wasn't anything adding in the city variable to use in the title.
thanks so much! the actual data set I'm using is much larger, but I wanted to just give a small example of the code for problem solving. This works fine! Phil
Calling @GraphGuy (Robert.Allision)
Here's how to do it without combining/projecting/separating the map & data, by using the somewhat new feature of saving the projection parameters for the map and then applying them to the annotate dataset:
data zips;
input c zip;
cards;
1 65203
2 65211
3 65201
;
proc geocode data=zips out=geocodedzips (rename=(x=long y=lat));
run;
data mocounties; set mapsgfk.us_counties (where=(statecode='MO'));
run;
proc gproject data=mocounties out=mocounties
latlong eastlong degrees parmout=projparm;
id county;
run;
proc gproject data=geocodedzips out=geocodedzips
latlong eastlong degrees parmin=projparm parmentry=mocounties;
id;
run;
data anno_locations; set geocodedzips;
if c=1 then color="red";
if c=2 then color="bib";
if c=3 then color="viyg";
if c=4 then color="bippk";
xsys='2'; ysys='2'; hsys='3'; when='a';
function='pie'; rotate=360; size=.5;
style='psolid';
output;
length html $300;
html='title='||quote(trim(left(c))||'0d'x||trim(left(put(zip,z5.))));
style='pempty'; color='deb';
output;
run;
pattern1 v=e;
proc gmap map=mocounties data=mocounties anno=anno_locations;
id county;
choro segment / coutline=black levels=1 nolegend
coutline=gray99;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.