Hi. This is my first time using PROC GMAP to build a world map that visualizes the number of studies on a topic.
Here's where I am with a trial image. So far it appears to be doing what I expect.
I do have a couple of warnings in my log that I wanted to ask about.
4267 title;
4268 /* Step 5: Generate choropleth map */
4269 proc gmap map=merged_map_clean data=merged_map_clean all;
4270 id idname;
4271 choro study_count2 / levels=5
4272 coutline=grayaa /* gray borders for countries WITH studies */
4273 missing
4274 legend=legend1;
4275 format study_count2 studyfmt.;
4276 run;
WARNING: Some observations were discarded when charting study_count2. Only first matching
observation was used. Use STATISTIC= option for summary statistics.
WARNING: Some observations were discarded when charting study_count2. Only first matching
observation was used. Use STATISTIC= option for summary statistics.
NOTE: Foreground color WHITE same as background. Part of your graph might not be visible.
NOTE: 47729 bytes written to C:\Users\XXXXXXXXXXXXX\gmap30.png.
WARNING: The specified value of 16.5833 inches for HSIZE= is larger than 16.0000 inches which
is the maximum for the device WIN. HSIZE is ignored.
WARNING: The specified value of 8.7188 inches for VSIZE= is larger than 8.3958 inches which is
the maximum for the device WIN. VSIZE is ignored.
4277 quit;
1. Does the first set of warnings refer to the multiple rows per country in the mapsgfk.world map I merged with? That code is here:
proc sort data=mapsgfk.world out=map_sorted; by idname; run;
data merged_map_clean;
merge map_sorted(in=a) country_fixed(in=b);
by idname;
if a; /* Keep all countries from map */
/* Exclude Antarctica */
if upcase(idname) = "ANTARCTICA" then delete;
/* Create categorized study count variable */
if missing(study_count) then study_count2 = 0;
else if study_count = 1 then study_count2 = 1;
else if 1 < study_count <= 5 then study_count2 = 2;
else if 5 < study_count <= 10 then study_count2 = 3;
else if 10 < study_count then study_count2 = 4;
run;
2. Where are the size warnings coming from? I haven't set a size anywhere, and I can't figure out how to change it or whether I need to.
Also, I've attached the countries and study counts I used to populate my image in this test run.
Thanks for your help!
Answer to question 1:
The warning message comes from the statistic= option:
- STATISTIC=FIRST | SUM | FREQUENCY | MEAN
specifies the statistic for GMAP to chart. For character variables, FREQUENCY is the only allowed value--any other value is changed to FREQUENCY and a warning is issued. The frequency of a variable does not include missing values unless the MISSING option is specified.
FIRST: GMAP matches the first observation from the DATA= data set and charts the response value from this observation only. This is the default. If more rows exist that are not processed, a warning is issued to the log.
SUM: All observations matching a given ID value are added together and the summed value is charted.
FREQUENCY: A count of all rows with nonmissing values is charted unless you specify the MISSING option.
MEAN: All observations matching a given ID value are added together and then divided by the number of non-missing observations matched. This value is then charted unless you specify the MISSING option.
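So yes, the warning refers to the multiple rows per country: merged_map_clean is used as both MAP= and DATA=, so each idname has many rows (one per boundary point), and GMAP charts only the first one. Adding statistic=first to the CHORO statement makes the warning go away.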
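As a rough sketch of the two ways to deal with this warning (not tested against the poster's data): the first call simply states the default explicitly; the second reduces the response data to one row per country so that there is nothing for GMAP to discard. The data set name response_one_row is made up for illustration, and the second option assumes study_count2 is constant within each idname, which should hold if country_fixed has one row per country.

```sas
/* Option 1: same call as in the question, with the default statistic stated explicitly */
proc gmap map=merged_map_clean data=merged_map_clean all;
  id idname;
  choro study_count2 / levels=5
        statistic=first
        coutline=grayaa
        missing
        legend=legend1;
  format study_count2 studyfmt.;
run;
quit;

/* Option 2: keep one response row per country (response_one_row is a hypothetical name) */
proc sort data=merged_map_clean(keep=idname study_count2)
          out=response_one_row nodupkey;
  by idname;
run;

proc gmap map=merged_map_clean data=response_one_row all;
  id idname;
  choro study_count2 / levels=5
        coutline=grayaa
        missing
        legend=legend1;
  format study_count2 studyfmt.;
run;
quit;
```

Either way the map should look the same, since every row for a country carries the same study_count2 value.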
Is this what you want?
data world(index=(idx=(id Segment)));
set mapsgfk.world(where=(CONT ne 97)); *Get rid of Antarctica Continent;
if ID='US' then group=4; *USA;
else if ID='CA' then group=3; *Canada;
else if ID='BR' then group=2; *Brazil;
else if ID='AU' then group=1; *Australia;
else if ID=:'CN' then group=1; *China;
else if ID='GB' then group=3; *United Kingdom;
else if ID='DE' then group=4; *Germany;
else group=0; *Others;
run;
data world2;
set world;
by id Segment;
if first.Segment then PolyID+1; /* create ID variable for polygons */
run;
proc sort data=world2;
by group;
run;
proc format;
value fmt
0='0 studies'
1='1 study'
2='2-5 studies'
3='6-10 studies'
4='>10 studies'
;
run;
ods graphics/width=1000px height=600px;
proc sgplot data=world2 noborder /*aspect=0.5 */ ;
styleattrs datacolors=(white CXBDD7E7 CX6BAED6 CX3182BD CX08519C);
polygon x=x y=y ID=PolyID / fill outline group=group lineattrs=(color=grayaa) ;
xaxis display=none offsetmin=0 offsetmax=0;
yaxis display=none offsetmin=0 offsetmax=0 ;
keylegend / position=bottom location=outside noborder title='Study Count' ;
format group fmt.;
run;
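The group values above are hard-coded for a handful of countries as a demonstration. A possible way to derive them from the actual counts instead is sketched below. It assumes the country_fixed data set from the question has one row per country with idname and study_count, and that its idname values match those in mapsgfk.world; the names world_sorted, world_counts and world_poly are made up for illustration.

```sas
/* Sort the map by country name; PROC SORT keeps the original */
/* row order within each idname by default                    */
proc sort data=mapsgfk.world(where=(cont ne 97)) out=world_sorted;
  by idname;
run;

/* Attach the study counts and bin them the same way as study_count2 */
data world_counts;
  merge world_sorted(in=inmap) country_fixed(keep=idname study_count);
  by idname;
  if inmap;                                     /* keep every map row */
  if missing(study_count) then group = 0;
  else if study_count = 1 then group = 1;
  else if 1 < study_count <= 5 then group = 2;
  else if 5 < study_count <= 10 then group = 3;
  else if 10 < study_count then group = 4;
run;

/* One polygon ID per country segment */
data world_poly;
  set world_counts;
  by idname id segment notsorted;
  if first.segment then PolyID + 1;
run;

proc sort data=world_poly;
  by group;
run;
```

The PROC FORMAT and PROC SGPLOT steps can then be run unchanged with data=world_poly in place of data=world2.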
Thank you for the recommendation, @Ksharp .
For some reason, I lost some of my country data when I tried your code.