
How to create scatterplots, bubble plots and choropleth maps in SAS (using OpenStreetMap and EsriMap)

Started ‎10-01-2024 by
Modified ‎09-20-2024 by

Data visualization has become an increasingly relevant topic. Representing data in a clear and visually appealing way makes complex data easier to understand.

This article focuses on geographical data visualization, specifically maps that depict data from subjects located in Veneto, a region of northeastern Italy.

The maps were created with the SGMAP procedure, which can overlay scatterplots, bubble plots, or text on third-party maps. The basemap can come from an online service such as Esri or OpenStreetMap.

 

Scatterplot

Let’s take the scatterdata dataset as an example. For each subject it contains an ID (subject_ID), a latitude (lat), and a longitude (lon).

data scatterdata;
input subject_ID lat lon;
datalines;
1 39.1 10.6
2 39.2 10.7
3 39.3 10.8
…;
run;

 

The following image represents a scatterplot overlaid on a Veneto basemap from Esri. In the SGMAP procedure, the esrimap statement specifies the URL of the basemap (in this case, the World Street Map from Esri). By default, SAS provides a map that encompasses all the data in the dataset. In this case the focus was on a particular location, so a where statement restricts the data to the area to be visualized.

 

image.png

 

Once the basemap is specified, the scatter statement overlays the points on the map, with longitude on the x-axis and latitude on the y-axis. The markerattrs option then sets the point appearance (in this instance, filled black circles of size 3).

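proc sgmap plotdata=scatterdata;
  /* Esri basemap: the URL must not contain spaces */
  esrimap url='https://sampleserver6.arcgisonline.com/arcgis/rest/services/World_Street_Map/MapServer';
  /* Keep only the points inside the area of interest */
  where lon > 10.4 and lat > 38.9 and lat < 40.6;
  /* Longitude on x, latitude on y; filled black circles of size 3 */
  scatter x=lon y=lat /
    markerattrs=(color=black size=3 symbol=circlefilled);
run;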
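
Before fixing the bounds in the where statement, it can help to check which coordinate range is actually present in the data. A minimal sketch, assuming the scatterdata dataset defined above:

```sas
/* Show the number of points and the minimum and maximum
   latitude and longitude found in the data */
proc means data=scatterdata n min max;
  var lat lon;
run;
```

The reported minimum and maximum give a starting point for the bounds, which can then be tightened around the location of interest.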

 

Bubble plot

Let’s take the bubbledata dataset as an example. The variables are the frequency of subjects in a place, its latitude and its longitude.

data bubbledata;
input freq lat lon;
datalines;
1 40.4 10.3
2 39.8 11.2
3 39.4 10.7
…;
run;
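
The article does not show how bubbledata was built. If one starts from subject-level data such as scatterdata, a frequency per location could be derived along these lines. The output name bubble_sketch is hypothetical, and counting subjects that share exactly the same coordinates is an assumption about what "a place" means:

```sas
/* Count the subjects recorded at each distinct pair of coordinates */
proc sql;
  create table work.bubble_sketch as
  select lat, lon, count(*) as freq
  from scatterdata
  group by lat, lon;
quit;
```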

 

The following image represents a bubble plot overlaid on a Veneto basemap sourced from OpenStreetMap. The SGMAP procedure is used once again, but in this case no basemap URL is needed because OpenStreetMap is a default service in SAS; the openstreetmap statement alone is enough. As before, a where statement restricts the data to the desired area.

 

image.png

 

To create the bubble plot, the bubble statement takes the longitude as x and the latitude as y. The bubble size is the frequency of the observations at those coordinates. Additional options set the minimum and maximum bubble radius and the bubble appearance. In this instance, pink bubbles with 30% transparency were chosen so that overlapping bubbles remain visible on the map.

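proc sgmap plotdata=bubbledata;
  /* OpenStreetMap is a default service, so no URL is needed */
  openstreetmap;
  /* Keep only the points inside the area of interest */
  where lon > 10 and lat > 38.9;
  /* Bubble size follows freq; pink fill with 30% transparency */
  bubble x=lon y=lat size=freq /
    bradiusmin=0.2 bradiusmax=0.6
    fillattrs=(color=pink transparency=.3);
run;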

 

Choropleth map

Let’s take the chorodata dataset as an example. The dataset is almost the same as the scatterdata dataset; the difference is that only observations inside the bounding box of the map are kept.

data chorodata; 
set scatterdata;
where lon between 10 and 12 and 
lat between 38.9 and 40.6;
run;

 

This is a choropleth map of the Veneto region, designed to visualize data distribution across specific areas within the region. The map consists of hexagonal containers, each representing a distinct area. The color of each container varies based on the number of records in the area. Lighter colors represent lower frequency, while darker shades indicate higher frequency.

 

image.png

 

The ods output statement redirects the data behind the fit plot to the HexMap dataset, and PROC SURVEYREG builds the grid of containers.
The option plots(nbins=70 weight=heatmap)=fit(shape=hex) specifies the desired number of containers for the map (in this case, 70x70) and their hexagonal shape. The containers take different colors based on the weight of the observations they contain.

ods output fitplot=work.HexMap;

proc surveyreg data=chorodata 
plots(nbins=70 weight=heatmap)=fit(shape=hex);
model lat=lon;
run;
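
A quick look at the captured dataset helps to understand the next steps. This sketch assumes the variable names hid, xvar, yvar and wvar, which are the ones the later data step relies on; the choice of 12 rows is arbitrary:

```sas
/* Print the first rows of the dataset captured by ODS OUTPUT */
proc print data=work.HexMap(obs=12);
  var hid xvar yvar wvar;
run;
```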

 

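Once the containers are created, PROC FORMAT defines two formats to categorize the data: the numeric format hexgrp maps a count to a group letter, and the character format $hexgrp maps each letter to the range label shown in the legend.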

proc format;
  value hexgrp 1="A"          
               2-4="B"
               5-9="C"
               10-49="D"
               50-99="E"
               100-199="F"
               200-399="G"
               400-high="H";
  value $hexgrp "A"="1"
                "B"="2-4"
                "C"="5-9"
                "D"="10-49"
                "E"="50-99"
                "F"="100-199"
                "G"="200-399"
                "H"="400+";
run;
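
To see how the two formats work together, the following sketch applies them to a few sample counts and writes the results to the log. The sample values and the variable names n, grp and range_label are chosen only for illustration:

```sas
data _null_;
  /* One sample count from each of the eight ranges */
  do n = 1, 3, 7, 25, 75, 150, 300, 500;
    grp = put(n, hexgrp.);             /* numeric count to group letter */
    range_label = put(grp, $hexgrp.);  /* group letter to range label */
    put n= grp= range_label=;
  end;
run;
```

Storing the letter and displaying the range keeps the groups in the intended order when sorted, since a label such as "10-49" would otherwise sort before "2-4".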

 

Two new datasets, hexmap_map and hexmap_resp, are created from the HexMap dataset produced in the previous step. Observations with a missing ID are excluded, and the data is processed in BY groups by ID, which assumes the observations are already ordered by ID. Every remaining observation is written to the hexmap_map dataset, which is later used as the map data.

For each ID, only the last observation is kept for the response data. The numeric hexgrp format converts the value of wvar into a group letter, which is assigned to gvar. The $hexgrp format is then attached to gvar so that each letter is displayed as its range. The result is written to the hexmap_resp dataset.

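data work.hexmap_map(keep=id x y)
     work.hexmap_resp(keep=id wvar gvar);
  set work.hexmap(rename=(hid=id xvar=x yvar=y));
  where id ne .;             /* drop observations with a missing ID */
  by id;                     /* assumes the data is ordered by ID */
  output work.hexmap_map;    /* every observation goes to the map data */
  if last.id;                /* keep one observation per ID */
  gvar=put(wvar, hexgrp.);   /* a format name needs its trailing period */
  format gvar $hexgrp.;
  output work.hexmap_resp;
run;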
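
As a check on this step, a frequency table of gvar shows how many containers fall into each group. A minimal sketch, assuming the hexmap_resp dataset created above:

```sas
/* Count the containers in each group; the $hexgrp format
   attached to gvar displays the range labels */
proc freq data=work.hexmap_resp;
  tables gvar;
run;
```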

 

The data is sorted by gvar to ensure the colors are applied in the correct order. Custom colors are then defined for the observation ranges. In this case, a color and a line type are specified for each of the eight groups.

Note that the graph displayed in the SAS output may not show the customized colors from the template. However, if the graph is saved in PNG format, the colors will be correctly applied to the map.

proc sort data=work.hexmap_resp; by gvar; run;

ods path(prepend) work.templat(update);

proc template;
define style styles.myrampstyle;
parent=styles.htmlblue;
style GraphData1 from GraphData1 / contrastcolor=CXB1E599 linestyle=1;
style GraphData2 from GraphData2 / contrastcolor=CX9EEDA8 linestyle=1;
style GraphData3 from GraphData3 / contrastcolor=CX88E55C linestyle=1;
style GraphData4 from GraphData4 / contrastcolor=CX52CC62 linestyle=1;
style GraphData5 from GraphData5 / contrastcolor=CX16A629 linestyle=1;
style GraphData6 from GraphData6 / contrastcolor=CX44A616 linestyle=1;
style GraphData7 from GraphData7 / contrastcolor=CX5B993D linestyle=1;
style GraphData8 from GraphData8 / contrastcolor=CX118044 linestyle=1;
style GraphColors from graphcolors / 
"gdata1" = CXB1E599 "gdata2" = CX9EEDA8 "gdata3" = CX88E55C 
"gdata4" = CX52CC62 "gdata5" = CX16A629 "gdata6" = CX44A616 
"gdata7" = CX5B993D "gdata8" = CX118044;
end; run;
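
A style defined this way is only used if it is selected on an output destination when the map is produced. The article does not show that step; one possible way to do it and to save the map as a PNG file is sketched below. The output folder and the image name are hypothetical, and the use of the LISTING destination is an assumption:

```sas
/* Hypothetical output folder: replace with a writable path */
ods listing gpath="/path/to/output" style=styles.myrampstyle;

/* Save graphs as PNG files under a hypothetical image name */
ods graphics / reset imagefmt=png imagename="hexmap";

/* ... run the PROC SGMAP step for the choropleth map here ... */

ods listing close;
```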

 

PROC SGMAP is used once again to create the map. The mapdata dataset defines the shape and position of the map units, while the maprespdata dataset contains the actual data to display on the map. As in the previous examples, this data is overlaid on an OpenStreetMap basemap.
The choromap statement then colors the containers based on the values of the gvar variable. Finally, a legend is added to explain the colors used in the map.

proc sgmap mapdata=work.hexmap_map maprespdata=work.hexmap_resp;
openstreetmap;
choromap gvar / name="hexes" lineattrs=(color=gray) transparency=0.25;
keylegend "hexes" / title="Nodes";
run;

 

 
