BookmarkSubscribeRSS Feed

How to Create Custom Geographic Variables for Use in VA GeoMap Hierarchies

Started ‎09-18-2020 by
Modified ‎07-06-2020 by
Views 2,826

The world is a big place!  And sometimes creating Visual Analytics reports that have detailed coordinates on a geo map can be challenging because your source data doesn't contain known geography columns from which you can a create a hierarchy.  If this is the case, when a report viewer wants to review a specific area of the map they will be forced to zoom and zoom and zoom in on the map for quite some time before they can finally view the data they are interested in. 

 

For example, imagine we have the following data containing the longitude and latitude for a set of 805 individual US zip codes:

 

Mike_Drutar_0-1593698455110.png

 

We want to visualize this in map as part of a Visual Analytics report.  Unfortunately, if we simply map these zip codes on a map, they are so tightly clustered that a user must zoom in quite a bit to view the individual data points.  So if the user is interested in the data located in Boston, Massachusetts it will take quite a bit of zooming in until they can see it:

 

ZoomMethod_sm.gif

 

This is less than ideal.  What would be great if the data contained some additional higher level geographic columns.  If it did we could create a hierarchy so the user could simply "Drill Down" on the map in just two clicks:

 

CustomHierarchy.gif

 

The drill down method for looking at the data near Boston is certainly the better way to allow our users to navigate the map.

 

Even though these additional columns aren't in our source data, that doesn't mean we can't use SAS to build them!  Because we have the full longitude and latitude for each zip code in our source data, we can use this to generate the additional columns needed for our hierarchy. This page explains how the various decimal places of a longitude and latitude coordinates represent different levels of geographic precision.  We can see that coordinates with decimal degrees of 0.1 represent a "Large City or District" and coordinates with decimal degrees of 0.001 represent a "Neighborhood or Street" and so on.  So all we need to do is create these levels of geographic precision by rounding our current longitude and latitude columns.  In this article, I'll show you how to do this!

 

Let's get started!

 

 

Step 1: Create your hierarchy

 

Let's begin by generating the sample data for our example.  As I mentioned before, the source data for the report shown above contains the longitude and latitude for a set of 805 individual US zip codes.  I created this data from the sashelp.zipcode data set.  To recreate the example source data, first we want to isolate the individual coordinates for each zip code.  We will also want to convert the zip code to a character variable so we can use it as a geography data type in our report.  This is easily done via the PROC SQL statement below:

 

proc sql;
	create table getZips as select
		put(zip,z5.) as zipcode, x, y
	from sashelp.zipcode;
quit;
A few observations from the output data set getZips are shown below:
 

 

Mike_Drutar_0-1593698455110.png

 

This three column data set will be the starting data for our example.  Now that we have the our zip codes with their respective coordinates we can begin the process of creating our hierarchy of varying levels of precision using the latitude and longitude variables.  What we need to do is take each observation and round the x and y columns to the level of precision we want.  After that, we will need to create a character variable that can serve as the category column for the geography data level.  Finally, we will add a label to our new category column. 

 

Since we want to repeat this process for each level of precision, I have wrapped the steps listed above into a macro named "buildHierarchy".  When I call the macro, I'm using it to create a hierarchy with the levels of precision at:

 

  1. Large City or District - 0.1
  2. Town or Village - 0.01
  3. Neighborhood or Street - 0.001

 

data buildHierarchy;
	set getZips;

%macro buildHierarchy(level,prec,var,label);
	lat_&var = round(y,&prec.);
	long_&var = round(x,&prec.);
	&var = "&var: Lat:" || compress(lat_&var.) || " x " || "Lon: " || compress(long_&var.);
	label &var = "&label";
%mend;

%buildHierarchy(00,.1,City,Large City or District)
%buildHierarchy(01,.01,Town,Town or Village)
%buildHierarchy(02,.001,Neighborhood,Neighborhood or Street)
;
run;

A few observations from the output data set buildHierarchy are shown below:

 

CreatedData.png

 

For this article, I've chosen to use observations where the frequency of the "city" variable is 35 or more.  This step is completely optional, however if you would like to subset your data as I did, run the PROC SQL and DATA STEP code below:

 

proc sql;
	create table getTopCities as
		select *, count(City) as totalCityObs
			from buildhierarchy 
				group by City
					having (calculated totalCityObs) > 35;
quit;

data buildhierarchy;
	set getTopCities;
	drop totalCityObs;
run;

Great!  Our data is now ready for SAS Visual Analytics!

 

Step 2: Build the report!

 

Begin by creating a new Visual Analytics report and add the data set 'buildhierarchy' to your report.

 

We will first need to convert our character columns into geographies.  Let's start with the "Large City or District" column.  From the data menu select this column and change its category to "Geography":

 

CityChangeClassification.png

 

At the next screen, for "Geography data:" select "Latitude and longitude in data".  For the Latitude and Longitude mappings choose:

 

  • Latitude (y): lat_City
  • Longitude (x): long_City

Press "OK".

 

CityChangeMapGeos.png

 

Great!  Now repeat the process for the other categorical columns.  Use the latitude and longitude mappings below:

 

  • Town or Village
    • Latitude (y): lat_Town
    • Longitude (x): long_Town
  • Neighborhood or Street
    • Latitude (y): lat_Neighborhood
    • Longitude (x): long_Neighborhood
  • zipcode
    • Latitude (y): Latitude (degrees) of the center (centroid) of ZIP Code
    • Longitude (x): Longitude (degrees) of the center (centroid) of ZIP Code.

With our geography data items in place, we can now create our hierarchy.  From the data menu, click "New data item" and select "Hierarchy":

 

CreateHierarchy.png

 

Name your hierarchy "CustomGeoHierarchy" and add the columns in the order shown below.

 

Press OK:

 

CreateHierarchy01.png

*Notice I have chosen not to use the "Neighborhood or Street" column in my hierarchy.  This is simply a design choice I made.  You can include it in your hierarchy if you like.

 

Great!  Now we can build our map!  Drag your new hierarchy onto the reporting canvas.  Since the hierarchy is made up of geographic data items a GeoMap report object will be automatically created:

 

CreateMap.png

 

By default, Visual Analytics 8.5 will create a coordinates map for you.  Since we want to point out which areas have the most zip codes we will need to change this to a Bubble Geo Map.  To do this, click the map object and select the "Options" menu on the right handed side of the screen.  From within this menu, scroll down until you see the "Coordinate" section and select "Bubble":

 

MakeBubblePlot.png

 

Great!  All that's left is to do some window dressing.  I decided to give my map a title of "Custom GeoHierarchy".  I have also removed the legend.

 

With these final edits in place we have successfully created our report!  Now that we created our custom geographic columns in SAS and used them in a hierarchy in Visual Analytics, a user can drill down to detailed data in just two clicks!

 

CustomHierarchy.gif

 

This example was created in SAS Visual Analytics 8.5.  However the concepts from this article will work in any version of SAS Visual Analytics.  Also, if you are using SAS Visual Analytics 8.5 you could also consider using the "Cluster adjacent markers" option within a coordinate geo map.  You can learn more about this options in the SAS Visual Analytics 8.5 documentation.

 

How to make this example work for you

 

This example was created in SAS Visual Analytics 8.5.  Again, for this post's data, I used the zipcode data set which is available in the sashelp library.

 

The attachments for this post are:

  • The ETL which creates the final buildhierarchy.sas7bdat dataset - buildhierarchy.sas

 

 
 

 

Version history
Last update:
‎07-06-2020 03:07 PM
Updated by:
Contributors

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Labels
Article Tags