BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
BrandonD
Fluorite | Level 6

I tried to look around the community and other resources for an answer on this topic, but failed to locate an answer on this specific question in SAS Visual Analytics.

 

The issue we're seeing is related to creating a zip code geo-variable on SAS Visual Analytics. Currently, we have several separate datasets that contain Zip Code columns, all of which seem to incorrectly match the shape-file when converting them into a geo-variable from a character variable. Here is some background information on the data:

 

  • Data is imported into CAS via ODBC connection to Database.
  • The column is imported as a character, not as a measure. This means that all values retain the leading zero into SAS VA. (See attached photo - "values not matched" section of edit prompt)
  • Our shape file correctly maps a vast majority, if not all zip codes that do not have leading 0's.
  • After investigating, we are certain it is the leading zeros. The proportion of non-matched values are always consistent with the proportion of zip codes having a leading zero.

It's also been observed that Puerto Rico seems to have issues displaying zip codes, as well as New Jersey. An example would be a hierarchy that drills down to Bergen County in NJ, but no zip codes are displayed due to leading zeros in each zip code for that county.

 

 

Our best hypothesis:

  1. We imported the shape file incorrectly, which led to those values not being matched in any case due to an improper column format in the shape file.
  2. The Zip Code values need to be in a proper format before conversion to Geo-variable.
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

What is the structure of the two datasets?

You need to either make the ZIPCODE variable numeric everywhere or character everywhere. If it is character then the values have to be consistent in how the digits are written into the character variable.  All with leading zeros. All without leading zeros and left aligned. All without leading zeros and right aligned.

View solution in original post

7 REPLIES 7
Tom
Super User Tom
Super User

What is the structure of the two datasets?

You need to either make the ZIPCODE variable numeric everywhere or character everywhere. If it is character then the values have to be consistent in how the digits are written into the character variable.  All with leading zeros. All without leading zeros and left aligned. All without leading zeros and right aligned.

BrandonD
Fluorite | Level 6

This led me to the right place - The shape file had a numerical zip and the database zip had a varchar zip. The zip codes did not have to be in the 5 digit format as I had originally suspected to successfully match. I'm still seeing around 93% match, but I'm assuming the rest are probably related to the same issue and can be fixed with an updated zip shapefile.

 

Thanks!

Tom
Super User Tom
Super User
If your original data had char ZIP code then you might have some non-numeric values. For example you might have zip+4 values like '12345-1234'.
Sam_SAS
SAS Employee

Hello Brandon,

 

Yes, leading zeros are probably the issue.

 

In VA, you can use string manipulation operators like Substring to get rid of the leading zeros.

 

I believe that you'll want to do something like this, to get just the last 5 characters of the ZIP value:

 

IF (getLength('zip'n) > 5)

RETURN Substring('zip'n, getLength('zip'n) - 4, getLength('zip'n)) 

ELSE 'zip'n

 

Does that help?

 

Thanks,

Sam

Sam_SAS
SAS Employee

On further review, I am wondering if your issue is that the leading zeros that are supposed to be there are missing. (PR zip codes start with 00, I believe).

In that case you might want to use the Concatenate operator to prepend '0' as needed until getLength returns 5.

BrandonD
Fluorite | Level 6
This was a good start for me but without having to do the adjustment via the calculated Item I used Parse to change the (database) varchar Zip to a NUMX and boom - length problem solved!
Sam_SAS
SAS Employee
Excellent! Thanks for following up

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Tips for filtering data sources in SAS Visual Analytics

See how to use one filter for multiple data sources by mapping your data from SAS’ Alexandria McCall.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 2015 views
  • 3 likes
  • 3 in conversation