☑ This topic is solved.
nianhui
Calcite | Level 5

Hello,

 

I'm geocoding fewer than 100 addresses using PROC GEOCODE with the downloaded 2024 Street Lookup Data for SAS 9.4.

 

The output showed good street matches, but many of the coordinates (X and Y) are way off. Here are two output examples (the matched addresses are not exactly the input addresses):

 

Matched address: "3003 Sevierville Rd, Maryville, TN 37804"  with X=-92.68680269 and Y=34.458537714

(These coordinates fall in Arkansas)

 

Matched address: "206 Debbie Ann Dr, Leander, TX 78641"  with X=-80.1951071 and Y=39.556943449

(These coordinates fall in West Virginia)

 

I wonder if anyone has used these lookup data and run into similar problems.

 

Thanks!

10 REPLIES
ballardw
Super User

Can you post the output of Proc Contents on the street lookup data set used and the Proc Geocode syntax you used?

 

If you used a SAS map data set of some flavor, the X and Y values returned may be map coordinates rather than latitude and longitude. Alternatively, you may be getting latitude and longitude values that are then treated as X, Y map display pairs, which would tend to display incorrectly on a map.

nianhui
Calcite | Level 5

 

Thanks!

 

Please see attached for the output of proc contents on the street lookup data sets.

 

Below is proc geocode syntax I used.

 

libname streets '//i110filesmb.hs.it.vumc.io/SASUSER/sasuser/nianh/margaret/geocodedata_2024_StreetLookupData_94';
proc geocode
   method=STREET
   data=forgeocode
   out=outgeocode
   lookupstreet=streets.usm
   attribute_var=(BLKGRP);
run;

ballardw
Super User

I think the issue may be the name of the state variable in the lookup data. Look at the help for the LOOKUPSTATEVAR= option: if it is not set, the default is to use a variable named STATECODE. It looks like the state variable in the USM data set is MapIDNameAbrv, so try adding this to the options in the PROC GEOCODE syntax:

 

Lookupstatevar = Mapidnameabrv

So you may be getting results for similar City and Address values in different states.

 

The Geocode procedure defaults to a lot of variable names so it is worth checking if anything goes wrong.

I have never had actual street lookup needs, so I'm not sure how the procedure complains when expected variable names aren't present. Did the log show anything that might be interpreted as an expected variable not being found?

nianhui
Calcite | Level 5

 

Thanks! I think I used the correct variable names in the input data set: "address", "city", "state" and "zip". At first, I used "postal", and the log showed an error about the variable name.

 

I tried adding

 

Lookupstatevar = Mapidnameabrv

 

but got an error message

 

ERROR: Variable MAPIDNAMEABRV not found in MAPSGFK.USCITY_ALL data set.

 

 

Tom
Super User

You did not provide the details about the variables in your INPUT datasets, just the map datasets.

nianhui
Calcite | Level 5

Please see attached for the contents of input address data.

 

Thanks!

MarciaS
SAS Employee

Hi @nianhui 

Your USP lookup data set has fewer observations than it should.

 

The ReadMe.txt file in the downloaded .zip file states that USP should have 323,712,810 observations. Your PROC CONTENTS output shows the USP data set containing 288,425,236 observations.

 

Did you receive any errors when running the ImportCSVfiles.sas program to create the lookup data?

Try running the ImportCSVfiles.sas program again to recreate the lookup data, verify that the USP data set has the expected number of observations, and then run PROC GEOCODE again.
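
One way to check the count without reading through the full PROC CONTENTS output is sketched below. It assumes the STREETS libref from the LIBNAME statement posted earlier in the thread is already assigned, and it uses the expected count quoted above, so confirm that number against your own copy of ReadMe.txt before relying on it.

```sas
/* Minimal sketch: compare the USP observation count with the ReadMe figure. */
/* Assumption: libref STREETS points at the folder holding the lookup data.  */
data _null_;
   expected = 323712810;              /* figure quoted from ReadMe.txt in this thread */
   dsid = open('streets.usp');        /* open the data set to read its attributes     */
   if dsid = 0 then do;
      msg = sysmsg();                 /* reason the open failed */
      put 'ERROR: Could not open STREETS.USP. ' msg;
   end;
   else do;
      actual = attrn(dsid, 'NLOBS');  /* number of logical observations */
      rc = close(dsid);
      if actual = expected then
         put 'NOTE: USP observation count matches the expected value: ' actual comma15.;
      else
         put 'WARNING: USP has ' actual comma15. ' observations, expected ' expected comma15.;
   end;
run;
```

The same pattern can be repeated for the USM and USS data sets if the ReadMe lists expected counts for them.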

 

I hope that helps. 

Regards,
Marcia

nianhui
Calcite | Level 5

Thank you very much! Problem solved!

 

The log file didn't officially give me any error, but I looked again and there was something wrong. Please see attached log_ImportCSVfiles (1~3). The observation count of the USP file was 288425236.

 

Then I started from scratch and re-downloaded everything (attached log_ImportCSVfiles_new and contents_usp_new). Now the observation number is 323712820.

 

All the coordinates are correct now!
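
For anyone wanting to confirm a fix like this without eyeballing each point, a rough sanity check is sketched below. It flags geocoded rows whose coordinates lie far from the centroid of the matched ZIP code in SASHELP.ZIPCODE. The variable names M_ZIP, X, and Y in the output data set, the numeric type of M_ZIP, and the 50-mile threshold are assumptions here, so adjust them to whatever your OUTGEOCODE data set actually contains.

```sas
/* Minimal sketch: list geocoded points that are far from their ZIP centroid. */
/* Assumptions: OUTGEOCODE has numeric M_ZIP plus X (longitude) and           */
/* Y (latitude) in degrees; 50 miles is an arbitrary illustrative cutoff.     */
proc sql;
   create table suspect_coords as
   select g.*,
          z.statecode as zip_state,
          geodist(g.y, g.x, z.y, z.x, 'M') as miles_from_zip_centroid
   from outgeocode as g
        inner join sashelp.zipcode as z
        on g.m_zip = z.zip
   where calculated miles_from_zip_centroid > 50;
quit;
```

An empty SUSPECT_COORDS table would be consistent with the coordinates now landing in the right place; rows such as the Tennessee address placed in Arkansas would show up with a distance of several hundred miles.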

 

Thanks again!!

Tom
Super User

Your first log does have an indication of a problem.

Screenshot 2026-01-13 at 3.32.59 PM.png

It looks like some part of at least one of the files was replaced with binary zeros instead of normal lines of text.
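
If you want to test a downloaded CSV for this before importing it, something along these lines could work. It is only a sketch: the file path is a hypothetical placeholder, and it assumes no line is longer than 32,767 bytes.

```sas
/* Minimal sketch: count lines in a CSV file that contain binary zeros. */
/* The path below is a placeholder, not a real file name.               */
filename chk 'path/to/one_of_the_csv_files.csv';

data _null_;
   infile chk lrecl=32767 truncover end=done;
   input;                                            /* load the line into _INFILE_  */
   if index(_infile_, '00'x) > 0 then bad + 1;       /* count lines with a null byte */
   if done then put 'NOTE: Lines containing binary zeros: ' bad;
run;
```

A nonzero count would suggest the download or unzip step was damaged and the file should be fetched again.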

nianhui
Calcite | Level 5
It didn't give me a red error message, so I didn't look at the log carefully at first. I thought this was just something normal when importing an enormous data set.