BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Lobbie
Obsidian | Level 7

Hi,

 

I have a SAS dataset with about 200K records containing TranID, Lats and Longs.  Anyone know how can I do reverse geocoding to get the Zip Codes?  I found something about a dataset in SAS Help library but it returned only the Counties and the lats and longs need to be centoids.

 

Thanks,

Will

1 ACCEPTED SOLUTION

Accepted Solutions
rogerjdeangelis
Barite | Level 11
WPS/SAS/R: Given Latitude and Longitude get postal_code and other geography

If you have IML interface with R you can just paste the code below

The free express edition of WPS does not limit the size of the SAS dataset
produced by R

Package also does geocoding

HAVE
====

Up to 40 obs from sd1.have total obs=1

Obs       LON        LAT

 1     -73.9738    40.7625

WANT
===

Up to 40 obs WORK.RESWPSXPO total obs=11

Obs    NAME                           VALUE

 11    POSTAL_CODE                    10022

  1    ADDRESS                        Trump, 725 5th Ave, New York, NY 10022, USA
  2    PREMISE                        Trump
  3    STREET_NUMBER                  725
  4    ROUTE                          5th Avenue
  5    NEIGHBORHOOD                   Midtown
  6    POLITICAL                      Manhattan
  7    LOCALITY                       New York
  8    ADMINISTRATIVE_AREA_LEVEL_2    New York County
  9    ADMINISTRATIVE_AREA_LEVEL_1    New York
 10    COUNTRY                        United States


WORKING CODE
============

   WPS/R

    res <- revgeocode(lonlat_sample, output="more");

  You can also geocode

    geo <- geocode("725 fifth avenue, New York NY")
    yeild see code at end

       [1] -73.97378  40.76245
       [1] 10022
       [1] Midtown

FULL SOLUTION
=============

%utl_submit_wps64('
libname sd1 "d:/sd1";
options set=R_HOME "C:/Program Files/R/R-3.3.2";
libname wrk "%sysfunc(pathname(work))";
proc r;
submit;
source("C:/Program Files/R/R-3.3.2/etc/Rprofile.site", echo=T);
library(haven);
library(ggmap);
lonlat_sample<-as.numeric(read_sas("d:/sd1/have.sas7bdat"));
res <- revgeocode(lonlat_sample, output="more");
endsubmit;
import r=res data=wrk.reswps;
run;quit;
');

proc transpose data=reswps out=reswpsxpo;
var _all_;
run;quit;


p to 40 obs from reswps total obs=1

                                                          STREET_
bs                    ADDRESS                    PREMISE  NUMBER     ROUTE

1   Trump, 725 5th Ave, New York, NY 10022, USA   Trump     725    5th Avenue


                                   ADMINISTRATIVE_  ADMINISTRATIVE_                 POSTAL
NEIGHBORHOOD  POLITICAL  LOCALITY   AREA_LEVEL_2     AREA_LEVEL_1       COUNTRY      CODE

  Midtown     Manhattan  New York  New York County     New York      United States   10022

%utl_submit_r64('
source("c:/Program Files/R/R-3.3.2/etc/Rprofile.site",echo=T);
library("ggmap");
lonlat_sample <- as.numeric(geocode("725 fifth avenue, New York NY"));
class(lonlat_sample);
lonlat_sample;
res <- revgeocode(lonlat_sample, output="more");
res$postal_code;
res$neighborhood;
');

View solution in original post

8 REPLIES 8
Reeza
Super User

Look at PROC GINSIDE. 

 

 

 

LaurieF
Barite | Level 11

You'll need to get hold of a shape file, then run proc ginside to do the point-in-polygon processing. I did this successfully a few years ago at NZ Police to get meshblocks, which could be rolled up to larger areas groups. 

 

These shape files contain all the points bounding the area, which can in fact be multiple shapes. Read in the shape file via proc mapimport, then run over the generated dataset with your 200k rows with ginside, and the counties will automagically be included.

 

Often the shape files can be very wide, with all the higher grouping codes and texts, but you can drop those out with a datastep after the importing. It's vital to retain the internal order of the shape file - don't drop observations, just variables.

 

As an indication, a shape file with forty-odd million rows, against a dataset with 1.6 million locations, took around 3-4 hours to run initially. Because the location dataset was being incrementally updated, the following runs on unique locations could zip through (puns always intended) in 3-4 minutes.

art297
Opal | Level 21

I only have access to SAS UE, so can't test any of this.

 

However, I think that the following articles are what you're looking for:

 

The first one shows you how to create a zipcode map
http://support.sas.com/resources/papers/proceedings13/214-2013.pdf

 

The second one shows you how to obtain the zipcodes from that map, based on only having a pair of grid coordinates, using proc ginside: http://support.sas.com/documentation/cdl/en/graphref/65389/HTML/default/viewer.htm#p0qjcc8hugcjb2n1x...

 

Art, CEO, AnalystFinder.com

 

rogerjdeangelis
Barite | Level 11
WPS/SAS/R: Given Latitude and Longitude get postal_code and other geography

If you have IML interface with R you can just paste the code below

The free express edition of WPS does not limit the size of the SAS dataset
produced by R

Package also does geocoding

HAVE
====

Up to 40 obs from sd1.have total obs=1

Obs       LON        LAT

 1     -73.9738    40.7625

WANT
===

Up to 40 obs WORK.RESWPSXPO total obs=11

Obs    NAME                           VALUE

 11    POSTAL_CODE                    10022

  1    ADDRESS                        Trump, 725 5th Ave, New York, NY 10022, USA
  2    PREMISE                        Trump
  3    STREET_NUMBER                  725
  4    ROUTE                          5th Avenue
  5    NEIGHBORHOOD                   Midtown
  6    POLITICAL                      Manhattan
  7    LOCALITY                       New York
  8    ADMINISTRATIVE_AREA_LEVEL_2    New York County
  9    ADMINISTRATIVE_AREA_LEVEL_1    New York
 10    COUNTRY                        United States


WORKING CODE
============

   WPS/R

    res <- revgeocode(lonlat_sample, output="more");

  You can also geocode

    geo <- geocode("725 fifth avenue, New York NY")
    yeild see code at end

       [1] -73.97378  40.76245
       [1] 10022
       [1] Midtown

FULL SOLUTION
=============

%utl_submit_wps64('
libname sd1 "d:/sd1";
options set=R_HOME "C:/Program Files/R/R-3.3.2";
libname wrk "%sysfunc(pathname(work))";
proc r;
submit;
source("C:/Program Files/R/R-3.3.2/etc/Rprofile.site", echo=T);
library(haven);
library(ggmap);
lonlat_sample<-as.numeric(read_sas("d:/sd1/have.sas7bdat"));
res <- revgeocode(lonlat_sample, output="more");
endsubmit;
import r=res data=wrk.reswps;
run;quit;
');

proc transpose data=reswps out=reswpsxpo;
var _all_;
run;quit;


p to 40 obs from reswps total obs=1

                                                          STREET_
bs                    ADDRESS                    PREMISE  NUMBER     ROUTE

1   Trump, 725 5th Ave, New York, NY 10022, USA   Trump     725    5th Avenue


                                   ADMINISTRATIVE_  ADMINISTRATIVE_                 POSTAL
NEIGHBORHOOD  POLITICAL  LOCALITY   AREA_LEVEL_2     AREA_LEVEL_1       COUNTRY      CODE

  Midtown     Manhattan  New York  New York County     New York      United States   10022

%utl_submit_r64('
source("c:/Program Files/R/R-3.3.2/etc/Rprofile.site",echo=T);
library("ggmap");
lonlat_sample <- as.numeric(geocode("725 fifth avenue, New York NY"));
class(lonlat_sample);
lonlat_sample;
res <- revgeocode(lonlat_sample, output="more");
res$postal_code;
res$neighborhood;
');

rogerjdeangelis
Barite | Level 11
Forgot the input SAS dataset

options validvarname=upcase;
libname sd1 "d:/sd1";
data sd1.have;
lon=-73.97378;
lat=40.76245;
run;quit;


Lobbie
Obsidian | Level 7

Thank you all and especially to @rogerjdeangelis

Lobbie
Obsidian | Level 7

Hi @rogerjdeangelis,

 

I am getting the following error when trying out your solution in WPS.  Any clues?

 

2017-04-09_11-25-49.png

Thanks,

Will

rogerjdeangelis
Barite | Level 11

Check out

 

http://blogs.sas.com/content/iml/2011/05/13/calling-r-from-sasiml-software.html

 

or 

 

use the excellect SAS help facilities.

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 3291 views
  • 8 likes
  • 5 in conversation