wernie
Quartz | Level 8

I need to take a massive file (20+ million observations) that already has lat/long coordinates and assign each observation to a census tract (to get the FIPS codes). Is there a way to do this in SAS? I'm doing it in ArcGIS and it's very slow. I think I'm on track for how to do it in R, but I'd like to know if I can do the same thing in SAS. I've never used PROC GIS or PROC GEOCODE.

 

Data basically looks like this:

 

Site     Lat        Long        Value
1        24.857     -80.372     9.7
2        24.743     -81.273     8.2
3        25.674     -80.813     10.3

 

And I want something like this:

 

Site     Lat        Long        Value     CT_FIPS
1        24.857     -80.372     9.7       12381857463
2        24.743     -81.273     8.2       12381857253
3        25.674     -80.813     10.3      12382006321

 

Thanks!

5 REPLIES
ballardw
Super User

The following requires a license for SAS/GRAPH, as all of the map-related procedures are part of that module. SAS/GIS would likely work as well, but I haven't used it, so I can't give any details of the steps.

 

If you have a shapefile with the census tract boundaries, you should be able to use PROC MAPIMPORT to read it into a SAS map data set.
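A minimal sketch of that import step. The file path, the output data set name, and the GEOID identifier variable are assumptions for illustration; use whatever tract ID field your shapefile actually carries.

```sas
/* Sketch only: the path and the GEOID variable name are hypothetical. */
proc mapimport datafile="/path/to/tracts.shp" out=work.tracts;
  id GEOID;   /* assumed tract identifier field in the shapefile */
run;

/* Inspect the imported variables and a few boundary points (X, Y). */
proc contents data=work.tracts;
run;

proc print data=work.tracts(obs=5);
run;
```

Looking at the first few X and Y values shows right away whether the map is stored in degrees or in projected units.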

Hopefully you can find which projection technique was used to create the shape file as that may be needed for the next step.

You will likely need to convert your lat and long values to radians for comparison with the map boundary points. This can be done with the GPROJECT procedure.

 

The SAS procedure GINSIDE can determine whether the coordinates are inside a polygon. You can request which variables from the map data set, hopefully including the census tract identifiers, appear in the output data.
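A rough sketch of the GINSIDE call, assuming a point data set WORK.SITES_XY and a map data set WORK.TRACTS that both have numeric X and Y in the same coordinate system, and a tract identifier named GEOID. All three names are hypothetical.

```sas
/* Sketch only: WORK.SITES_XY, WORK.TRACTS, and GEOID are assumed names. */
/* Both data sets need numeric X and Y in the same coordinate system.   */
proc ginside data=work.sites_xy map=work.tracts out=work.sites_tract;
  id GEOID;   /* map variable identifying each tract polygon */
run;

/* Count points that did not fall inside any tract (missing ID). */
proc sql;
  select count(*) as unmatched
  from work.sites_tract
  where missing(GEOID);
quit;
```

A large unmatched count usually suggests that the points and the map are not in the same coordinate system.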

wernie
Quartz | Level 8

Thanks @ballardw! It looks like I do have SAS/Graph (plus I see something called SAS/Graph NV Workshop that I've never heard of/used).

 

I was able to import the tract boundaries. I can see the projection (Albers) and the projected coordinate system (USA Contiguous Albers Equal Area Conic) in ArcGIS. Based on that, I'm not sure what I need to do to my lat/long values (?).

 

I didn't try using GPROJECT because I wasn't sure what to do for that. I tried GINSIDE, but it keeps saying the DATA= data set must have x and y variables, so I'm not sure if that has to do with the GPROJECT step that I'm confused about.

ballardw
Super User

All of the map-related procedures are very sensitive to the existence and type of certain variables. The X, Y locations must be named X and Y and must be numeric: X is the easting (longitude) and Y is the northing (latitude). Note that GPROJECT has a LATLONG option to use variables named LAT and LONG, plus a DEGREES option to indicate that the values are in degrees. The option PROJECT=ALBERS should create points similar to those in your shapefile. You may also need the EASTLONG option. The data set that GPROJECT produces should contain X and Y variables after projection. The units of your LAT and LONG data are simply wrong for GINSIDE, which compares the points against the projected coordinates of the map. Print a few values from the imported map data and see what the X, Y values look like.
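The options above could be combined roughly as follows. This is a sketch, assuming the point data set is WORK.SITES with numeric variables SITE, LAT, and LONG (degrees, negative longitude in the western hemisphere); the data set names are hypothetical. Whether the resulting X and Y line up with an imported map depends on the projection parameters used for that map, which is why comparing printed values matters.

```sas
/* Sketch only: WORK.SITES and WORK.SITES_XY are assumed names. */
proc gproject data=work.sites out=work.sites_xy
              latlong     /* read LAT and LONG instead of X and Y  */
              degrees     /* input values are degrees, not radians */
              eastlong    /* longitude increases to the east       */
              project=albers;
  id site;
run;

/* Compare these X, Y values with a few rows of the imported map. */
proc print data=work.sites_xy(obs=5);
run;
```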

 

I would suggest picking a small number of points from your LAT LONG data to test with, especially ones whose locations you already know.
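For instance, a small test data set could be built from the three sample rows in the question. The data set name WORK.SITES_TEST is an assumption.

```sas
/* Three sample points from the question, for a quick end-to-end test. */
data work.sites_test;
  input site lat long value;
  datalines;
1 24.857 -80.372 9.7
2 24.743 -81.273 8.2
3 25.674 -80.813 10.3
;
run;
```

Running the projection and GINSIDE steps on a handful of rows like these makes it much easier to spot a coordinate mismatch than on the full 20-million-row file.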

 

Reeza
Super User
Run the example in the documentation first, from GINSIDE. Make sure it works and then start replacing the components step by step with your data and making sure it matches the structure of the input data. This is the easiest way to get it working IMO.
Reeza
Super User
ArcGIS or QGIS is likely to be the fastest. Turn off the display while doing the calculation/join, though.
