I need to take a massive file (20+ million observations) that already has lat/long coordinates and assign each point to a census tract (to get the FIPS codes). Is there a way to do this in SAS? I'm doing it in ArcGIS, which is very slow, and I think I'm on track for how to do it in R, but I'd like to know if I can do the same thing in SAS. I've never used PROC GIS or PROC GEOCODE.
Data basically looks like this:
Site Lat Long Value
1 24.857 -80.372 9.7
2 24.743 -81.273 8.2
3 25.674 -80.813 10.3
And I want something like this:
Site Lat Long Value CT_FIPS
1 24.857 -80.372 9.7 12381857463
2 24.743 -81.273 8.2 12381857253
3 25.674 -80.813 10.3 12382006321
Thanks!
The following will require a license for SAS/GRAPH, as all of the map-related procedures are part of that module. SAS/GIS would likely work as well, but I haven't actually used it, so I can't give any details of the steps.
If you have a shapefile with the census tract boundaries, you should be able to use PROC MAPIMPORT to read it into a SAS map data set.
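A minimal sketch of the import step might look like the following. The file path, the output data set name `tracts`, and the ID variable `GEOID` are assumptions for illustration; use whatever tract identifier field your shapefile actually carries.

```sas
/* Read the tract shapefile into a SAS map data set.          */
/* The path and the GEOID field name are hypothetical.        */
proc mapimport datafile="/path/to/tracts.shp" out=tracts;
  id GEOID;   /* assumed tract identifier field in the shapefile */
run;

/* Check which variables (X, Y, identifiers) were created. */
proc contents data=tracts;
run;
```

Running PROC CONTENTS afterwards is a quick way to confirm the names and types of the coordinate and identifier variables before moving on.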
Hopefully you can find which projection technique was used to create the shape file as that may be needed for the next step.
You will likely need to convert your lat and long values to radians for comparison with the map boundary points. This can be done with the GPROJECT procedure.
The SAS procedure GINSIDE can determine whether the coordinates are inside a polygon. You can request which variables from the map data set, hopefully including the census tract identifiers, are written to the output data.
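As a rough sketch of the point-in-polygon step, assuming the point data set (here called `points_xy`) already has numeric X and Y variables in the same coordinate system as the map data set `tracts`, and that `GEOID` is the tract identifier (all three names are hypothetical):

```sas
/* Assign each point the ID of the polygon that contains it.  */
/* points_xy, tracts, and GEOID are assumed names.            */
proc ginside data=points_xy map=tracts out=points_tract;
  id GEOID;   /* map variable copied onto each matching point */
run;
```

Points that fall outside every polygon should come through with a missing identifier, which makes unmatched observations easy to count afterwards.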
Thanks @ballardw! It looks like I do have SAS/Graph (plus I see something called SAS/Graph NV Workshop that I've never heard of/used).
I was able to import the tract boundaries. I can see the projection (Albers) and the projected coordinate system (USA Contiguous Albers Equal Area Conic) in ArcGIS. Based on that, I'm not sure what I need to do to my lat/long values (?).
I didn't try using GProject because I wasn't sure what to do for that, but tried GINSIDE and it keeps saying the DATA = dataset must have x and y variables, so I'm not sure if that has to do with the GProject step that I'm confused about.
All of the map-related procedures are very sensitive to the existence and type of certain variables. The coordinate variables must be named X and Y and must be numeric: X is the easting (longitude) and Y is the northing (latitude).
Note that GPROJECT has a LATLONG option to use variables named LAT and LONG, plus a DEGREES option to indicate that the values are in degrees. The option PROJECT=ALBERS should create points similar to those in your shapefile. You may also need the EASTLONG option. The output data set from GPROJECT should contain X and Y variables after projection.
The units in your LAT/LONG data are just plain incorrect for GINSIDE, which uses the projected graphing coordinates. Print a few values from the imported map data and see what the X and Y values look like.
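The GPROJECT options mentioned above could be combined roughly as follows. This is only a sketch: the data set names `points` and `points_xy` are assumptions, `Site` is used as the ID variable because it appears in the sample data, and the LATLONG option may not be available in older releases. Whether the resulting X/Y values line up with a shapefile that was projected in other software is not guaranteed, so compare the coordinate ranges before trusting the match.

```sas
/* Project LAT/LONG (in degrees, east-positive longitude) to X/Y. */
/* points and points_xy are hypothetical data set names.          */
proc gproject data=points out=points_xy
              latlong     /* read LAT and LONG instead of X and Y   */
              degrees     /* input values are degrees, not radians  */
              eastlong    /* longitude increases to the east        */
              project=albers;
  id Site;
run;
```

EASTLONG is included here because the sample longitudes are negative for locations in the western hemisphere.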
I would suggest picking a small number of points from your LAT/LONG data to test with, especially ones whose locations you already know.
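One way to set up that check, assuming the full point file is named `points` and the imported map is named `tracts` (both hypothetical names):

```sas
/* Take a small test sample from the full point file. */
data points_test;
  set points(obs=100);
run;

/* Look at a few map boundary coordinates ... */
proc print data=tracts(obs=10);
  var x y;
run;

/* ... and compare the coordinate ranges of the map and the points. */
proc means data=tracts min max;
  var x y;
run;

proc means data=points_test min max;
  var lat long;
run;
```

If the map's X and Y ranges are nowhere near the range of the point coordinates, the two data sets are not yet in the same coordinate system and GINSIDE will not find matches.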