- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi all,
I’m working on a bit of a complicated project. I have school information data that shows the total number of math students in a given year for multiple schools. What I want to do is look at each school for each year and determine the number of math students within a 50 mile and 100 mile radius of the school. I can use the geocode function to attain the longitude and latitude of each school. Geodist will also work to add in the distance. Geodist is limited though because it only works using 1 starting point and end point at a time. I’m not sure how to go about this as I have multiple starting points and so many possible end points.
So ideally, in order to do this, I would have to calculate the distance from school A to school B&C. Then do the same for school B to all the other schools. Then school C to every school. Once I've gone through every possible combination, I can create a field that keeps only the schools with distances that are less than or equal to 50 miles. Once I have the schools that are within 50 miles of school A, I can then sum all their students to get the number of math students within 50 miles. This seems tedious to do for each school since I have so many and would have to redo it for the 100 mile radius.
Is there a better approach?
Attached is sample data and below is a sample of my desired output.
Any help is highly appreciated.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
proc sql;
create table distance_between_schools as
select t1.schoolname as SchoolStart, t2.schoolName as SchoolEnd, geodist(t1.lat, t2.lat, t1.long, t2.long) as distance
from school_list as t1
cross join school_list as t2
where t1.school_name > t2.school_name
order by 1,2,3;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
proc sql;
create table distance_between_schools as
select t1.schoolname as SchoolStart, t2.schoolName as SchoolEnd, geodist(t1.lat, t2.lat, t1.long, t2.long) as distance
from school_list as t1
cross join school_list as t2
where t1.school_name > t2.school_name
order by 1,2,3;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Proc GINSIDE to the rescue, if you have access to it.
Description from the documentation:
The GINSIDE procedure compares a data set of X and Y coordinates to a map data set containing map polygons. The procedure determines whether the X and Y points fall inside or outside of the map polygons.
So the data set of X and Y coordinates would be the location of the students (or of the contributing school, it isn't very clear from your description)
The map data set would be a set of coordinates around the school of interest. Some trig and data step to cycle through a do loop calculating the boundaries of a circle of the desired radius should work. Two circles would be two different boundaries with a different Identifier.
However, if you would have multiple schools that would have the radius overlap you may have some additional work.
I suspect this might work best with a separate data set for each school / year combination for the "center" school.
I don't open XLSX files from unknown sources. So won't go into any code.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you have SAS VA and ArcGIS option it may also do some of this.
QGIST
https://www.qgistutorials.com/en/docs/nearest_neighbor_analysis.html
The step you're on now would be referred to as "nearest neighbor analysis" IMO.