Hello. I have code below for calculating the shortest distance between a patient's zip code and their nearest specialty medical provider. The problem is that while I have patient zip codes, I only have city and state for a long list of medical providers. I'd be grateful for help with finding the shortest distance between a patient's zip code and the city of their nearest medical provider. I also figure that this code is probably not ideal, as I have city/state for over 400 doctors. Any suggestions for how to do this more efficiently than this?
data PatientZipDoctorCityState;
set PatientZipDoctorCityState;
if not missing(PatientZip) then PatientZipCode=put(PatientZip,$z5.);
Doctor1 =put(00000,$z5.);
Doctor2 =put(00000,$z5.);
Doctor3 =put(00000,$z5.);
Doctor4 =put(00000,$z5.);
Doctor5 =put(00000,$z5.);
Doctor6 =put(00000,$z5.);
Doctor7 =put(00000,$z5.);
Doctor8 =put(00000,$z5.);
Doctor9 =put(00000,$z5.);
if not missing(PatientZipCode) then distance=zipcitydistance(PatientZipCode,Doctor1);
array calc[9] Doctor1-Doctor9;
do i=1 to 9;
if not missing(PatientZip)
then distance=min(distance, zipcitydistance(PatientZip,calc(i))) ;
end;
drop Doctor1-Doctor9 i;
run;
Depending upon what (if anything) you want to output when you can't calculate distance, proc sql may be easier code to maintain.
e.g., the following will only output a record if there is a non-missing minimum distance:
data PatientZip; do zip=15101,15100,15116,15123,15102,15122,15123,15102,15110,15112,15116,15120; Patient+1; output; end; run; data DoctorCityState; input doctor zip; cards; 1 15101 2 15102 3 15104 4 15106 5 15108 6 15210 7 15212 8 15216 9 15220 ; proc sql; create table PatientZipDoctorCityState as select a.patient, a.zip, zipcitydistance(a.Zip,b.zip) as distance from PatientZip a, DoctorCityState b group by patient having calculated distance eq min(calculated distance) and not missing(calculated distance) ; quit;
Of course it can also be done using an array, but I wouldn't do it the way you did. Absolutely no reason to keep all of the doctor ids/names and zips for each patient. However, I won't suggest a modification if you decide to go with the sql method instead.
Art, CEO, AnalystFinder.com
This provides exactly what I need, which is the distance between the patient's zip code and the nearest doctor. But do you know of a way to do this with city and state data instead of zip codes? I only have city and state for each doctor, but I have zip code for each patient.
For each city, either take a representative ZIP that works for you or find all zips allocated to that city and use an average of the latitude/longitude instead. You're going to have make assumptions or some rule, so just make it consistent and if you have GIS skills and/or are working with a small area make a map so you understand what those assumptions may mean.
If you have lat/long for zips you can use GEODIST() instead.
SAS has several files with the lat/long for cities as well as the lat/long for zip codes.
They're in the SASMAPS and GFSK libraries these days.
Not sure how you would select a centralized zip for a large city. Here is a start, at least, for getting zip from city and state:
/* create DoctorCityState for testing purposes; */ data need(drop=wantzip); input doctor wantzip; city=scan(zipcity(wantzip),1,','); state=zipstate(wantzip); cards; 1 15101 2 15102 3 15104 4 15106 5 15108 6 15210 7 15212 8 15216 9 15220 ; proc sql; create table DoctorCityState as select a.doctor,a.city,a.state,b.zip from DoctorCityState a, sashelp.zipcode b group by a.doctor,b.city,b.state having upcase(a.city) eq upcase(b.city) and upcase(a.state) eq upcase(b.statecode) and b.zip eq min(b.zip) ; quit; /* END: create DoctorCityState for testing purposes; */
Art, CEO, AnalystFinder.com
See if you have access to SAS/GIS with a road map set. This is really a GIS problem as a short distance may equal a long drive time. I can show you some 15 mile drives that are much longer to drive than 30 or 40 miles of freeway.
Alternatively this might be a linear programming program if you have a database of from/to coordinates with travel time/distance information. If you have, or can find such information, that could be a SAS/OR problem.
Do you have the street address of the clinic, if not the ZIP? If so, SAS has a geocoding option that will geocode addresses.
If you search LexJansen.com there's some papers on using Google to get driving distances instead of distances 'as the crow flies'.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.