I am calculating pairwise geodesic distances between street address using latitude and longitude. I would like to use "proc distance" to automatically do this for me. Does anyone know if geodesic distance can be calculated using proc distance?
data have;
format latitude 8.5;
format longitude 8.5;
input Application_id $10. Latitude Longitude;
datalines;
1000000999 -75.0000 50.0000
1000000888 -90.0000 30.0000
1000000777 -75.0000 50.0500
1000000666 -75.0000 50.0400
1000000555 -80.0000 35.0000
1000000444 -75.0300 50.0000
;
Here's the output I want:
dist_1000000999 | dist_1000000888 | dist_1000000777 | dist_1000000666 | dist_1000000555 | dist_1000000444 | id_variable |
0 | dist_1000000999 | |||||
11391.48746 | 0 | dist_1000000888 | ||||
0.898135642 | 11391.48746 | 0 | dist_1000000777 | |||
0.718508521 | 11391.48746 | 0.179627134 | 0 | dist_1000000666 | ||
410.9346807 | 11738.41243 | 411.3264781 | 411.2480465 | 0 | dist_1000000555 | |
2.081141049 | 11393.5686 | 2.266323028 | 2.201452738 | 409.062496 | 0 | dist_1000000444 |
Here's an approach that doesn't use PROC DISTANCE, but uses the GEODIST function instead. It doesn't produce a matrix in the form that you asked for, but you can easily manipulate the output to get to that.
data have;
input id $10. latitude longitude;
datalines;
1000000999 -75.0000 50.0000
1000000888 -90.0000 30.0000
1000000777 -75.0000 50.0500
1000000666 -75.0000 50.0400
1000000555 -80.0000 35.0000
1000000444 -75.0300 50.0000
;
data want;
format latitude longitude la2 lo2 10.5 distance 15.5;
set have nobs=howmany;
do j=_n_+1 to howmany;
set have (rename=(id=id2 latitude=la2 longitude=lo2)) point=j;
distance = geodist(latitude,longitude,la2,lo2,'m');
output;
end;
run;
Here are the output records of data set WANT ... ID ID2 and distance in miles:
1 1000000999 1000000888 11391.48746
2 1000000999 1000000777 0.89814
3 1000000999 1000000666 0.71851
4 1000000999 1000000555 410.93468
5 1000000999 1000000444 2.08114
6 1000000888 1000000777 11391.48746
7 1000000888 1000000666 11391.48746
8 1000000888 1000000555 11738.41243
9 1000000888 1000000444 11393.56860
10 1000000777 1000000666 0.17963
11 1000000777 1000000555 411.32648
12 1000000777 1000000444 2.26632
13 1000000666 1000000555 411.24805
14 1000000666 1000000444 2.20145
15 1000000555 1000000444 409.06250
If you really want a matrix with those zeroes in the diagonal, first change the loop in the data step as follows ...
from ... do j=_n_+1 to howmany;
to ... do j=_n_ to howmany;
then use PROC TRANSPOSE to flip the data into matrix form:
proc transpose data=want out=want (drop=_name_) prefix=d;
var distance;
id id2;
by id notsorted;
run;
and you get ...
Obs id d1000000999 d1000000888 d1000000777 d1000000666 d1000000555 d1000000444
1 1000000999 0.00000 11391.48746 0.89814 0.71851 410.93468 2.08114
2 1000000888 . 0.00000 11391.48746 11391.48746 11738.41243 11393.56860
3 1000000777 . . 0.00000 0.17963 411.32648 2.26632
4 1000000666 . . . 0.00000 411.24805 2.20145
5 1000000555 . . . . 0.00000 409.06250
6 1000000444 . . . . . 0.00000
For large distances (>1000km) this is probably the best approach. But for shorter distances, it is cheaper to convert the coordinates to UTM and to calculate Euclidian distances with proc distance.
Ideally proc distance should support angular distances...
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
The rapid growth of AI technologies is driving an AI skills gap and demand for AI talent. Ready to grow your AI literacy? SAS offers free ways to get started for beginners, business leaders, and analytics professionals of all skill levels. Your future self will thank you.