I am calculating pairwise geodesic distances between street address using latitude and longitude. I would like to use "proc distance" to automatically do this for me. Does anyone know if geodesic distance can be calculated using proc distance?
data have;
format latitude 8.5;
format longitude 8.5;
input Application_id $10. Latitude Longitude;
datalines;
1000000999 -75.0000 50.0000
1000000888 -90.0000 30.0000
1000000777 -75.0000 50.0500
1000000666 -75.0000 50.0400
1000000555 -80.0000 35.0000
1000000444 -75.0300 50.0000
;
Here's the output I want:
dist_1000000999 | dist_1000000888 | dist_1000000777 | dist_1000000666 | dist_1000000555 | dist_1000000444 | id_variable |
0 | dist_1000000999 | |||||
11391.48746 | 0 | dist_1000000888 | ||||
0.898135642 | 11391.48746 | 0 | dist_1000000777 | |||
0.718508521 | 11391.48746 | 0.179627134 | 0 | dist_1000000666 | ||
410.9346807 | 11738.41243 | 411.3264781 | 411.2480465 | 0 | dist_1000000555 | |
2.081141049 | 11393.5686 | 2.266323028 | 2.201452738 | 409.062496 | 0 | dist_1000000444 |
Here's an approach that doesn't use PROC DISTANCE, but uses the GEODIST function instead. It doesn't produce a matrix in the form that you asked for, but you can easily manipulate the output to get to that.
data have;
input id $10. latitude longitude;
datalines;
1000000999 -75.0000 50.0000
1000000888 -90.0000 30.0000
1000000777 -75.0000 50.0500
1000000666 -75.0000 50.0400
1000000555 -80.0000 35.0000
1000000444 -75.0300 50.0000
;
data want;
format latitude longitude la2 lo2 10.5 distance 15.5;
set have nobs=howmany;
do j=_n_+1 to howmany;
set have (rename=(id=id2 latitude=la2 longitude=lo2)) point=j;
distance = geodist(latitude,longitude,la2,lo2,'m');
output;
end;
run;
Here are the output records of data set WANT ... ID ID2 and distance in miles:
1 1000000999 1000000888 11391.48746
2 1000000999 1000000777 0.89814
3 1000000999 1000000666 0.71851
4 1000000999 1000000555 410.93468
5 1000000999 1000000444 2.08114
6 1000000888 1000000777 11391.48746
7 1000000888 1000000666 11391.48746
8 1000000888 1000000555 11738.41243
9 1000000888 1000000444 11393.56860
10 1000000777 1000000666 0.17963
11 1000000777 1000000555 411.32648
12 1000000777 1000000444 2.26632
13 1000000666 1000000555 411.24805
14 1000000666 1000000444 2.20145
15 1000000555 1000000444 409.06250
If you really want a matrix with those zeroes in the diagonal, first change the loop in the data step as follows ...
from ... do j=_n_+1 to howmany;
to ... do j=_n_ to howmany;
then use PROC TRANSPOSE to flip the data into matrix form:
proc transpose data=want out=want (drop=_name_) prefix=d;
var distance;
id id2;
by id notsorted;
run;
and you get ...
Obs id d1000000999 d1000000888 d1000000777 d1000000666 d1000000555 d1000000444
1 1000000999 0.00000 11391.48746 0.89814 0.71851 410.93468 2.08114
2 1000000888 . 0.00000 11391.48746 11391.48746 11738.41243 11393.56860
3 1000000777 . . 0.00000 0.17963 411.32648 2.26632
4 1000000666 . . . 0.00000 411.24805 2.20145
5 1000000555 . . . . 0.00000 409.06250
6 1000000444 . . . . . 0.00000
For large distances (>1000km) this is probably the best approach. But for shorter distances, it is cheaper to convert the coordinates to UTM and to calculate Euclidian distances with proc distance.
Ideally proc distance should support angular distances...
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.