How to calculate geodesic distance in SAS

Question

I am calculating pairwise geodesic distances between street address using latitude and longitude. I would like to use "proc distance" to automatically do this for me. Does anyone know if geodesic distance can be calculated using proc distance?

data have;
format latitude 8.5;
format longitude 8.5;
   input Application_id $10. Latitude Longitude;
   datalines;
1000000999 -75.0000 50.0000
1000000888 -90.0000 30.0000
1000000777 -75.0000 50.0500
1000000666 -75.0000 50.0400
1000000555 -80.0000 35.0000
1000000444 -75.0300 50.0000
;

Here's the output I want:

dist_1000000999	dist_1000000888	dist_1000000777	dist_1000000666	dist_1000000555	dist_1000000444	id_variable
0						dist_1000000999
11391.48746	0					dist_1000000888
0.898135642	11391.48746	0				dist_1000000777
0.718508521	11391.48746	0.179627134	0			dist_1000000666
410.9346807	11738.41243	411.3264781	411.2480465	0		dist_1000000555
2.081141049	11393.5686	2.266323028	2.201452738	409.062496	0	dist_1000000444

Answer

Here's an approach that doesn't use PROC DISTANCE, but uses the GEODIST function instead. It doesn't produce a matrix in the form that you asked for, but you can easily manipulate the output to get to that.

data have;
input id $10. latitude longitude;
datalines;
1000000999 -75.0000 50.0000
1000000888 -90.0000 30.0000
1000000777 -75.0000 50.0500
1000000666 -75.0000 50.0400
1000000555 -80.0000 35.0000
1000000444 -75.0300 50.0000
;
 
data want;
format latitude longitude la2 lo2 10.5 distance 15.5;
set have nobs=howmany;
do j=_n_+1 to howmany;
   set have (rename=(id=id2 latitude=la2 longitude=lo2)) point=j; 
   distance = geodist(latitude,longitude,la2,lo2,'m');
   output;
end;
run;

Here are the output records of data set WANT ... ID ID2 and distance in miles:

1 1000000999 1000000888 11391.48746
2 1000000999 1000000777 0.89814
3 1000000999 1000000666 0.71851
4 1000000999 1000000555 410.93468
5 1000000999 1000000444 2.08114
6 1000000888 1000000777 11391.48746
7 1000000888 1000000666 11391.48746
8 1000000888 1000000555 11738.41243
9 1000000888 1000000444 11393.56860
10 1000000777 1000000666 0.17963
11 1000000777 1000000555 411.32648
12 1000000777 1000000444 2.26632
13 1000000666 1000000555 411.24805
14 1000000666 1000000444 2.20145
15 1000000555 1000000444 409.06250

If you really want a matrix with those zeroes in the diagonal, first change the loop in the data step as follows ...

from ... do j=_n_+1 to howmany;

to ... do j=_n_ to howmany;

then use PROC TRANSPOSE to flip the data into matrix form:

proc transpose data=want out=want (drop=_name_) prefix=d;
var distance;
id id2;
by id notsorted;
run;

and you get ...

Obs id d1000000999 d1000000888 d1000000777 d1000000666 d1000000555 d1000000444

1     1000000999         0.00000     11391.48746         0.89814         0.71851       410.93468         2.08114
2     1000000888          .              0.00000     11391.48746     11391.48746     11738.41243     11393.56860
3     1000000777          .               .              0.00000         0.17963       411.32648         2.26632
4     1000000666          .               .               .              0.00000       411.24805         2.20145
5     1000000555          .               .               .               .              0.00000       409.06250
6     1000000444          .               .               .               .               .              0.00000

PGStats · ‎01-04-2016

For large distances (>1000km) this is probably the best approach. But for shorter distances, it is cheaper to convert the coordinates to UTM and to calculate Euclidian distances with proc distance.

Ideally proc distance should support angular distances...