I have a constructed a reflexive join but I want to limit some of the process to reduce overhead and get back only the information I need/want.
data test;
infile datalines delimiter = ',';
input serial lat long date;
format date date9.;
datalines;
1, 58.6732, 118.2569, 20225
1, 58.6733, 118.2568, 20226
1, 58.9865, 118.3156, 20227
1, 75.259, -52.0023, 20228
1, 75.2689, -52.125, 20229
2, 10.123, 15.256, 20111
2, 10.2365, 14.2698, 20112
2, 11.3569, 15.3489, 20113
3, 23.2569, 0.2659, 20256
3, 22.6789, 0.2788, 20257
3, 12.3569, 35.126, 20258
3, 12.3789, 34.679, 20259
;
run;
data test;
set test;
i = _n_;
run;
%let rad = constant('pi') / 180; /*degrees to radians*/
%let r_km = 6371; /*radius of the earth in km (3959 miles)*/
proc fcmp outlib = work.funcs.haversine;
function haversine(lat1, long1, lat2, long2); /*function name*/
/*function definition*/
*dist = &r_km * 2 * arsin(min(1, sqrt(sin((lat2 - lat1) * &rad / 2)**2 +
cos(lat1 * &rad) * cos(lat2 * &rad) * sin((long2 - long1) * &rad / 2)**2)));
/*asin should be equiv to atan2*/
difflat = lat2 - lat1;
difflong = long2 - long1;
arg = sin(difflat / 2 * &rad)**2 + cos(lat1 * &rad) * cos(lat2 * &rad) * sin(difflong / 2 * &rad)**2;
dist = 2 * &r_km * atan2(sqrt(arg), sqrt(1 - arg));
return(dist); /*what to return*/
endsub;
run;
options cmplib = work.funcs; /*needed to access the function*/
proc sql _method;
create table results (drop = sn date1 j)
as select
a.*,
b.serial as sn label = 'sn',
b.lat as lat1 label = 'lat1',
b.long as long1 label = 'long1',
b.date as date1 label = 'date1',
b.i as j,
haversine(a.lat, a.long, b.lat, b.long) as hdist
from test as a
inner join test as b
on a.serial = b.serial and a.i > b.i;
quit;
I added a.i > b.i to remove the symmetric results.
How could I trim the cases down to just, for serial 1, dist of 1 with 2, 2, with 3, .... and not run 1 with 3, 1 with 4, etc.
I can't tell quite what exact values you are subsetting but the generall approach would be to replace
inner join test as b
with something like
inner join (select your variables go here from test where serial=1 and dist=1) as b
or other conditions on the where. If they are complex enough it may be worth subsetting prior to the join to keep that code cleaner.
on a.serial = b.serial and a.i = a.serial and b.i = a.i+1
Your solution only returns one item in the table which is not what I was asking.
What is "the information I need/want" ? Please post the expected result corresponding to your example data.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.