I have a constructed a reflexive join but I want to limit some of the process to reduce overhead and get back only the information I need/want.
data test;
infile datalines delimiter = ',';
input serial lat long date;
format date date9.;
datalines;
1, 58.6732, 118.2569, 20225
1, 58.6733, 118.2568, 20226
1, 58.9865, 118.3156, 20227
1, 75.259, -52.0023, 20228
1, 75.2689, -52.125, 20229
2, 10.123, 15.256, 20111
2, 10.2365, 14.2698, 20112
2, 11.3569, 15.3489, 20113
3, 23.2569, 0.2659, 20256
3, 22.6789, 0.2788, 20257
3, 12.3569, 35.126, 20258
3, 12.3789, 34.679, 20259
;
run;
data test;
set test;
i = _n_;
run;
%let rad = constant('pi') / 180; /*degrees to radians*/
%let r_km = 6371; /*radius of the earth in km (3959 miles)*/
proc fcmp outlib = work.funcs.haversine;
function haversine(lat1, long1, lat2, long2); /*function name*/
/*function definition*/
*dist = &r_km * 2 * arsin(min(1, sqrt(sin((lat2 - lat1) * &rad / 2)**2 +
cos(lat1 * &rad) * cos(lat2 * &rad) * sin((long2 - long1) * &rad / 2)**2)));
/*asin should be equiv to atan2*/
difflat = lat2 - lat1;
difflong = long2 - long1;
arg = sin(difflat / 2 * &rad)**2 + cos(lat1 * &rad) * cos(lat2 * &rad) * sin(difflong / 2 * &rad)**2;
dist = 2 * &r_km * atan2(sqrt(arg), sqrt(1 - arg));
return(dist); /*what to return*/
endsub;
run;
options cmplib = work.funcs; /*needed to access the function*/
proc sql _method;
create table results (drop = sn date1 j)
as select
a.*,
b.serial as sn label = 'sn',
b.lat as lat1 label = 'lat1',
b.long as long1 label = 'long1',
b.date as date1 label = 'date1',
b.i as j,
haversine(a.lat, a.long, b.lat, b.long) as hdist
from test as a
inner join test as b
on a.serial = b.serial and a.i > b.i;
quit;
I added a.i > b.i to remove the symmetric results.
How could I trim the cases down to just, for serial 1, dist of 1 with 2, 2, with 3, .... and not run 1 with 3, 1 with 4, etc.
I can't tell quite what exact values you are subsetting but the generall approach would be to replace
inner join test as b
with something like
inner join (select your variables go here from test where serial=1 and dist=1) as b
or other conditions on the where. If they are complex enough it may be worth subsetting prior to the join to keep that code cleaner.
on a.serial = b.serial and a.i = a.serial and b.i = a.i+1
Your solution only returns one item in the table which is not what I was asking.
What is "the information I need/want" ? Please post the expected result corresponding to your example data.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.