I have a constructed a reflexive join but I want to limit some of the process to reduce overhead and get back only the information I need/want.
data test;
infile datalines delimiter = ',';
input serial lat long date;
format date date9.;
datalines;
1, 58.6732, 118.2569, 20225
1, 58.6733, 118.2568, 20226
1, 58.9865, 118.3156, 20227
1, 75.259, -52.0023, 20228
1, 75.2689, -52.125, 20229
2, 10.123, 15.256, 20111
2, 10.2365, 14.2698, 20112
2, 11.3569, 15.3489, 20113
3, 23.2569, 0.2659, 20256
3, 22.6789, 0.2788, 20257
3, 12.3569, 35.126, 20258
3, 12.3789, 34.679, 20259
;
run;
data test;
set test;
i = _n_;
run;
%let rad = constant('pi') / 180; /*degrees to radians*/
%let r_km = 6371; /*radius of the earth in km (3959 miles)*/
proc fcmp outlib = work.funcs.haversine;
function haversine(lat1, long1, lat2, long2); /*function name*/
/*function definition*/
*dist = &r_km * 2 * arsin(min(1, sqrt(sin((lat2 - lat1) * &rad / 2)**2 +
cos(lat1 * &rad) * cos(lat2 * &rad) * sin((long2 - long1) * &rad / 2)**2)));
/*asin should be equiv to atan2*/
difflat = lat2 - lat1;
difflong = long2 - long1;
arg = sin(difflat / 2 * &rad)**2 + cos(lat1 * &rad) * cos(lat2 * &rad) * sin(difflong / 2 * &rad)**2;
dist = 2 * &r_km * atan2(sqrt(arg), sqrt(1 - arg));
return(dist); /*what to return*/
endsub;
run;
options cmplib = work.funcs; /*needed to access the function*/
proc sql _method;
create table results (drop = sn date1 j)
as select
a.*,
b.serial as sn label = 'sn',
b.lat as lat1 label = 'lat1',
b.long as long1 label = 'long1',
b.date as date1 label = 'date1',
b.i as j,
haversine(a.lat, a.long, b.lat, b.long) as hdist
from test as a
inner join test as b
on a.serial = b.serial and a.i > b.i;
quit;
I added a.i > b.i to remove the symmetric results.
How could I trim the cases down to just, for serial 1, dist of 1 with 2, 2, with 3, .... and not run 1 with 3, 1 with 4, etc.
I can't tell quite what exact values you are subsetting but the generall approach would be to replace
inner join test as b
with something like
inner join (select your variables go here from test where serial=1 and dist=1) as b
or other conditions on the where. If they are complex enough it may be worth subsetting prior to the join to keep that code cleaner.
on a.serial = b.serial and a.i = a.serial and b.i = a.i+1
Your solution only returns one item in the table which is not what I was asking.
What is "the information I need/want" ? Please post the expected result corresponding to your example data.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.