BookmarkSubscribeRSS Feed
buffheman
Calcite | Level 5
I need some advice on how best to process this problem. I have a dataset of say 100,000 locations, with nonsequential IDs, a latitude and a longitude. How would I best go about finding, for each ID, the closest location to it. Don't worry about the actual calculation of distance, just wondering how best to process it. I don't have any formal SAS training and I 've never used arrays, so any tips would be appreciated. This is my current setup, it's certainly unwieldy, but I was rushed and didn't know how else to do it.

I created an observation number for sequencing called key and a macro variable obs to hold the total number of observations. Then I call a macro for each ID and its location. The macro uses retain statements to keep track of the closest distance so far, and loops through until the last observation, then outputs 1 row. So it's creating 100,000 1 row tables that I append together. Seems ridiculous. Thanks for your help.


%macro loop2;
%do i = 1 %to &obs;
data _null_;
set locations;
where key = &i;
call symput ('ID', ID);
call symput ('lat', latitude);
call symput ('long', longitude);
run;
%loop(&ID, &lat, &long)
%end;
%mend;


%macro loop(ID_base, lat_base, long_base);
data dist;
set locations;
keep ID_base closest_ID;
retain closest_dist closest_ID;

ID_base = "&ID_base";
lat_base = &lat_base / 1000000;
long_base = &long_base / -1000000;

if ID ne ID_base then do;
currest_dist = ....;
end;

if current_dist < closest_dist or closest_dist = . then do;
closest_dist = current_dist;
closest_ID = ID;
end;


if key = &obs then output;
run;

proc append force data = dist base = dist_total;
run;

%mend;


%loop2;
2 REPLIES 2
data_null__
Jade | Level 19
You might find this paper useful [pre]http://www.nesug.org/Proceedings/nesug03/at/at008.pdf[/pre]I think the problem this paper discusses may be exactly what you are trying to do.

There may be a SAS procedure that does, perhaps someone with more experience will know.
Ksharp
Super User
Hi.
In the documentation of proc sql .I found this code ,hope it will help you.



[pre]
proc sql outobs=10;
title ’Neighboring Cities’;
select a.City format=$10., a.State,
a.Latitude ’Lat’, a.Longitude ’Long’,
b.City format=$10., b.State,
b.Latitude ’Lat’, b.Longitude ’Long’,
sqrt(((b.latitude-a.latitude)**2) +
((b.longitude-a.longitude)**2)) as dist format=6.1
from sql.uscitycoords a, sql.uscitycoords b
where a.city ne b.city and
calculated dist =
(select min(sqrt(((d.latitude-c.latitude)**2) +
((d.longitude-c.longitude)**2)))
from sql.uscitycoords c, sql.uscitycoords d
where c.city = a.city and
c.state = a.state and
d.city ne c.city)
order by a.city;
[/pre]


And the best way to get answer is to post some your origin data and output you need.


Ksharp Message was edited by: Ksharp

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1176 views
  • 0 likes
  • 3 in conversation