BookmarkSubscribeRSS Feed
buffheman
Calcite | Level 5
I need some advice on how best to process this problem. I have a dataset of say 100,000 locations, with nonsequential IDs, a latitude and a longitude. How would I best go about finding, for each ID, the closest location to it. Don't worry about the actual calculation of distance, just wondering how best to process it. I don't have any formal SAS training and I 've never used arrays, so any tips would be appreciated. This is my current setup, it's certainly unwieldy, but I was rushed and didn't know how else to do it.

I created an observation number for sequencing called key and a macro variable obs to hold the total number of observations. Then I call a macro for each ID and its location. The macro uses retain statements to keep track of the closest distance so far, and loops through until the last observation, then outputs 1 row. So it's creating 100,000 1 row tables that I append together. Seems ridiculous. Thanks for your help.


%macro loop2;
%do i = 1 %to &obs;
data _null_;
set locations;
where key = &i;
call symput ('ID', ID);
call symput ('lat', latitude);
call symput ('long', longitude);
run;
%loop(&ID, &lat, &long)
%end;
%mend;


%macro loop(ID_base, lat_base, long_base);
data dist;
set locations;
keep ID_base closest_ID;
retain closest_dist closest_ID;

ID_base = "&ID_base";
lat_base = &lat_base / 1000000;
long_base = &long_base / -1000000;

if ID ne ID_base then do;
currest_dist = ....;
end;

if current_dist < closest_dist or closest_dist = . then do;
closest_dist = current_dist;
closest_ID = ID;
end;


if key = &obs then output;
run;

proc append force data = dist base = dist_total;
run;

%mend;


%loop2;
2 REPLIES 2
data_null__
Jade | Level 19
You might find this paper useful [pre]http://www.nesug.org/Proceedings/nesug03/at/at008.pdf[/pre]I think the problem this paper discusses may be exactly what you are trying to do.

There may be a SAS procedure that does, perhaps someone with more experience will know.
Ksharp
Super User
Hi.
In the documentation of proc sql .I found this code ,hope it will help you.



[pre]
proc sql outobs=10;
title ’Neighboring Cities’;
select a.City format=$10., a.State,
a.Latitude ’Lat’, a.Longitude ’Long’,
b.City format=$10., b.State,
b.Latitude ’Lat’, b.Longitude ’Long’,
sqrt(((b.latitude-a.latitude)**2) +
((b.longitude-a.longitude)**2)) as dist format=6.1
from sql.uscitycoords a, sql.uscitycoords b
where a.city ne b.city and
calculated dist =
(select min(sqrt(((d.latitude-c.latitude)**2) +
((d.longitude-c.longitude)**2)))
from sql.uscitycoords c, sql.uscitycoords d
where c.city = a.city and
c.state = a.state and
d.city ne c.city)
order by a.city;
[/pre]


And the best way to get answer is to post some your origin data and output you need.


Ksharp Message was edited by: Ksharp

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 706 views
  • 0 likes
  • 3 in conversation