BookmarkSubscribeRSS Feed
buffheman
Calcite | Level 5
I need some advice on how best to process this problem. I have a dataset of say 100,000 locations, with nonsequential IDs, a latitude and a longitude. How would I best go about finding, for each ID, the closest location to it. Don't worry about the actual calculation of distance, just wondering how best to process it. I don't have any formal SAS training and I 've never used arrays, so any tips would be appreciated. This is my current setup, it's certainly unwieldy, but I was rushed and didn't know how else to do it.

I created an observation number for sequencing called key and a macro variable obs to hold the total number of observations. Then I call a macro for each ID and its location. The macro uses retain statements to keep track of the closest distance so far, and loops through until the last observation, then outputs 1 row. So it's creating 100,000 1 row tables that I append together. Seems ridiculous. Thanks for your help.


%macro loop2;
%do i = 1 %to &obs;
data _null_;
set locations;
where key = &i;
call symput ('ID', ID);
call symput ('lat', latitude);
call symput ('long', longitude);
run;
%loop(&ID, &lat, &long)
%end;
%mend;


%macro loop(ID_base, lat_base, long_base);
data dist;
set locations;
keep ID_base closest_ID;
retain closest_dist closest_ID;

ID_base = "&ID_base";
lat_base = &lat_base / 1000000;
long_base = &long_base / -1000000;

if ID ne ID_base then do;
currest_dist = ....;
end;

if current_dist < closest_dist or closest_dist = . then do;
closest_dist = current_dist;
closest_ID = ID;
end;


if key = &obs then output;
run;

proc append force data = dist base = dist_total;
run;

%mend;


%loop2;
2 REPLIES 2
data_null__
Jade | Level 19
You might find this paper useful [pre]http://www.nesug.org/Proceedings/nesug03/at/at008.pdf[/pre]I think the problem this paper discusses may be exactly what you are trying to do.

There may be a SAS procedure that does, perhaps someone with more experience will know.
Ksharp
Super User
Hi.
In the documentation of proc sql .I found this code ,hope it will help you.



[pre]
proc sql outobs=10;
title ’Neighboring Cities’;
select a.City format=$10., a.State,
a.Latitude ’Lat’, a.Longitude ’Long’,
b.City format=$10., b.State,
b.Latitude ’Lat’, b.Longitude ’Long’,
sqrt(((b.latitude-a.latitude)**2) +
((b.longitude-a.longitude)**2)) as dist format=6.1
from sql.uscitycoords a, sql.uscitycoords b
where a.city ne b.city and
calculated dist =
(select min(sqrt(((d.latitude-c.latitude)**2) +
((d.longitude-c.longitude)**2)))
from sql.uscitycoords c, sql.uscitycoords d
where c.city = a.city and
c.state = a.state and
d.city ne c.city)
order by a.city;
[/pre]


And the best way to get answer is to post some your origin data and output you need.


Ksharp Message was edited by: Ksharp

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 684 views
  • 0 likes
  • 3 in conversation