DATA Step, Macro, Functions and more

Array or Macro?

Reply
New Contributor
Posts: 4

Array or Macro?

I need some advice on how best to process this problem. I have a dataset of say 100,000 locations, with nonsequential IDs, a latitude and a longitude. How would I best go about finding, for each ID, the closest location to it. Don't worry about the actual calculation of distance, just wondering how best to process it. I don't have any formal SAS training and I 've never used arrays, so any tips would be appreciated. This is my current setup, it's certainly unwieldy, but I was rushed and didn't know how else to do it.

I created an observation number for sequencing called key and a macro variable obs to hold the total number of observations. Then I call a macro for each ID and its location. The macro uses retain statements to keep track of the closest distance so far, and loops through until the last observation, then outputs 1 row. So it's creating 100,000 1 row tables that I append together. Seems ridiculous. Thanks for your help.


%macro loop2;
%do i = 1 %to &obs;
data _null_;
set locations;
where key = &i;
call symput ('ID', ID);
call symput ('lat', latitude);
call symput ('long', longitude);
run;
%loop(&ID, &lat, &long)
%end;
%mend;


%macro loop(ID_base, lat_base, long_base);
data dist;
set locations;
keep ID_base closest_ID;
retain closest_dist closest_ID;

ID_base = "&ID_base";
lat_base = &lat_base / 1000000;
long_base = &long_base / -1000000;

if ID ne ID_base then do;
currest_dist = ....;
end;

if current_dist < closest_dist or closest_dist = . then do;
closest_dist = current_dist;
closest_ID = ID;
end;


if key = &obs then output;
run;

proc append force data = dist base = dist_total;
run;

%mend;


%loop2;
Respected Advisor
Posts: 3,799

Re: Array or Macro?

Posted in reply to buffheman
You might find this paper useful [pre]http://www.nesug.org/Proceedings/nesug03/at/at008.pdf[/pre]I think the problem this paper discusses may be exactly what you are trying to do.

There may be a SAS procedure that does, perhaps someone with more experience will know.
Super User
Posts: 10,044

Re: Array or Macro?

Posted in reply to buffheman
Hi.
In the documentation of proc sql .I found this code ,hope it will help you.



[pre]
proc sql outobs=10;
title ’Neighboring Cities’;
select a.City format=$10., a.State,
a.Latitude ’Lat’, a.Longitude ’Long’,
b.City format=$10., b.State,
b.Latitude ’Lat’, b.Longitude ’Long’,
sqrt(((b.latitude-a.latitude)**2) +
((b.longitude-a.longitude)**2)) as dist format=6.1
from sql.uscitycoords a, sql.uscitycoords b
where a.city ne b.city and
calculated dist =
(select min(sqrt(((d.latitude-c.latitude)**2) +
((d.longitude-c.longitude)**2)))
from sql.uscitycoords c, sql.uscitycoords d
where c.city = a.city and
c.state = a.state and
d.city ne c.city)
order by a.city;
[/pre]


And the best way to get answer is to post some your origin data and output you need.


Ksharp Message was edited by: Ksharp
Ask a Question
Discussion stats
  • 2 replies
  • 130 views
  • 0 likes
  • 3 in conversation