DATA Step, Macro, Functions and more

Array or Macro?

New Contributor
Posts: 4

Array or Macro?

I need some advice on how best to process this problem. I have a dataset of say 100,000 locations, with nonsequential IDs, a latitude and a longitude. How would I best go about finding, for each ID, the closest location to it. Don't worry about the actual calculation of distance, just wondering how best to process it. I don't have any formal SAS training and I 've never used arrays, so any tips would be appreciated. This is my current setup, it's certainly unwieldy, but I was rushed and didn't know how else to do it.

I created an observation number for sequencing called key and a macro variable obs to hold the total number of observations. Then I call a macro for each ID and its location. The macro uses retain statements to keep track of the closest distance so far, and loops through until the last observation, then outputs 1 row. So it's creating 100,000 1 row tables that I append together. Seems ridiculous. Thanks for your help.

%macro loop2;
%do i = 1 %to &obs;
data _null_;
set locations;
where key = &i;
call symput ('ID', ID);
call symput ('lat', latitude);
call symput ('long', longitude);
%loop(&ID, &lat, &long)

%macro loop(ID_base, lat_base, long_base);
data dist;
set locations;
keep ID_base closest_ID;
retain closest_dist closest_ID;

ID_base = "&ID_base";
lat_base = &lat_base / 1000000;
long_base = &long_base / -1000000;

if ID ne ID_base then do;
currest_dist = ....;

if current_dist < closest_dist or closest_dist = . then do;
closest_dist = current_dist;
closest_ID = ID;

if key = &obs then output;

proc append force data = dist base = dist_total;


Respected Advisor
Posts: 3,777

Re: Array or Macro?

You might find this paper useful [pre][/pre]I think the problem this paper discusses may be exactly what you are trying to do.

There may be a SAS procedure that does, perhaps someone with more experience will know.
Super User
Posts: 9,681

Re: Array or Macro?

In the documentation of proc sql .I found this code ,hope it will help you.

proc sql outobs=10;
title ’Neighboring Cities’;
select a.City format=$10., a.State,
a.Latitude ’Lat’, a.Longitude ’Long’,
b.City format=$10., b.State,
b.Latitude ’Lat’, b.Longitude ’Long’,
sqrt(((b.latitude-a.latitude)**2) +
((b.longitude-a.longitude)**2)) as dist format=6.1
from sql.uscitycoords a, sql.uscitycoords b
where ne and
calculated dist =
(select min(sqrt(((d.latitude-c.latitude)**2) +
from sql.uscitycoords c, sql.uscitycoords d
where = and
c.state = a.state and ne
order by;

And the best way to get answer is to post some your origin data and output you need.

Ksharp Message was edited by: Ksharp
Ask a Question
Discussion stats
  • 2 replies
  • 3 in conversation