My data looks like this:-
Latitude Longitude
26.89155 75.82533
26.89156 75.82563
26.896437 75.827229
26.89111 75.819003
26.899911 75.813517
26.89716 75.813694
26.897943 75.813052
26.89255 75.813286
I have to make clusters of areas with a radius of 250metres.
So what would your output data look like?
Check GEODISTANCE() to get the distance between two points and use SQL to get what you want.
@Ksharp wrote:
Check GEODISTANCE() to get the distance between two points and use SQL to get what you want.
Maybe GEODIST ?
Yeah. Too many function name in my mind.
data have;
input Latitude Longitude;
id+1;
cards;
26.89155 75.82533
26.89156 75.82563
26.896437 75.827229
26.89111 75.819003
26.899911 75.813517
26.89716 75.813694
26.897943 75.813052
26.89255 75.813286
;
run;
proc sql;
select a.*,b.id as _id,geodist(a.Latitude,a.Longitude,b.Latitude,b.Longitude,'M') as dis
from have as a,have as b
where a.id ne b.id
order by id,_id;
quit;
Firstly Thanks a lot for your solution,
But the problem i m getting is that this solution makes the same input(lat,long) appear a number of times.
What I want is that points that form a cluster be grouped as cluster_1 and so on.
I hope you understand my problem.
OK. Here is .
data have;
input Latitude Longitude;
id+1;
cards;
26.89155 75.82533
26.89156 75.82563
26.896437 75.827229
26.89111 75.819003
26.899911 75.813517
26.89716 75.813694
26.897943 75.813052
26.89255 75.813286
;
run;
proc sql;
create table temp as
select a.*,b.id as _id,
geodist(a.Latitude,a.Longitude,b.Latitude,b.Longitude,'k')*1000 as dis
from have as a,have as b
where a.id ne b.id and calculated dis < 250
order by id,_id;
quit;
data have;
set temp;
call sortn(id,_id);
keep id _id;
rename id=from _id=to;
run;
/********Start to cluster***********/
data full;
set have end=last;
if _n_ eq 1 then do;
declare hash h();
h.definekey('node');
h.definedata('node');
h.definedone();
end;
output;
node=from; h.replace();
from=to; to=node;
output;
node=from; h.replace();
if last then h.output(dataset:'node');
drop node;
run;
data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 20);
_ha.definekey('key');
_ha.definedone();
if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
from_to.definekey('from');
from_to.definedata('to');
from_to.definedone();
if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
no.definekey('node');
no.definedata('node');
no.definedone();
do while(hi_no.next()=0);
household+1; output;
count=1;
key=node;_ha.add();
last=node;ha.add();
rc=hi.first();
do while(rc=0);
from=last;rx=from_to.find();
do while(rx=0);
key=to;ry=_ha.check();
if ry ne 0 then do;
node=to;output;rr=no.remove(key:node);
key=to;_ha.add();
count+1;
last=to;ha.add();
end;
rx=from_to.find_next();
end;
rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.