Solved
Contributor
Posts: 23

Calculating a dynamic cluster average based on geo-coordinates

Hi all

Taking the data below, I have a challenge calculating the cluster average for variable x,. where the cluster average should be based on all observations no further than 250 meters away. In other words, we could person A living at

``36.0980903	-109.0386987``

having an income of x = 1000 \$ a week. My goal would be to compare his income to the average income y of citizens in his neighbourhood living within a radius of 250 meters.

This would mean a dynamic and possibly different cluster group for every observation.

How would this be done?

Thanks

``````data have1;
infile cards;
input Latitude     Longitude     x;
cards;
36.0980903	-109.0386987	1000
36.0984013	-109.0385617	1500
36.0989923	-109.0300557	2000
36.0976343	-109.0301467	750
36.0969993	-109.0333207	850
36.0993153	-109.0354437	900
36.0997053	-109.0357997	347
36.0976233	-109.0287737	650
36.1013493	-109.0335057	1100
36.1023103	-109.0330507	1200
36.1018293	-109.0341797	952
36.0861703	-109.0128447	652
36.1056053	-109.0212337	311
36.1065093	-109.0217417	1000
36.1065093	-109.0217417	1500
36.0983103	-109.0084197	2000
36.1019023	-109.0176657	750
36.0975443	-109.0091287	850
36.0928063	-109.0243767	900
36.0990893	-109.0535547	347
36.0980763	-109.0530487	650
36.0962203	-109.0111137	1100
36.1026333	-109.0195427	1200
36.1030713	-109.0249407	952
36.0900623	-109.0227547	652
36.1044493	-109.0112987	311
36.1093803	-109.0201137	1000
36.1085193	-109.0194697	1500
36.1093893	-109.0194947	2000
36.0871543	-109.0235497	750
36.0880483	-109.0244707	850
36.1016473	-109.0191277	900
36.0939833	-109.0551347	347
36.0936903	-109.0535577	650
36.0988083	-109.0595137	1100
36.0920383	-109.0488857	1200
36.0916593	-109.0474277	952
36.0921533	-109.0462577	652
36.1099053	-109.0181187	311
36.0988563	-109.0462907	5000;
run;		``````

Accepted Solutions
Solution
‎12-08-2016 03:52 AM
Super User
Posts: 10,778

Re: Calculating a dynamic cluster average based on geo-coordinates

```This could give you a start.

data have;
infile cards truncover expandtabs;
id+1;
input Latitude     Longitude     x;
cards;
36.0980903	-109.0386987	1000
36.0984013	-109.0385617	1500
36.0989923	-109.0300557	2000
36.0976343	-109.0301467	750
36.0969993	-109.0333207	850
36.0993153	-109.0354437	900
36.0997053	-109.0357997	347
36.0976233	-109.0287737	650
36.1013493	-109.0335057	1100
36.1023103	-109.0330507	1200
36.1018293	-109.0341797	952
36.0861703	-109.0128447	652
36.1056053	-109.0212337	311
36.1065093	-109.0217417	1000
36.1065093	-109.0217417	1500
36.0983103	-109.0084197	2000
36.1019023	-109.0176657	750
36.0975443	-109.0091287	850
36.0928063	-109.0243767	900
36.0990893	-109.0535547	347
36.0980763	-109.0530487	650
36.0962203	-109.0111137	1100
36.1026333	-109.0195427	1200
36.1030713	-109.0249407	952
36.0900623	-109.0227547	652
36.1044493	-109.0112987	311
36.1093803	-109.0201137	1000
36.1085193	-109.0194697	1500
36.1093893	-109.0194947	2000
36.0871543	-109.0235497	750
36.0880483	-109.0244707	850
36.1016473	-109.0191277	900
36.0939833	-109.0551347	347
36.0936903	-109.0535577	650
36.0988083	-109.0595137	1100
36.0920383	-109.0488857	1200
36.0916593	-109.0474277	952
36.0921533	-109.0462577	652
36.1099053	-109.0181187	311
36.0988563	-109.0462907	5000
;
run;
proc sql;
create table want as
select a.*,geodist(a.Latitude,a.Longitude,b.Latitude,b.Longitude)*1000 as distance
from have as a, have as b
where a.id ne b.id and calculated distance le 250
order by a.id;
quit;

```

All Replies
Solution
‎12-08-2016 03:52 AM
Super User
Posts: 10,778

Re: Calculating a dynamic cluster average based on geo-coordinates

```This could give you a start.

data have;
infile cards truncover expandtabs;
id+1;
input Latitude     Longitude     x;
cards;
36.0980903	-109.0386987	1000
36.0984013	-109.0385617	1500
36.0989923	-109.0300557	2000
36.0976343	-109.0301467	750
36.0969993	-109.0333207	850
36.0993153	-109.0354437	900
36.0997053	-109.0357997	347
36.0976233	-109.0287737	650
36.1013493	-109.0335057	1100
36.1023103	-109.0330507	1200
36.1018293	-109.0341797	952
36.0861703	-109.0128447	652
36.1056053	-109.0212337	311
36.1065093	-109.0217417	1000
36.1065093	-109.0217417	1500
36.0983103	-109.0084197	2000
36.1019023	-109.0176657	750
36.0975443	-109.0091287	850
36.0928063	-109.0243767	900
36.0990893	-109.0535547	347
36.0980763	-109.0530487	650
36.0962203	-109.0111137	1100
36.1026333	-109.0195427	1200
36.1030713	-109.0249407	952
36.0900623	-109.0227547	652
36.1044493	-109.0112987	311
36.1093803	-109.0201137	1000
36.1085193	-109.0194697	1500
36.1093893	-109.0194947	2000
36.0871543	-109.0235497	750
36.0880483	-109.0244707	850
36.1016473	-109.0191277	900
36.0939833	-109.0551347	347
36.0936903	-109.0535577	650
36.0988083	-109.0595137	1100
36.0920383	-109.0488857	1200
36.0916593	-109.0474277	952
36.0921533	-109.0462577	652
36.1099053	-109.0181187	311
36.0988563	-109.0462907	5000
;
run;
proc sql;
create table want as
select a.*,geodist(a.Latitude,a.Longitude,b.Latitude,b.Longitude)*1000 as distance
from have as a, have as b
where a.id ne b.id and calculated distance le 250
order by a.id;
quit;

```
Super User
Posts: 23,724

Re: Calculating a dynamic cluster average based on geo-coordinates

How big is your actual data since you have to make a lot of comparisons, efficiency may be very important.

You can can look into a SQL cross join and filter based on

WHERE geodist(....., 'M') < 250;

Contributor
Posts: 23

Re: Calculating a dynamic cluster average based on geo-coordinates

@ReezaThe data is about 100k obs.

But will I be able to make a dynamic / different cluster for every observation with SQL?

It will surely be the most efficiebt, but Im not sure how to generate the code for that?

Contributor
Posts: 23

Re: Calculating a dynamic cluster average based on geo-coordinates

Thanks for you contributions.

I believe it can be solved with a proper SQL join and the geodist function.Anyone knows if geodist works with different latitude/longitude variables, i.e. degrees vs. GPS coordinates / decimal degrees ?

/ MikA

Super User
Posts: 10,778

Re: Calculating a dynamic cluster average based on geo-coordinates

```Yes. I think so. Check the documentation and see how to use GEODIST().

```
☑ This topic is solved.