DATA Step, Macro, Functions and more

how to calculate distance between observations with coordinate points

Reply
Occasional Contributor
Posts: 5

how to calculate distance between observations with coordinate points

Hi,

I have a problem with calculating distance between observations, which is the variable model. Each observation contains 5 coordinate points, {fxi,fyi}, i=1 to 5; The dataset as following, is there any macro or for loop can do the calculation? Thanks!

 

obsmodelfx1fy1fx2fy2fx3fy3fx4fy4fx5fy5
1a12916.512922.2512959.7512990.75129112.25
2b156.937518156.937526.5156.937569156.9375109.5156.9375137
3c140261403114076.5140116140142.75
4d109.718824109.718829.5109.718853.75109.718887109.7188100.25
5e145.57145.511.5145.546.5145.590145.5110.5
6f94.281251594.2812519.594.2812547.594.2812591.2594.28125111.25
7g15028.51503415083150126150172
8a1479.51471714762147102.5147141
9b91.437512.7591.437517.7591.437548.591.437582.7591.437589.25
10c106.34382.25106.34384.75106.343834.5106.343857.5106.343878
11d130.93759.75130.937514.75130.937551.5130.937579.5130.937599.5
12e143.562521.5143.562526.5143.562574143.5625119.5143.5625142
Super User
Posts: 17,963

Re: how to calculate distance between observations with coordinate points

What would your distance calculation look like? 

Please include sample output. 

Occasional Contributor
Posts: 5

Re: how to calculate distance between observations with coordinate points

Hi,

 

The function used to caluculate euclidean distance should be dis(distance between any two oberservation)= sum(squr((fyi1-fyi2)**2+(fxi1-fxi2)**2)),{fxi1,fyi1} from the first observation, the other from the second observation, i=1 to 5. Thanks

Super User
Posts: 9,691

Re: how to calculate distance between observations with coordinate points

if it was Eduical distance, 

Data step also can do that.

 

data have;
infile cards truncover expandtabs;
input obs	model $	fx1	fy1	fx2	fy2	fx3	fy3	fx4	fy4	fx5	fy5;
cards;
1   a   129 16.5    129 22.25   129 59.75   129 90.75 129   112.25
2   b   156.9375    18  156.9375    26.5    156.9375    69  156.9375    109.5   156.9375    137
3   c   140 26  140 31  140 76.5    140 116 140 142.75
4   d   109.7188    24  109.7188    29.5    109.7188    53.75   109.7188    87  109.7188    100.25
5   e   145.5   7   145.5   11.5    145.5   46.5    145.5   90  145.5   110.5
6   f   94.28125    15  94.28125    19.5    94.28125    47.5    94.28125    91.25   94.28125    111.25
7   g   150 28.5    150 34  150 83  150 126 150 172
8   a   147 9.5 147 17  147 62  147 102.5   147 141
9   b   91.4375 12.75   91.4375 17.75   91.4375 48.5    91.4375 82.75   91.4375 89.25
10  c   106.3438    2.25    106.3438    4.75    106.3438    34.5    106.3438    57.5    106.3438    78
11  d   130.9375    9.75    130.9375    14.75   130.9375    51.5    130.9375    79.5    130.9375    99.5
12  e   143.5625    21.5    143.5625    26.5    143.5625    74  143.5625    119.5   143.5625    142
;
run;
proc iml;
var_x='fx1':'fx5';
var_y='fy1':'fy5';
use have;
read all var var_x into x;
read all var var_y into y;
close;

distance= sqrt((y-x)[,##]);

create distance var{distance};
append;
close;

data want;
 merge have distance;
run;
Occasional Contributor
Posts: 5

Re: how to calculate distance between observations with coordinate points

Hi,

 

Thanks for help. This seems work but not get the right result, because the function used to caluculate euclidean distance is 

dis(distance between any two oberservation)= sum(squr((fyi1-fyi2)**2+(fxi1-fxi2)**2)),{fxi1,fyi1} from the first observation, the other from the second observation, i=1 to 5. So should the sum equation be modified? Thanks.

Super User
Posts: 17,963

Re: how to calculate distance between observations with coordinate points

Look at proc distance with method=Euclid. 

Super User
Posts: 9,691

Re: how to calculate distance between observations with coordinate points

data have;
infile cards truncover expandtabs;
input obs	model $	fx1	fy1	fx2	fy2	fx3	fy3	fx4	fy4	fx5	fy5;
cards;
1   a   129 16.5    129 22.25   129 59.75   129 90.75 129   112.25
2   b   156.9375    18  156.9375    26.5    156.9375    69  156.9375    109.5   156.9375    137
3   c   140 26  140 31  140 76.5    140 116 140 142.75
4   d   109.7188    24  109.7188    29.5    109.7188    53.75   109.7188    87  109.7188    100.25
5   e   145.5   7   145.5   11.5    145.5   46.5    145.5   90  145.5   110.5
6   f   94.28125    15  94.28125    19.5    94.28125    47.5    94.28125    91.25   94.28125    111.25
7   g   150 28.5    150 34  150 83  150 126 150 172
8   a   147 9.5 147 17  147 62  147 102.5   147 141
9   b   91.4375 12.75   91.4375 17.75   91.4375 48.5    91.4375 82.75   91.4375 89.25
10  c   106.3438    2.25    106.3438    4.75    106.3438    34.5    106.3438    57.5    106.3438    78
11  d   130.9375    9.75    130.9375    14.75   130.9375    51.5    130.9375    79.5    130.9375    99.5
12  e   143.5625    21.5    143.5625    26.5    143.5625    74  143.5625    119.5   143.5625    142
;
run;
data want1;
 set have;
 array x{*} fx:;
 array y{*} fy:;
 sum=0;
 do i=1 to dim(x);
  sum+(y{i}-x{i})**2;
 end;
  distance=sqrt(sum);
 drop sum i;
run;
Super User
Posts: 17,963

Re: how to calculate distance between observations with coordinate points

@Ksharp I think the OP wants distance between different rows. 

Occasional Contributor
Posts: 5

Re: how to calculate distance between observations with coordinate points

[ Edited ]

Correctly. But each row composed of 5 {fxi,fyi} points. If I used Proc distance, I won't get correct result. For proc distance with Euclid, the calculate function will be dist=sqrt[(fx11-fx12)**2+(fy11-fy12)**2+ (fx21-fx22)**2+ (fy21-fy22)**2+...+ (fx51-fx52)**2+ (fy51-fy52)**2]. However, what I want to get is: dist=sqrt[(fx11-fx12)**2+(fy11-fy12)**2]+ sqrt[(fx21-fx22)**2+(fy21-fy22)**2]+...+sqrt[(fx51-fx52)**2+ (fy51-fy52)**2].

Super User
Posts: 17,963

Re: how to calculate distance between observations with coordinate points

Are you sure? Post the code you've used please.

Occasional Contributor
Posts: 5

Re: how to calculate distance between observations with coordinate points

proc distance data=use method= euclid out=want;
var interval(fx1 fy1 fx2 fy2 fx3 fy3 fx4 fy4 fx5 fy5);
id model;
run;
the distance between fist obs and second obs is 79.1, however, the correct value should be 156.6.
Super User
Posts: 17,963

Re: how to calculate distance between observations with coordinate points

It might be worth switching it to a long format then and grouping to be able to use the formula.

 

Otherwise a SQL cross join as long as you're willing to type  out the formula.

 

proc sql;

select a.*, b.fx1 as fx12, b.fy1 as fy12 ...,

sqrt[(fx11-fx12)**2+(fy11-fy12)**2]+ sqrt[(fx21-fx22)**2+(fy21-fy22)**2]+...+sqrt[(fx51-fx52)**2+ (fy51-fy52)**2] as dist

from have as a

cross join have as b;

quit;

Ask a Question
Discussion stats
  • 11 replies
  • 412 views
  • 0 likes
  • 3 in conversation