Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- how to calculate distance between observations wit...

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

10-26-2016 01:06 AM

Hi,

I have a problem with calculating distance between observations, which is the variable model. Each observation contains 5 coordinate points, {fxi,fyi}, i=1 to 5; The dataset as following, is there any macro or for loop can do the calculation? Thanks!

obs | model | fx1 | fy1 | fx2 | fy2 | fx3 | fy3 | fx4 | fy4 | fx5 | fy5 |

1 | a | 129 | 16.5 | 129 | 22.25 | 129 | 59.75 | 129 | 90.75 | 129 | 112.25 |

2 | b | 156.9375 | 18 | 156.9375 | 26.5 | 156.9375 | 69 | 156.9375 | 109.5 | 156.9375 | 137 |

3 | c | 140 | 26 | 140 | 31 | 140 | 76.5 | 140 | 116 | 140 | 142.75 |

4 | d | 109.7188 | 24 | 109.7188 | 29.5 | 109.7188 | 53.75 | 109.7188 | 87 | 109.7188 | 100.25 |

5 | e | 145.5 | 7 | 145.5 | 11.5 | 145.5 | 46.5 | 145.5 | 90 | 145.5 | 110.5 |

6 | f | 94.28125 | 15 | 94.28125 | 19.5 | 94.28125 | 47.5 | 94.28125 | 91.25 | 94.28125 | 111.25 |

7 | g | 150 | 28.5 | 150 | 34 | 150 | 83 | 150 | 126 | 150 | 172 |

8 | a | 147 | 9.5 | 147 | 17 | 147 | 62 | 147 | 102.5 | 147 | 141 |

9 | b | 91.4375 | 12.75 | 91.4375 | 17.75 | 91.4375 | 48.5 | 91.4375 | 82.75 | 91.4375 | 89.25 |

10 | c | 106.3438 | 2.25 | 106.3438 | 4.75 | 106.3438 | 34.5 | 106.3438 | 57.5 | 106.3438 | 78 |

11 | d | 130.9375 | 9.75 | 130.9375 | 14.75 | 130.9375 | 51.5 | 130.9375 | 79.5 | 130.9375 | 99.5 |

12 | e | 143.5625 | 21.5 | 143.5625 | 26.5 | 143.5625 | 74 | 143.5625 | 119.5 | 143.5625 | 142 |

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ffguo

10-26-2016 01:31 AM

What would your distance calculation look like?

Please include sample output.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

10-26-2016 10:28 AM

Hi,

The function used to caluculate euclidean distance should be dis(distance between any two oberservation)= sum(squr((fyi1-fyi2)**2+(fxi1-fxi2)**2)),{fxi1,fyi1} from the first observation, the other from the second observation, i=1 to 5. Thanks

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ffguo

10-26-2016 07:11 AM

if it was Eduical distance,

Data step also can do that.

```
data have;
infile cards truncover expandtabs;
input obs model $ fx1 fy1 fx2 fy2 fx3 fy3 fx4 fy4 fx5 fy5;
cards;
1 a 129 16.5 129 22.25 129 59.75 129 90.75 129 112.25
2 b 156.9375 18 156.9375 26.5 156.9375 69 156.9375 109.5 156.9375 137
3 c 140 26 140 31 140 76.5 140 116 140 142.75
4 d 109.7188 24 109.7188 29.5 109.7188 53.75 109.7188 87 109.7188 100.25
5 e 145.5 7 145.5 11.5 145.5 46.5 145.5 90 145.5 110.5
6 f 94.28125 15 94.28125 19.5 94.28125 47.5 94.28125 91.25 94.28125 111.25
7 g 150 28.5 150 34 150 83 150 126 150 172
8 a 147 9.5 147 17 147 62 147 102.5 147 141
9 b 91.4375 12.75 91.4375 17.75 91.4375 48.5 91.4375 82.75 91.4375 89.25
10 c 106.3438 2.25 106.3438 4.75 106.3438 34.5 106.3438 57.5 106.3438 78
11 d 130.9375 9.75 130.9375 14.75 130.9375 51.5 130.9375 79.5 130.9375 99.5
12 e 143.5625 21.5 143.5625 26.5 143.5625 74 143.5625 119.5 143.5625 142
;
run;
proc iml;
var_x='fx1':'fx5';
var_y='fy1':'fy5';
use have;
read all var var_x into x;
read all var var_y into y;
close;
distance= sqrt((y-x)[,##]);
create distance var{distance};
append;
close;
data want;
merge have distance;
run;
```

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

10-26-2016 10:53 AM

Hi,

Thanks for help. This seems work but not get the right result, because the function used to caluculate euclidean distance is

dis(distance between any two oberservation)= sum(squr((fyi1-fyi2)**2+(fxi1-fxi2)**2)),{fxi1,fyi1} from the first observation, the other from the second observation, i=1 to 5. So should the sum equation be modified? Thanks.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ffguo

10-26-2016 12:54 PM

Look at proc distance with method=Euclid.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ffguo

10-26-2016 07:15 AM

```
data have;
infile cards truncover expandtabs;
input obs model $ fx1 fy1 fx2 fy2 fx3 fy3 fx4 fy4 fx5 fy5;
cards;
1 a 129 16.5 129 22.25 129 59.75 129 90.75 129 112.25
2 b 156.9375 18 156.9375 26.5 156.9375 69 156.9375 109.5 156.9375 137
3 c 140 26 140 31 140 76.5 140 116 140 142.75
4 d 109.7188 24 109.7188 29.5 109.7188 53.75 109.7188 87 109.7188 100.25
5 e 145.5 7 145.5 11.5 145.5 46.5 145.5 90 145.5 110.5
6 f 94.28125 15 94.28125 19.5 94.28125 47.5 94.28125 91.25 94.28125 111.25
7 g 150 28.5 150 34 150 83 150 126 150 172
8 a 147 9.5 147 17 147 62 147 102.5 147 141
9 b 91.4375 12.75 91.4375 17.75 91.4375 48.5 91.4375 82.75 91.4375 89.25
10 c 106.3438 2.25 106.3438 4.75 106.3438 34.5 106.3438 57.5 106.3438 78
11 d 130.9375 9.75 130.9375 14.75 130.9375 51.5 130.9375 79.5 130.9375 99.5
12 e 143.5625 21.5 143.5625 26.5 143.5625 74 143.5625 119.5 143.5625 142
;
run;
data want1;
set have;
array x{*} fx:;
array y{*} fy:;
sum=0;
do i=1 to dim(x);
sum+(y{i}-x{i})**2;
end;
distance=sqrt(sum);
drop sum i;
run;
```

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

10-26-2016 02:19 PM

@Ksharp I think the OP wants distance between different rows.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

10-26-2016 03:14 PM - edited 10-26-2016 03:15 PM

Correctly. But each row composed of 5 {fxi,fyi} points. If I used Proc distance, I won't get correct result. For proc distance with Euclid, the calculate function will be dist=sqrt[(fx11-fx12)**2+(fy11-fy12)**2+ (fx21-fx22)**2+ (fy21-fy22)**2+...+ (fx51-fx52)**2+ (fy51-fy52)**2]. However, what I want to get is: dist=sqrt[(fx11-fx12)**2+(fy11-fy12)**2]+ sqrt[(fx21-fx22)**2+(fy21-fy22)**2]+...+sqrt[(fx51-fx52)**2+ (fy51-fy52)**2].

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ffguo

10-26-2016 03:36 PM

Are you sure? Post the code you've used please.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

10-26-2016 03:52 PM

proc distance data=use method= euclid out=want;

var interval(fx1 fy1 fx2 fy2 fx3 fy3 fx4 fy4 fx5 fy5);

id model;

run;

the distance between fist obs and second obs is 79.1, however, the correct value should be 156.6.

var interval(fx1 fy1 fx2 fy2 fx3 fy3 fx4 fy4 fx5 fy5);

id model;

run;

the distance between fist obs and second obs is 79.1, however, the correct value should be 156.6.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ffguo

10-26-2016 06:52 PM

It might be worth switching it to a long format then and grouping to be able to use the formula.

Otherwise a SQL cross join as long as you're willing to type out the formula.

proc sql;

select a.*, b.fx1 as fx12, b.fy1 as fy12 ...,

sqrt[(fx11-fx12)**2+(fy11-fy12)**2]+ sqrt[(fx21-fx22)**2+(fy21-fy22)**2]+...+sqrt[(fx51

from have as a

cross join have as b;

quit;