BookmarkSubscribeRSS Feed
ffguo
Calcite | Level 5

Hi,

I have a problem with calculating distance between observations, which is the variable model. Each observation contains 5 coordinate points, {fxi,fyi}, i=1 to 5; The dataset as following, is there any macro or for loop can do the calculation? Thanks!

 

obsmodelfx1fy1fx2fy2fx3fy3fx4fy4fx5fy5
1a12916.512922.2512959.7512990.75129112.25
2b156.937518156.937526.5156.937569156.9375109.5156.9375137
3c140261403114076.5140116140142.75
4d109.718824109.718829.5109.718853.75109.718887109.7188100.25
5e145.57145.511.5145.546.5145.590145.5110.5
6f94.281251594.2812519.594.2812547.594.2812591.2594.28125111.25
7g15028.51503415083150126150172
8a1479.51471714762147102.5147141
9b91.437512.7591.437517.7591.437548.591.437582.7591.437589.25
10c106.34382.25106.34384.75106.343834.5106.343857.5106.343878
11d130.93759.75130.937514.75130.937551.5130.937579.5130.937599.5
12e143.562521.5143.562526.5143.562574143.5625119.5143.5625142
11 REPLIES 11
Reeza
Super User

What would your distance calculation look like? 

Please include sample output. 

ffguo
Calcite | Level 5

Hi,

 

The function used to caluculate euclidean distance should be dis(distance between any two oberservation)= sum(squr((fyi1-fyi2)**2+(fxi1-fxi2)**2)),{fxi1,fyi1} from the first observation, the other from the second observation, i=1 to 5. Thanks

Ksharp
Super User

if it was Eduical distance, 

Data step also can do that.

 

data have;
infile cards truncover expandtabs;
input obs	model $	fx1	fy1	fx2	fy2	fx3	fy3	fx4	fy4	fx5	fy5;
cards;
1   a   129 16.5    129 22.25   129 59.75   129 90.75 129   112.25
2   b   156.9375    18  156.9375    26.5    156.9375    69  156.9375    109.5   156.9375    137
3   c   140 26  140 31  140 76.5    140 116 140 142.75
4   d   109.7188    24  109.7188    29.5    109.7188    53.75   109.7188    87  109.7188    100.25
5   e   145.5   7   145.5   11.5    145.5   46.5    145.5   90  145.5   110.5
6   f   94.28125    15  94.28125    19.5    94.28125    47.5    94.28125    91.25   94.28125    111.25
7   g   150 28.5    150 34  150 83  150 126 150 172
8   a   147 9.5 147 17  147 62  147 102.5   147 141
9   b   91.4375 12.75   91.4375 17.75   91.4375 48.5    91.4375 82.75   91.4375 89.25
10  c   106.3438    2.25    106.3438    4.75    106.3438    34.5    106.3438    57.5    106.3438    78
11  d   130.9375    9.75    130.9375    14.75   130.9375    51.5    130.9375    79.5    130.9375    99.5
12  e   143.5625    21.5    143.5625    26.5    143.5625    74  143.5625    119.5   143.5625    142
;
run;
proc iml;
var_x='fx1':'fx5';
var_y='fy1':'fy5';
use have;
read all var var_x into x;
read all var var_y into y;
close;

distance= sqrt((y-x)[,##]);

create distance var{distance};
append;
close;

data want;
 merge have distance;
run;
ffguo
Calcite | Level 5

Hi,

 

Thanks for help. This seems work but not get the right result, because the function used to caluculate euclidean distance is 

dis(distance between any two oberservation)= sum(squr((fyi1-fyi2)**2+(fxi1-fxi2)**2)),{fxi1,fyi1} from the first observation, the other from the second observation, i=1 to 5. So should the sum equation be modified? Thanks.

Reeza
Super User

Look at proc distance with method=Euclid. 

Ksharp
Super User
data have;
infile cards truncover expandtabs;
input obs	model $	fx1	fy1	fx2	fy2	fx3	fy3	fx4	fy4	fx5	fy5;
cards;
1   a   129 16.5    129 22.25   129 59.75   129 90.75 129   112.25
2   b   156.9375    18  156.9375    26.5    156.9375    69  156.9375    109.5   156.9375    137
3   c   140 26  140 31  140 76.5    140 116 140 142.75
4   d   109.7188    24  109.7188    29.5    109.7188    53.75   109.7188    87  109.7188    100.25
5   e   145.5   7   145.5   11.5    145.5   46.5    145.5   90  145.5   110.5
6   f   94.28125    15  94.28125    19.5    94.28125    47.5    94.28125    91.25   94.28125    111.25
7   g   150 28.5    150 34  150 83  150 126 150 172
8   a   147 9.5 147 17  147 62  147 102.5   147 141
9   b   91.4375 12.75   91.4375 17.75   91.4375 48.5    91.4375 82.75   91.4375 89.25
10  c   106.3438    2.25    106.3438    4.75    106.3438    34.5    106.3438    57.5    106.3438    78
11  d   130.9375    9.75    130.9375    14.75   130.9375    51.5    130.9375    79.5    130.9375    99.5
12  e   143.5625    21.5    143.5625    26.5    143.5625    74  143.5625    119.5   143.5625    142
;
run;
data want1;
 set have;
 array x{*} fx:;
 array y{*} fy:;
 sum=0;
 do i=1 to dim(x);
  sum+(y{i}-x{i})**2;
 end;
  distance=sqrt(sum);
 drop sum i;
run;
Reeza
Super User

@Ksharp I think the OP wants distance between different rows. 

ffguo
Calcite | Level 5

Correctly. But each row composed of 5 {fxi,fyi} points. If I used Proc distance, I won't get correct result. For proc distance with Euclid, the calculate function will be dist=sqrt[(fx11-fx12)**2+(fy11-fy12)**2+ (fx21-fx22)**2+ (fy21-fy22)**2+...+ (fx51-fx52)**2+ (fy51-fy52)**2]. However, what I want to get is: dist=sqrt[(fx11-fx12)**2+(fy11-fy12)**2]+ sqrt[(fx21-fx22)**2+(fy21-fy22)**2]+...+sqrt[(fx51-fx52)**2+ (fy51-fy52)**2].

Reeza
Super User

Are you sure? Post the code you've used please.

ffguo
Calcite | Level 5
proc distance data=use method= euclid out=want;
var interval(fx1 fy1 fx2 fy2 fx3 fy3 fx4 fy4 fx5 fy5);
id model;
run;
the distance between fist obs and second obs is 79.1, however, the correct value should be 156.6.
Reeza
Super User

It might be worth switching it to a long format then and grouping to be able to use the formula.

 

Otherwise a SQL cross join as long as you're willing to type  out the formula.

 

proc sql;

select a.*, b.fx1 as fx12, b.fy1 as fy12 ...,

sqrt[(fx11-fx12)**2+(fy11-fy12)**2]+ sqrt[(fx21-fx22)**2+(fy21-fy22)**2]+...+sqrt[(fx51-fx52)**2+ (fy51-fy52)**2] as dist

from have as a

cross join have as b;

quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1505 views
  • 0 likes
  • 3 in conversation