05-15-2016 12:25 AM

Hi all,

I am new to this forum.

I have the following problem in SAS EM: the neighbors output by the MBR node are in the wrong order. To illustrate the problem, I wrote a simple program that uses PROC PMBR to do the calculations.

```
/* Training data: target y, inputs x1 and x2, plus a row id */
data t1;
  input y x1 x2;
  id = _n_;
  cards;
1 12 14
1 11 10
0 3 4
0 5 2
;

/* Data to score */
data t2;
  input y x1 x2;
  cards;
1 10 12
1 12 12
0 2 1
;
run;

/* Build the DMDB catalog that PROC PMBR requires */
proc dmdb data=t1 dmdbcat=work.temp;
  var x1 x2;
  class y;
run;

/* k=1: score t2 and write the neighbor ids to t2_out */
proc pmbr data=t1 dmdbcat=work.temp k=1 method=scan outest=t1_out
          neighbors;
  target y;
  id id;
  score outfit=t2_fit data=t2 out=t2_out role=validation;
run;
proc print data=t2_out;
run;

/* k=2: same call, two neighbors */
proc pmbr data=t1 dmdbcat=work.temp k=2 method=scan outest=t1_out
          neighbors;
  target y;
  id id;
  score outfit=t2_fit data=t2 out=t2_out role=validation;
run;
proc print data=t2_out;
run;

/* k=3: same call, three neighbors */
proc pmbr data=t1 dmdbcat=work.temp k=3 method=scan outest=t1_out
          neighbors;
  target y;
  id id;
  score outfit=t2_fit data=t2 out=t2_out role=validation;
run;
proc print data=t2_out;
run;
```

As you can see from the output, the order of the neighbors is not correct: in the first output (k=1) _n1 is 2, but in the second output (k=2) _n1 is 1. How can I produce the values of _n: so that _n1 is the nearest neighbor, _n2 the second nearest, and so on?

Thanks.

05-15-2016 01:05 AM

I can't run those procs because I don't have EM, but couldn't you just sort the output dataset?

05-17-2016 05:55 PM

Thanks for your reply.

Unfortunately, it is not possible to sort them.

05-16-2016 01:01 PM

I don't believe there is any ordering implied by the columns _N1, _N2, ... They just show the K nearest neighbors, not necessarily ordered from nearest to farthest, since they all have equal weight when scoring.
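
If an ordered list is needed, one workaround is to compute it outside PROC PMBR with Base SAS. This is only a minimal sketch, assuming plain Euclidean distance on the raw x1 and x2. PROC PMBR may measure distance differently, so the two results need not agree exactly. The names `t2_keyed`, `score_id`, `nn_long`, `nn_ranked`, `nn_rank` and `nn_wide` are made up for illustration.

```sas
/* Give each row to be scored its own key (score_id is a hypothetical name). */
data t2_keyed;
  set t2;
  score_id = _n_;
run;

/* Pair every scored row with every training row and compute the
   Euclidean distance on the raw inputs (an assumption, see above). */
proc sql;
  create table nn_long as
  select s.score_id,
         t.id as train_id,
         sqrt((s.x1 - t.x1)**2 + (s.x2 - t.x2)**2) as dist
  from t2_keyed as s, t1 as t
  order by score_id, dist, train_id;   /* ties broken by training id */
quit;

/* Number the neighbors 1, 2, ... from nearest to farthest. */
data nn_ranked;
  set nn_long;
  by score_id;
  if first.score_id then nn_rank = 0;
  nn_rank + 1;
run;

/* One row per scored observation: nn1 = nearest id, nn2 = second nearest, ... */
proc transpose data=nn_ranked out=nn_wide(drop=_name_) prefix=nn;
  by score_id;
  id nn_rank;
  var train_id;
run;
```

For the first row of t2 (x1=10, x2=12), the distance to id 2 is sqrt(5) and the distance to id 1 is sqrt(8), so under this assumption id 2 is the nearest neighbor and id 1 the second nearest.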

05-17-2016 05:54 PM

Thanks for your reply.

I see.

However, I am wondering: is there any technical difficulty in having SAS save _N1, _N2, ... in nearest-to-farthest order, or any benefit to not preserving that order?

Ordered _N1, _N2, ... would have some benefits, e.g. custom distance-weighted NN and/or an easy way to find the optimal K.
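
To make the weighted-NN idea concrete, here is a rough sketch of a distance-weighted vote done by hand in Base SAS rather than with PROC PMBR. It assumes plain Euclidean distance on the raw x1 and x2 and an inverse-distance weight of 1 / (dist + 1e-6). Both are my own choices for illustration, as are the names `k`, `t2_keyed`, `score_id`, `pairs`, `weighted_vote` and `p_y1`.

```sas
%let k = 2;   /* hypothetical choice of K */

/* Key for the rows to be scored (t2 has no id of its own). */
data t2_keyed;
  set t2;
  score_id = _n_;
run;

/* All scored-row / training-row pairs, nearest first within each scored row. */
proc sql;
  create table pairs as
  select s.score_id,
         t.y as train_y,
         sqrt((s.x1 - t.x1)**2 + (s.x2 - t.x2)**2) as dist
  from t2_keyed as s, t1 as t
  order by score_id, dist;
quit;

/* Accumulate inverse-distance weights over the &k nearest neighbors. */
data weighted_vote;
  set pairs;
  by score_id;
  if first.score_id then do;
    nn_rank = 0;
    w_sum   = 0;
    w_y1    = 0;
  end;
  nn_rank + 1;
  if nn_rank <= &k then do;
    w = 1 / (dist + 1e-6);      /* small constant avoids division by zero */
    w_sum + w;
    w_y1  + w * (train_y = 1);  /* weight counted only when neighbor has y = 1 */
  end;
  if last.score_id then do;
    p_y1 = w_y1 / w_sum;        /* weighted share of neighbors with y = 1 */
    output;
  end;
  keep score_id p_y1;
run;
```

Rerunning this with different values of `&k` and comparing `p_y1` against the known y in t2 would be one simple way to compare choices of K, which is the kind of thing an ordered neighbor list makes easy.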