umeshgiri48
Obsidian | Level 7

Hi,

 

I would like to implement a nearest-neighbors algorithm. I have more than 300,000 customers, sorted in ascending order by the variable Latitude. For each customer I need the nearest 100 customers: the 50 rows above and the 50 rows below. Near the edges the window has to shift. For example, the customer in row 1 has no customers above, so the closest 100 are the 100 rows below (rows 2 to 101). The customer in row 51 gets rows 1 to 50 from above and rows 52 to 101 from below. This should be repeated for all 300,000 customers, and the results appended into one new data set of 300,000 × 100 rows.

 

I am importing the file, sorting the data by Latitude, and then assigning a row number. After that I am stuck.

 

Kind regards

 

proc sort data=sample;
  by Latitude;
run;

data sample;
  set sample;
  row_number = _n_;
run;
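
The positional window described above (50 rows before, 50 rows after, shifted at the edges) could be sketched with a DATA step that reads the sorted table sequentially and fetches each neighbor by row number with the POINT= option. This is only a sketch, not a method proposed anywhere in this thread. The data set names toy and neighbors, the variable CustomerID, and all nbr_ and win_ variables are assumptions made up for the illustration.

```sas
/* Toy input: 1,000 hypothetical customers with random latitudes.   */
/* CustomerID is an assumed ID variable, not named in the question. */
data toy;
  call streaminit(1);
  do CustomerID = 1 to 1000;
    Latitude = -90 + 180 * rand("Uniform");
    output;
  end;
run;

proc sort data=toy;
  by Latitude;
run;

data neighbors(keep=CustomerID Latitude row_number nbr_row nbr_id nbr_latitude);
  set toy nobs=n;                 /* sequential pass: one iteration per customer */
  row_number = _n_;

  /* Centre a 101-row window on the current row, then clamp it to the table */
  win_start = max(1, min(row_number - 50, n - 100));
  win_end   = min(n, win_start + 100);

  do p = win_start to win_end;
    if p ne row_number then do;   /* skip the customer itself */
      /* Direct access by row number; renamed so the current row is not overwritten */
      set toy(rename=(CustomerID=nbr_id Latitude=nbr_latitude)) point=p;
      nbr_row = p;                /* POINT= variables are never written, so copy it */
      output;
    end;
  end;
run;
```

With the 1,000-row toy table, every customer gets exactly 100 neighbor rows, so neighbors has 100,000 rows. The same step applied to a 300,000-row table would produce 30 million rows, so it may be worth keeping only the ID variables.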

Accepted Solution
Rick_SAS
SAS Super FREQ

Use PROC MODECLUS. The NEIGHBOR option on the PROC MODECLUS statement produces a table that gives the observation number (or ID value) of nearest neighbors. For example, the following statements produce the observation numbers for the nearest neighbors:

 

/* Use K=p option to find nearest p-1 neighbors */
proc modeclus data=Sample method=1 k=101 Neighbor; /* nearest 100 nbrs */
  var x y z w; /* placeholder names: list your own variables, e.g. Latitude */
run;

 

I suggest you start with a smaller problem, such as the nearest 2 neighbors, before attempting the large problem. The MODECLUS doc has an example for nearest neighbors.
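
A smaller problem of that kind might look like the sketch below: six made-up customers and K=3, which, following the K=p comment above, should list the 2 nearest neighbors of each observation. The data set small, the variable CustomerID, and the latitude values are all invented for illustration. Note that this returns the neighbors closest in distance, which is not necessarily the same set as a fixed number of rows above and below in sorted order.

```sas
/* Toy data: six hypothetical customers */
data small;
  input CustomerID Latitude;
  datalines;
1 10.0
2 10.4
3 11.5
4 11.6
5 13.0
6 20.0
;

/* K=3 is assumed to count the observation itself, leaving 2 neighbors per row */
proc modeclus data=small method=1 k=3 neighbor;
  var Latitude;
  id CustomerID;   /* report neighbors by CustomerID instead of observation number */
run;
```

Once the neighbor table looks right on the toy data, the same call with data=Sample and k=101 is the full-size version.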

