06-15-2015 02:09 AM

Hello, please help me. I want to build kernel k-means.

I have only the basic SAS tools.

I have the following data (example):

d_temp1 d_temp2

0.1 1

-0.1 2

1 -1

2.1 4

First question: how can I turn my column data into a 4*2 matrix?

Second question: I created a kernel matrix from the Gaussian function, so in this example I got a 4*4 matrix. In SAS it ends up as a single row of data, d1-d16.
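To make the layout concrete, here is a minimal sketch of the computation described above: the two columns become a 4*2 matrix, the Gaussian kernel gives a 4*4 matrix, and that matrix is flattened into one row of 16 values. It is written in Python only to illustrate the index arithmetic, which does not depend on the language. The bandwidth `SIGMA`, the helper names, and the row-by-row ordering of d1-d16 are assumptions; the post does not state them.

```python
import math

# The two columns from the example data (d_temp1, d_temp2).
d_temp1 = [0.1, -0.1, 1.0, 2.1]
d_temp2 = [1.0, 2.0, -1.0, 4.0]

# Pair the columns into an n x 2 matrix: one row per observation.
X = [list(row) for row in zip(d_temp1, d_temp2)]
n = len(X)

SIGMA = 1.0  # assumed bandwidth; not given in the post


def gaussian_kernel(a, b, sigma=SIGMA):
    """Gaussian (RBF) kernel exp(-||a - b||^2 / (2 * sigma^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Full n x n kernel matrix.
K = [[gaussian_kernel(X[i], X[j]) for j in range(n)] for i in range(n)]

# Flatten row by row, mimicking one observation with variables d1..d16
# (row-major order is an assumption about how d1..d16 were filled).
flat = [K[i][j] for i in range(n) for j in range(n)]


def flat_index(i, j, n):
    """1-based position in d1..d(n*n) of matrix entry (i, j), both 1-based."""
    return (i - 1) * n + j


def kernel_row(flat, i, n):
    """Recover row i (1-based) of the kernel matrix from the flat list."""
    start = flat_index(i, 1, n) - 1
    return flat[start:start + n]
```

With this mapping, row i of the kernel matrix is simply the block d((i-1)*4+1) through d(i*4), so a "row" can still be addressed even though the values are stored in a single observation.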

Now I have to randomly pick two observations (two rows) as the initial clusters, but I can't do it: although I defined a matrix, it is not really a matrix, so I cannot choose two rows.
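A rough sketch of how two random "rows" could be picked from a flattened 4*4 matrix: draw two distinct observation numbers, then slice out the matching blocks of four values. The values in `flat` are placeholders standing in for d1-d16, and the seed, the helper name `kernel_row`, and the row-by-row ordering are assumptions made for illustration.

```python
import random

n = 4  # number of observations, so the kernel matrix is n x n

# Placeholder values standing in for d1..d16 (not real kernel values).
flat = [float(k) for k in range(1, n * n + 1)]

# Fixed, arbitrary seed so the draw is reproducible.
rng = random.Random(2015)

# Two distinct 1-based observation numbers to use as initial clusters.
seeds = rng.sample(range(1, n + 1), 2)


def kernel_row(flat, i, n):
    """Row i (1-based) of an n x n matrix stored row by row in a flat list."""
    start = (i - 1) * n
    return flat[start:start + n]


# The kernel rows belonging to the two chosen observations.
seed_rows = {i: kernel_row(flat, i, n) for i in seeds}
```

The random choice is made over observation numbers 1 to 4, not over physical rows of the data set, so it still works when all 16 values sit in one observation.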

What should I do? How can I make this work?

thank you

06-15-2015 09:10 AM

Hi Alexey,

You need a data mining license to use PROC HPCLUS. According to this link (Cluster Analysis), if you have SAS/STAT you have access to PROC CLUSTER and PROC FASTCLUS.

If you want to handle data sets as matrices, you should use SAS Interactive Matrix Language (SAS/IML Software). But I think you can skip this step if you use a proc that does clustering.

I hope this helps!

-Miguel

06-15-2015 02:19 PM

Hello Miguel, thank you for your answer.

I am developing kernel k-means myself, and unfortunately I don't have SAS/IML; otherwise I wouldn't have a problem.

I tried to do it with arrays and ran into a problem: I simply can't do it.

I see two problems:

1. I have two column vectors and I want to define them as a matrix, and I couldn't do it (d dimensions).

For example:

        d1      d2
x1       1       2
x2     0.1       3
x3      -1     1.1

I want to define this as a 3*2 matrix with an array.

2. Assume I did that and created the kernel matrix. In the first step I need to choose two random observations for clustering, and because it is only a pseudo-matrix, I can't randomly choose a row: in SAS I have only one row.
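For reference, a minimal sketch of what that first step could look like once two observations have been chosen. When a cluster is represented by a single observation c, the squared feature-space distance from observation i to it can be read directly off the kernel matrix as K[i][i] - 2*K[i][c] + K[c][c]. The bandwidth `SIGMA`, the helper names, and the particular seeds used here are hypothetical; the data are the x1-x3 example above.

```python
import math

# The 3 x 2 example matrix (rows x1, x2, x3; columns d1, d2).
X = [[1.0, 2.0], [0.1, 3.0], [-1.0, 1.1]]
n = len(X)

SIGMA = 1.0  # assumed bandwidth; not given in the post


def gaussian_kernel(a, b, sigma=SIGMA):
    """Gaussian (RBF) kernel exp(-||a - b||^2 / (2 * sigma^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


K = [[gaussian_kernel(X[i], X[j]) for j in range(n)] for i in range(n)]


def assign_to_seeds(K, seeds):
    """Assign each observation to the nearest seed in feature space.

    The squared distance from observation i to seed c uses only kernel
    entries: K[i][i] - 2 * K[i][c] + K[c][c].
    """
    labels = []
    for i in range(len(K)):
        dists = [K[i][i] - 2.0 * K[i][c] + K[c][c] for c in seeds]
        labels.append(dists.index(min(dists)))
    return labels


# Hypothetical seeds: observations x1 and x3 (0-based indices 0 and 2).
seeds = [0, 2]
labels = assign_to_seeds(K, seeds)
```

Only lookups into the kernel matrix are needed here, so the step does not require matrix algebra as such, just a way to address entry (i, c).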

What do you think about the problem? In my opinion we have to buy IML; I can't solve matrix problems with the basic tools.

06-16-2015 07:33 PM

Alexey,

I could not work around this without IML. If you want to code matrices and linear algebra, you really need IML.

One of the advantages of the data-step code is that it processes data really fast by processing one row of your data set at a time. Unfortunately this does not help you do what you are trying to do with arrays.

I hope you get to try one of the cluster algorithms in SAS/STAT or code your own in IML.

06-16-2015 11:24 PM

thank you