AlexeyS
Pyrite | Level 9

Hello, please help me. I want to build kernel k-means.

I have only the basic SAS tools.

I have the following data (example):

d_temp1   d_temp2
 0.1       1
-0.1       2
 1        -1
 2.1       4

First question: how can I turn my two data columns into a 4*2 matrix?
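As a language-neutral illustration (Python, not SAS), the target layout is just the two columns paired up row by row. This is a minimal sketch; the name `X` is chosen here for illustration and does not come from the thread.

```python
# The two example columns from the post above.
d_temp1 = [0.1, -0.1, 1.0, 2.1]
d_temp2 = [1.0, 2.0, -1.0, 4.0]

# Pair the columns element by element: one inner list per observation.
# X is a hypothetical name for the resulting 4*2 matrix.
X = [list(row) for row in zip(d_temp1, d_temp2)]

n_rows, n_cols = len(X), len(X[0])
print(n_rows, n_cols)
```

Each inner list is one observation, so `X[i]` plays the role of "row i" of the matrix.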

Second question: I built a kernel matrix with a Gaussian function, so in this example I get a 4*4 matrix. In SAS, though, it is stored as a single row of variables, d1-d16.
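To make the relationship between the 4*4 kernel matrix and the flat d1-d16 layout concrete, here is a small Python sketch (not SAS). It assumes the usual Gaussian form exp(-||a-b||^2 / (2*sigma^2)) with sigma = 1, which is an assumption since the post does not give the bandwidth, and it assumes the 16 values are stored row by row.

```python
import math

# Example observations from the post (4 rows, 2 columns).
X = [[0.1, 1.0], [-0.1, 2.0], [1.0, -1.0], [2.1, 4.0]]
SIGMA = 1.0  # assumed bandwidth; not stated in the thread


def gaussian_kernel(a, b, sigma=SIGMA):
    """Gaussian (RBF) kernel between two points."""
    sq_dist = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


n = len(X)
# Full n*n kernel matrix.
K = [[gaussian_kernel(X[i], X[j]) for j in range(n)] for i in range(n)]

# The same values laid out as one flat row (d1..d16), row by row.
flat = [K[i][j] for i in range(n) for j in range(n)]


def element(flat, n, i, j):
    """Return entry (i, j) of the matrix, with 1-based i and j."""
    return flat[(i - 1) * n + (j - 1)]
```

The point of `element` is that a single row of 16 values still behaves like a matrix once the position is computed as (i-1)*n + j.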

Now I have to randomly pick two observations (two rows) as the initial clusters, but I can't: although I defined it as a matrix, it is not really a matrix, so I cannot select two rows.
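Picking two random rows does not strictly need a real matrix type: with a flat row-major layout, row i is just a contiguous slice. A hedged Python sketch of the idea follows (not SAS). The values in `flat` are placeholders used only to check the indexing, not real kernel values, and the helper name `row` is made up for this example.

```python
import random


def row(flat, n, i):
    """Return row i (1-based) of an n*n matrix stored as a flat list."""
    start = (i - 1) * n
    return flat[start:start + n]


n = 4
# Placeholder values standing in for d1..d16 (NOT real kernel values).
flat = [float(v) for v in range(1, 17)]

rng = random.Random(0)                   # fixed seed so the run is repeatable
seeds = rng.sample(range(1, n + 1), 2)   # two distinct row numbers in 1..4
seed_rows = [row(flat, n, i) for i in seeds]
```

`random.sample` draws without replacement, so the two starting observations are always different.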

What should I do? How can I make this work?

Thank you.

M_Maldonado
Barite | Level 11

Hi Alexey,

You need a data mining license to use PROC HPCLUS. According to this link (Cluster Analysis), if you have SAS/STAT you have access to PROC CLUSTER and PROC FASTCLUS.

If you want to handle data sets as matrices, you should use SAS Interactive Matrix Language (SAS/IML Software). But I think you can skip this step if you use a proc that does clustering.

I hope this helps!

-Miguel

AlexeyS
Pyrite | Level 9

Hello Miguel, thank you for your answer.

I am developing kernel k-means myself, and unfortunately I don't have SAS/IML; otherwise I wouldn't have a problem.

I tried to do it with arrays and ran into trouble: I simply can't do it.

I see two problems:

1. I have two column vectors and want to define them as a matrix (d dimensions), but I couldn't do it.

For example:

      d1    d2
x1    1     2
x2    0.1   3
x3   -1     1.1

I want to define this as a 3*2 matrix with an array.

2. Assume I did that and created the kernel matrix. In the first step I need to choose two random observations for clustering, but because it is a matrix in name only, I can't pick a random row: in SAS I have only one row.
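For reference, the two steps described here (build the kernel matrix, then start from two chosen observations) can be sketched end to end in Python. This is an illustration under assumptions, not a SAS solution and not necessarily the variant being developed in this thread: it uses a Gaussian kernel with sigma = 1 (assumed), and the standard feature-space distance K_ii - (2/|C|) * sum_j K_ij + (1/|C|^2) * sum_jl K_jl to a cluster C. The function names are made up for this example.

```python
import math


def gaussian_kernel_matrix(X, sigma=1.0):
    """n*n Gaussian kernel matrix; sigma=1.0 is an assumed bandwidth."""
    def k(a, b):
        sq = sum((p - q) ** 2 for p, q in zip(a, b))
        return math.exp(-sq / (2.0 * sigma ** 2))
    return [[k(a, b) for b in X] for a in X]


def kernel_kmeans(K, seeds, max_iter=100):
    """Cluster using only the kernel matrix K; seeds are 0-based row indices."""
    n, k = len(K), len(seeds)
    # Initial labels: nearest seed, measured in feature space.
    labels = [
        min(range(k),
            key=lambda c: K[i][i] - 2 * K[i][seeds[c]] + K[seeds[c]][seeds[c]])
        for i in range(n)
    ]
    for _ in range(max_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        new_labels = []
        for i in range(n):
            best, best_d = labels[i], float("inf")
            for c, members in enumerate(clusters):
                if not members:
                    continue  # skip empty clusters
                m = len(members)
                cross = sum(K[i][j] for j in members) / m
                within = sum(K[j][l] for j in members for l in members) / (m * m)
                d = K[i][i] - 2 * cross + within
                if d < best_d:
                    best, best_d = c, d
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels


# The 3*2 example above; x1 and x3 are used as the two starting observations.
X = [[1.0, 2.0], [0.1, 3.0], [-1.0, 1.1]]
K = gaussian_kernel_matrix(X)
labels = kernel_kmeans(K, seeds=[0, 2])
print(labels)
```

Note that the loop never touches `X` after the kernel matrix is built; everything is a sum over rows and columns of `K`, which is why row access to the kernel matrix is the only real requirement.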

What do you think about the problem? In my opinion we have to buy IML; I can't solve matrix problems with the basic tools.

M_Maldonado
Barite | Level 11

Alexey,

I could not work around this without IML. If you want to code matrices and linear algebra, you really need IML.

One of the advantages of the data-step code is that it processes data really fast by processing one row of your data set at a time. Unfortunately this does not help you do what you are trying to do with arrays.

I hope you get to try one of the cluster algorithms in SAS/STAT or code your own in IML.

AlexeyS
Pyrite | Level 9

Thank you.
