Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

Kernels in Support Vector Machines

Contributor
Posts: 46

Kernels in Support Vector Machines

This is more theoretical than SAS related:

I keep seeing kernel functions in SVMs written as k(xi,xj), k(x,y), or
k(x,z).

Are kernel functions in SVMs functions (dot products) of the independent
variables only (the x's), or functions of both independent and dependent
variables (x,y)?

This also confuses my understanding of the relationship between kernel
functions and kernel matrices.

Any help?

I currently have SAS EM 6.2. As far as I know SVMs aren't possible with this version. Any news on their implementation in future releases?

-thanks
SAS Employee
Posts: 31

Re: Kernels in Support Vector Machines

An experimental linear and nonlinear kernel SVM node is planned for EM 7.1 (SAS 9.3), mid next year. Thanks for your use of the software. There are quite a few other classification and prediction tools you will want to try; some users have reported good Gradient Boosting results. We will keep you and the forum up to date.
Contributor
Posts: 46

Re: Kernels in Support Vector Machines

Thanks so much. Looking forward to it. I've actually got my hands full dealing with all that is available in EM anyway, but I like to stay ahead of the curve.
SAS Employee
Posts: 1

Re: Kernels in Support Vector Machines

Hi Slutsky,

To answer your query: the notation you come across in the kernel literature varies with the user community. From a notation perspective, x, y (or z) refer to vectors, whereas xi and xj refer to the i-th and j-th sample vectors, i.e. rows i and j of the data matrix X in the pseudocode below.

A kernel matrix is usually denoted by a capital K, whereas k is the kernel function (a dot product). Hence, whereas K is a matrix, k(x,y) will be only a scalar entry in matrix K.

To answer your second question regarding dependence and independence: the assumption is that the data are i.i.d. (independent and identically distributed).

Perhaps this pseudocode will help clarify the notation:
X - matrix of size nxm (n samples and m features)
K - matrix of size nxn

for i=1 to n
  for j=1 to n
    K[i,j] = X[i,:]*X[j,:]' % a linear dot product of rows i and j
  endfor
endfor
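
The pseudocode above can be sketched as runnable Python. This is only an illustration, assuming NumPy is available; the function name `linear_kernel_matrix` and the toy data are hypothetical and not part of the original post.

```python
import numpy as np

def linear_kernel_matrix(X):
    """Return the n x n matrix K with K[i, j] = dot(X[i], X[j])."""
    n = X.shape[0]
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # Each entry is the scalar k(x_i, x_j) for sample rows i and j.
            K[i, j] = np.dot(X[i, :], X[j, :])
    return K

# Hypothetical toy data: n = 3 samples, m = 2 features.
X = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, -1.0]])
K = linear_kernel_matrix(X)
```

For the linear kernel the double loop gives the same result as the single matrix product `X @ X.T`; the loop form is kept here only to mirror the pseudocode.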

A good book is 'An Introduction to Support Vector Machines' by Cristianini and Shawe-Taylor.
Contributor
Posts: 46

Re: Kernels in Support Vector Machines

Thanks, this really helps. I think I'm getting a better picture. You said:

"whereas K is a matrix, k(x,y) will be only a scalar entry in matrix K."

I think I understand, but based on what you said is my following interpretation correct?

#1 k(x,y) is a kernel function (which produces a scalar that becomes an entry in matrix K)

If that is true, then does k(x,y) produce a dot product only between xi and xj (elements of the matrix X) or are they dot products also between x and y?

I'm thinking the entries in the kernel matrix are only dot products of xi and xj, given your pseudocode, and y is just a 'label'.

But I've also seen 'kernel functions' depicted in 2 different ways:

Gaussian kernel: k(x,y) = exp(-||x-y||^2 / sigma^2)

Gaussian kernel: k(xi,xj) = exp(-||xi-xj||^2 / sigma^2)

So I'm still confused by the notation for the 'inputs' to the kernel function: are they only elements of some matrix X, or can they also contain elements of Y?
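
One way to look at the two notations is that they describe the same function: both arguments of a kernel are input (feature) vectors, and class labels never enter it. The sketch below is a minimal illustration, assuming NumPy and using the sigma^2 form written above (other sources use 2*sigma^2 in the denominator). The names `gaussian_kernel`, `gaussian_kernel_matrix`, and `labels`, and the toy data, are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """k(x, y) = exp(-||x - y||^2 / sigma^2); x and y are both feature vectors."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / sigma**2))

def gaussian_kernel_matrix(X, sigma=1.0):
    """K[i, j] = k(x_i, x_j), where x_i and x_j are rows i and j of X."""
    n = X.shape[0]
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = gaussian_kernel(X[i], X[j], sigma)
    return K

# Hypothetical toy data: 3 samples with 2 features each.
X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 2.0]])
labels = np.array([1, -1, 1])  # class labels; deliberately not used by the kernel
K = gaussian_kernel_matrix(X, sigma=1.0)
```

Here `gaussian_kernel(x, y)` corresponds to k(x,y) for any two vectors, while `K[i, j]` is the same function evaluated on rows i and j of X, which is what k(xi,xj) usually means.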

Thanks.
N/A
Posts: 1

Re: Kernels in Support Vector Machines

SlutskyFan:
The Gaussian kernel that you mentioned fits in the scheme set by David thus:

First, rewrite David's pseudo-code as follows:
X[1,.],..., X[n,.] - n elements of a (Hilbert) space with inner product <,>
K - matrix of size nxn

for i=1 to n
  for j=1 to n
    K[i,j] = <X[i,.], X[j,.]> % inner product <,> of elements i and j
  endfor
endfor


To make this work for a sample of n feature vectors c_1, c_2, ..., c_n, each with m features, transform the feature vectors with a function f and apply the pseudo-code to the results, i.e. to X[1,.] = f(c_1), X[2,.] = f(c_2), ..., X[n,.] = f(c_n).

Here is the function: let c be an m-dimensional vector. To c we assign an element of an infinite-dimensional space, namely a space of functions defined on m-dimensional vectors. f(c) is a function of another variable h, defined as:
f(c)(h) = exp( - (||c - h||^2)/(2*sigma^2) )

Here is the definition of the inner product (all technicalities aside):
if A and B are (sufficiently nice) functions of h in R^m:
<A, B> = integral over R^m of A(h)B(h) dh

Now, it is a long exercise to verify that:
<f(x), f(c)> = exp( - (||x - c|...
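
The post is cut off here, so its exact constants are not shown. As an independent check under the definitions of f and of the inner product given above, a direct calculation for m = 1 gives sqrt(pi) * sigma * exp(-(x - c)^2 / (4*sigma^2)), i.e. a Gaussian kernel of x and c up to a constant factor. The sketch below compares that closed form with a numerical integral; it assumes NumPy, and all function names and parameter values are hypothetical.

```python
import numpy as np

def feature_function(c, h, sigma):
    """f(c)(h) = exp(-(c - h)^2 / (2 sigma^2)) in one dimension (m = 1)."""
    return np.exp(-(c - h) ** 2 / (2.0 * sigma**2))

def inner_product(x, c, sigma, half_width=12.0, num_points=200001):
    """Approximate the integral of f(x)(h) * f(c)(h) dh with a Riemann sum."""
    mid = 0.5 * (x + c)  # the integrand is centred at the midpoint of x and c
    h = np.linspace(mid - half_width * sigma, mid + half_width * sigma, num_points)
    dh = h[1] - h[0]
    values = feature_function(x, h, sigma) * feature_function(c, h, sigma)
    return float(np.sum(values) * dh)

def closed_form(x, c, sigma):
    """Direct calculation for m = 1: sqrt(pi) * sigma * exp(-(x - c)^2 / (4 sigma^2))."""
    return float(np.sqrt(np.pi) * sigma * np.exp(-(x - c) ** 2 / (4.0 * sigma**2)))

numeric = inner_product(0.3, 1.7, sigma=0.8)
exact = closed_form(0.3, 1.7, sigma=0.8)
```

The two values should agree to many decimal places, which illustrates that the inner product of the transformed vectors depends on x and c only through ||x - c||.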




