Kernels in Support Vector Machines

SlutskyFan · Posted 11-10-2010 02:23 PM

This is more theoretical than SAS related:

I continuously see kernel functions in SVMs as k(xi,xj), or k(x,y) or
k(x,z)

Are kernel functions in SVMs functions (dot products) of the independent
variables only (the x's), or functions of both the independent and
dependent variables (x, y)?

This also confuses my understanding of the relationship between kernel
functions and kernel matrices.

Any help?

I currently have SAS EM 6.2. As far as I know SVMs aren't possible with this version. Any news on their implementation in future releases?

-thanks

WayneThompson · Posted 11-10-2010 05:31 PM

An experimental linear and nonlinear kernel SVM node is planned for EM 7.1 (SAS 9.3) mid next year. Thanks for your use of the software. There are quite a few other classification and prediction tools you will want to try. Some users have reported good Gradient Boosting results. Will keep you and the forum up to date.

SlutskyFan · Posted 11-17-2010 11:14 AM

Thanks so much. Looking forward to it. I've actually got my hands full dealing with all that is available in EM anyway- but I like to stay ahead of the curve.

DavidR_Hardoon · Posted 11-15-2010 10:53 PM

Hi Slutsky,

To answer your query: the various notations you come across in the kernel literature depend on the user group. From a notation perspective, x, y (or z) refer to vectors, whereas xi and xj refer to elements i and j within vector x.

A kernel matrix is usually denoted with a capital K, whereas k is the kernel function (dot product). Hence, whereas K is a matrix, k(x,y) is only a single scalar entry in matrix K.

To answer your second question regarding dependence and independence: the assumption is that the data are iid (independent and identically distributed).

Perhaps this pseudocode will help in understanding the notation:
X - matrix of size n x m (n samples and m features)
K - matrix of size n x n

for i=1 to n
  for j=1 to n
    K[i,j] = X[i,:]*X[j,:] % a linear dot product of rows i and j
  endfor
endfor
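
For readers who want to run the loop above, here is a minimal Python/NumPy sketch of the same idea. The function name linear_kernel_matrix and the toy data are assumptions made for illustration; they are not from the thread.

```python
import numpy as np

def linear_kernel_matrix(X):
    """Return the n x n matrix K with K[i, j] = dot(X[i, :], X[j, :])."""
    n = X.shape[0]
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # k(x_i, x_j): a scalar computed from two rows (samples) of X
            K[i, j] = np.dot(X[i, :], X[j, :])
    return K

# Made-up data: n = 3 samples, m = 2 features
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
K = linear_kernel_matrix(X)
```

The double loop gives the same matrix as X @ X.T; it is written out only to mirror the pseudocode. Each entry K[i, j] is one evaluation of the kernel function on a pair of samples.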

A good book is 'An Introduction to Support Vector Machines' by Cristianini and Shawe-Taylor.

SlutskyFan · Posted 11-17-2010 11:07 AM

Thanks, this really helps. I think I'm getting a better picture. You said:

"where as K is a matrix k(x,y) will be only an entry scalar in matrix K."

I think I understand, but based on what you said is my following interpretation correct?

#1 k(x,y) is a kernel function (which produces a scalar that becomes an entry in matrix K)

If that is true, then does k(x,y) produce a dot product only between xi and xj (elements of the matrix X) or are they dot products also between x and y?

I'm thinking the entries in the kernel matrix are only dot products of xi and xj, given your pseudocode, and y is just a 'label'.

But I've also seen 'kernel functions' depicted in 2 different ways:

Gaussian kernel: k(x,y) = exp(-||x-y||^2 / sigma^2)

Gaussian kernel: k(xi,xj) = exp(-||xi-xj||^2 / sigma^2)

So I'm still confused by the notation: what are the 'inputs' to the kernel function? Are they only elements of some matrix X, or can they also include elements of Y?

Thanks.

VictorZurkowski · Posted 02-16-2011 03:34 PM

SlutskyFan:
The Gaussian kernel that you mentioned fits in the scheme set by David thus:

First, rewrite David's pseudo-code as follows:
X[1,.],..., X[n,.] - n elements of a (Hilbert) space with inner product <,>
K - matrix of size nxn

for i=1 to n
  for j=1 to n
    K[i,j] = <X[i,.], X[j,.]> % the inner product <,> in place of the linear dot product
  endfor
endfor
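
As a rough Python sketch of this generalized loop, the kernel can be passed in as a function of two sample vectors. The names gaussian_kernel and kernel_matrix, the toy data, and the labels array are assumptions for illustration. The Gaussian form used is the one quoted in the question (sigma^2 in the denominator); the 2*sigma^2 form only rescales sigma.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """k(x, y) = exp(-||x - y||^2 / sigma^2) for two feature vectors."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.exp(-np.dot(diff, diff) / sigma**2)

def kernel_matrix(X, k):
    """Fill K[i, j] = k(X[i], X[j]) for every pair of rows of X."""
    n = X.shape[0]
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = k(X[i], X[j])
    return K

# Made-up data: 3 samples with 2 features each
X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 2.0]])
labels = np.array([1, -1, 1])  # class labels: never passed to the kernel
K = kernel_matrix(X, gaussian_kernel)
```

Both arguments of k are feature vectors, so k(x,y) and k(xi,xj) are the same function written with different argument names. The labels are not inputs to the kernel at all.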

To make this work for a sample of size n of m features: c_1, c_2, ..., c_n, apply the pseudo code to the result of transforming the feature vectors according to a function f, i.e. apply the pseudo-code to X[1,] = f(c_1) , X[2,] = f(c_2), ... , X[n,]=f(c_n).
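
The Gaussian feature map is infinite dimensional, so it cannot be written out as a finite array. To illustrate the "transform, then apply the linear pseudo-code" step, here is a sketch with a simpler, finite feature map swapped in: f(c) = all pairwise products of the entries of c, which corresponds to the kernel k(x,y) = (x . y)^2. This map and the toy data are assumptions chosen for illustration, not the map used in this post.

```python
import numpy as np

def f(c):
    """Hypothetical finite feature map: all products c[a] * c[b], flattened."""
    c = np.asarray(c, dtype=float)
    return np.outer(c, c).ravel()

# Made-up data: n = 3 samples c_1, c_2, c_3 with m = 2 features
C = np.array([[1.0, 2.0],
              [3.0, -1.0],
              [0.5, 0.5]])

X = np.array([f(c) for c in C])  # X[i, :] = f(c_i)
K = X @ X.T                      # the linear pseudo-code applied to X

# The same matrix computed directly from the original features
K_direct = (C @ C.T) ** 2
```

K and K_direct agree, which is the point of a kernel function: it returns the inner product of the transformed vectors without ever building f(c) explicitly.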

Here is the function: let c be an m-dimensional vector. To c we will assign an element of an infinite-dimensional space, a space of functions defined on m-dimensional vectors. f(c) is a function of another variable h defined as:
f(c)(h) = exp( - (||c - h||^2)/(2*sigma^2) )

Here is the definition of the inner product (all technicalities aside):
if A and B are (sufficiently nice) functions of h in R^m:
<A,B> = integral over R^m of A(h)B(h) dh

Now, it is a long exercise to verify that:
= exp( - (||x - c|...
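
A numerical check of this construction in one dimension (m = 1) can be sketched as follows. The grid, the value of sigma, and the test points are arbitrary assumptions. The closed form used for comparison comes from completing the square in the integrand; it is not taken from the truncated formula above.

```python
import numpy as np

sigma = 0.7                              # arbitrary width for the check
h = np.linspace(-20.0, 20.0, 200001)     # grid standing in for R^1
dh = h[1] - h[0]

def f(c):
    """f(c)(h) = exp(-(c - h)^2 / (2 sigma^2)), sampled on the grid h."""
    return np.exp(-(c - h) ** 2 / (2 * sigma**2))

def inner(A, B):
    """Riemann-sum approximation of the integral of A(h) B(h) dh."""
    return np.sum(A * B) * dh

x, c = 0.3, 1.5
numeric = inner(f(x), f(c))

# Completing the square: sqrt(pi sigma^2) * exp(-(x - c)^2 / (4 sigma^2))
closed_form = np.sqrt(np.pi * sigma**2) * np.exp(-(x - c) ** 2 / (4 * sigma**2))
```

The two numbers agree to many digits. Up to a constant factor and a rescaling of sigma, the inner product of f(x) and f(c) is a Gaussian function of x - c, which is how the Gaussian kernel fits the dot-product scheme.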
