
Kernels in Support Vector Machines


11-10-2010 02:23 PM

This is more theoretical than SAS-related:

I continuously see kernel functions in SVMs written as k(xi, xj), k(x, y), or k(x, z).

Are kernel functions in SVMs functions (dot products) of the independent variables (the x's), or functions of both the independent and dependent variables (x, y)?

This also confuses my understanding of the relationship between kernel functions and kernel matrices.

Any help?

I currently have SAS EM 6.2. As far as I know, SVMs aren't available in this version. Any news on their implementation in future releases?

-thanks



Posted in reply to SlutskyFan

11-10-2010 05:31 PM

An experimental linear and nonlinear kernel SVM node is planned for EM 7.1 with SAS 9.3, due mid next year. Thanks for your use of the software. There are quite a few other classification and prediction tools you will want to try; some users have reported good results with Gradient Boosting. I will keep you and the forum up to date.


Posted in reply to WayneThompson

11-17-2010 11:14 AM

Thanks so much. Looking forward to it. I've actually got my hands full dealing with all that is available in EM anyway, but I like to stay ahead of the curve.


Posted in reply to SlutskyFan

11-15-2010 10:53 PM

Hi Slutsky,

To answer your query: the notation you come across in the kernel literature depends on the user group. From a notation perspective, x, y (or z) refer to vectors, whereas xi and xj refer to the i-th and j-th sample vectors, i.e. the rows of the data matrix X.

A kernel matrix is usually denoted by a capital K, whereas k is the kernel function (dot product). Hence, whereas K is a matrix, k(x, y) is only a scalar entry in the matrix K.

To answer your second question regarding dependence and independence: the assumption is that the data are iid (independent and identically distributed).

Perhaps this pseudocode will help you understand the notation:

X - matrix of size n x m (n samples and m features)

K - matrix of size n x n

for i = 1 to n
    for j = 1 to n
        K[i,j] = X[i,:]*X[j,:]   % a linear dot product of rows i and j
    endfor
endfor
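In Python/NumPy the same computation looks roughly like this (a minimal sketch; the data values are only placeholders):

import numpy as np

# X: n x m data matrix (n samples, m features) -- placeholder values for illustration
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
n = X.shape[0]

# K: n x n kernel matrix; each entry is a linear dot product of two rows of X
K = np.empty((n, n))
for i in range(n):
    for j in range(n):
        K[i, j] = np.dot(X[i, :], X[j, :])

# Equivalently, in one line: K = X @ X.T
print(K)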

A good book is 'An Introduction to Support Vector Machines' by Cristianini and Shawe-Taylor.



Posted in reply to DavidR_Hardoon

11-17-2010 11:07 AM

Thanks, this really helps. I think I'm getting a better picture. You said:

"where as K is a matrix k(x,y) will be only an entry scalar in matrix K."

I think I understand, but based on what you said, is the following interpretation correct?

#1: k(x, y) is a kernel function (which produces a scalar that becomes an entry in the matrix K).

If that is true, then does k(x, y) produce a dot product only between xi and xj (rows of the matrix X), or are they also dot products between x and y?

I'm thinking the entries in the kernel matrix are only dot products of xi and xj, given your pseudocode, and y is just a 'label'.

But I've also seen 'kernel functions' depicted in two different ways:

Gaussian kernel: k(x, y) = exp(-||x - y||^2 / sigma^2)

Gaussian kernel: k(xi, xj) = exp(-||xi - xj||^2 / sigma^2)
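Here is how I'm currently reading the two formulas: the same function applied to two feature vectors, e.g. two rows of X (a rough Python sketch of my interpretation; gaussian_kernel is just an illustrative name):

import numpy as np

sigma = 1.0

def gaussian_kernel(u, v):
    # k(u, v) = exp(-||u - v||^2 / sigma^2); both arguments are feature vectors
    diff = u - v
    return np.exp(-np.dot(diff, diff) / sigma ** 2)

# Two rows (samples) of a feature matrix X -- my guess is that no labels are involved
X = np.array([[1.0, 2.0, 3.0],
              [0.5, 1.5, 2.5]])
xi, xj = X[0], X[1]

# The value is the same whether we call the arguments (x, y) or (xi, xj)
print(gaussian_kernel(xi, xj))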

So I'm still confused about the notation: are the 'inputs' to the kernel function only elements of some matrix X, or can they also contain elements of Y?

Thanks.

"where as K is a matrix k(x,y) will be only an entry scalar in matrix K."

I think I understand, but based on what you said is my following interpretation correct?

#1 k(x,y) is a kernal function (which produces a scaler that becomes and entry in matrix K)

If that is true, then does k(x,y) produce a dot product only between xi and xj (elements of the matrix X) or are they dot products also between x and y?

I'm thinking the entries in the kernal matrix are only dot products of xi and xj given your pseudocode , and y is just a 'label'.

But, I've also seen 'kernal functions' depicted in 2 different ways:

gaussian kernal: k(x,y) = exp(-||x-y||^2 / sigma^2)

gaussian kernal: k(xi,xj) = exp(-||xi-xj||^2 / sigma^2)

So I'm still confused on the notation about what are the 'inputs' into the kernel function, are the elements only of some matrix X, or can they also contain elements of Y?

Thanks.


Posted in reply to SlutskyFan

02-16-2011 03:34 PM

SlutskyFan:

The Gaussian kernel that you mentioned fits into the scheme David set out as follows.

First, rewrite David's pseudo-code as follows:

X[1,.], ..., X[n,.] - n elements of a (Hilbert) space with inner product <.,.>

K - matrix of size n x n

for i = 1 to n
    for j = 1 to n
        K[i,j] = <X[i,.], X[j,.]>   % the inner product of the space, in place of the plain dot product
    endfor
endfor
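In Python terms, the generalisation just means passing in whichever inner product you like (build_kernel_matrix and inner are illustrative names, and np.dot stands in for the linear case):

import numpy as np

def build_kernel_matrix(points, inner):
    # K[i, j] = <points[i], points[j]> for an arbitrary inner product `inner`
    n = len(points)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = inner(points[i], points[j])
    return K

# With inner = np.dot and points = the rows of X, this reduces to David's linear kernel matrix
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(build_kernel_matrix(list(X), np.dot))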

To make this work for a sample of size n with m features, c_1, c_2, ..., c_n, apply the pseudocode to the result of transforming the feature vectors according to a function f, i.e. apply it to X[1,.] = f(c_1), X[2,.] = f(c_2), ..., X[n,.] = f(c_n).

Here is the function: let c be an m-dimensional vector. To c we assign an element of an infinite-dimensional space, a space of functions defined on m-dimensional vectors. f(c) is a function of another variable h, defined as:

f(c)(h) = exp( - (||c - h||^2)/(2*sigma^2) )

Here is the definition of the inner product (all technicalities aside):

if A and B are (sufficiently nice) functions of h in R^m:

<A, B> = integral over R^m of A(h)B(h) dh

Now, it is a long exercise to verify that:

<f(x), f(c)> = (pi*sigma^2)^(m/2) * exp( - (||x - c||^2)/(4*sigma^2) ), i.e. up to a constant factor depending only on sigma and m, the inner product of the transformed points is exactly a Gaussian kernel evaluated at x and c.
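For the one-dimensional case (m = 1) you can check this numerically with a small Python sketch along the following lines (f and inner are just illustrative helper names; sqrt(pi)*sigma is the constant factor mentioned above):

import numpy as np
from scipy.integrate import quad

sigma = 1.3

def f(c):
    # feature map: sends the point c to a function of h (a Gaussian bump centred at c)
    return lambda h: np.exp(-(c - h) ** 2 / (2 * sigma ** 2))

def inner(A, B):
    # L2 inner product <A, B> = integral of A(h)*B(h) dh over the real line
    value, _ = quad(lambda h: A(h) * B(h), -np.inf, np.inf)
    return value

x, c = 0.7, -1.1
lhs = inner(f(x), f(c))
rhs = np.sqrt(np.pi) * sigma * np.exp(-(x - c) ** 2 / (4 * sigma ** 2))
print(lhs, rhs)   # the two numbers agree up to numerical integration error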
