Euclidean length option for Standardization method...

08-11-2016 01:09 PM

While working with the Standardize Data task in SAS Studio 3.5, I've come across 'Euclidean length' as a standardisation method. I understand how z-scores are obtained by subtracting the mean from each observation and dividing the result by the standard deviation. What is Euclidean distance and how does it help in standardisation?

For example, if we are using the sashelp.baseball dataset, what would using the 'Euclidean length' method of standardisation for the 'nhits' (number of hits) variable do for us?

Accepted Solutions

Solution

08-12-2016
10:38 AM

08-12-2016 09:54 AM

A SAS Studio task generates SAS code, usually in the form of a call to a SAS procedure. If you click on the **Code** tab, you can see the program. In this case, the call is to PROC STDIZE and the METHOD=EUCLEN option is specified.

So the general way to answer the question "What does a task do?" is to

1. Go to the SAS/STAT User's Guide documentation.

2. Scroll down and click on the doc for the relevant procedure.

For this question, the PROC STDIZE documentation lists the formulas that are applied for each method. For the EUCLEN option, the location is 0 and the scale is the Euclidean length of the variable:

scale = sqrt(ssq(x)) = sqrt( x1**2 + x2**2 + ... + xN**2 ),

where N is the number of observations in the sample.

The new variable is therefore

X_New[i] = (X[i] - 0) / scale

The transformation has the property that the new variable has unit Euclidean length. Geometrically, you can think of the transformation as a projection of the data vector onto the surface of the unit sphere in N-dimensional space. This transformation might be useful for spherically symmetric problems in which the direction of the vector from the origin, rather than its magnitude, is important.
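To make the arithmetic concrete, here is a minimal sketch in plain Python (not SAS, and not the code that the task generates). The function name `euclen_standardize` and the three data values are made up for illustration; they are not taken from sashelp.baseball.

```python
import math

def euclen_standardize(values):
    """Scale a column by its Euclidean length (location = 0)."""
    # scale = sqrt(x1**2 + x2**2 + ... + xN**2)
    scale = math.sqrt(sum(v * v for v in values))
    if scale == 0:
        raise ValueError("Euclidean length is zero; cannot scale")
    # X_New[i] = (X[i] - 0) / scale
    return [v / scale for v in values]

# Hypothetical values, chosen so that the length is a round number
hits = [3.0, 4.0, 12.0]          # sqrt(9 + 16 + 144) = 13
scaled = euclen_standardize(hits)

# The rescaled column should have Euclidean length 1 (up to rounding)
new_length = math.sqrt(sum(v * v for v in scaled))
print(scaled)
print(new_length)
```

Each value is simply divided by 13 here, so the relative sizes of the observations are unchanged; only the overall scale of the column changes.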

All Replies

08-12-2016 10:19 AM

Hi @Rick_SAS, thanks very much for the detailed explanation. I will read through the documentation for the PROC STDIZE procedure for a better understanding.

Am I correct in assuming then that a transformation using the Euclidean Length would only be used for scientific / mathematical data and cannot be used in domains like marketing? If this is incorrect would there be an example from the marketing / business domain that you can point me to in which this transformation is used to analyse data and generate insight?

08-12-2016 10:30 AM

I am not familiar with marketing, so I can't answer your question. However, I will say that the METHOD=EUCLEN is more geeky/scientific than the more intuitive standard deviation.

It's not that strange, though. If your data are centered, then the formula for the standard deviation is closely related to the Euclidean length. For centered data the sample standard deviation is sqrt( ssq(x) / (N-1) ), so the Euclidean length is sqrt(N-1) times the standard deviation.
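The relationship can be checked numerically. The following is a small sketch in plain Python with made-up data, assuming the sample standard deviation (N-1 denominator), which is what `statistics.stdev` computes.

```python
import math
import statistics

# Hypothetical sample; any data with at least two distinct values works
x = [2.0, 4.0, 4.0, 5.0, 10.0]

# Center the data, then take the Euclidean length of the centered vector
mean = statistics.mean(x)
centered = [v - mean for v in x]
length = math.sqrt(sum(v * v for v in centered))

# Sample standard deviation (N-1 denominator)
sd = statistics.stdev(x)

# For centered data, length / sd should equal sqrt(N - 1)
ratio = length / sd
print(ratio, math.sqrt(len(x) - 1))
```

So for centered data, scaling by the Euclidean length and scaling by the standard deviation differ only by the constant factor sqrt(N-1).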