While working with the Standardize Data task in SAS Studio 3.5, I've come across 'Euclidean length' as a standardisation method. I understand how z-scores are obtained by subtracting the mean from each observation and dividing the result by the standard deviation. What is Euclidean length and how does it help in standardisation?

 

For example, if we are using the sashelp.baseball dataset, what would using the 'Euclidean length' method of standardisation for the 'nhits' (number of hits) variable do for us?

Accepted Solutions
Rick_SAS
SAS Super FREQ

A SAS Studio task generates SAS code, usually in the form of a call to a SAS procedure.  If you click on the Code tab, you can see the program.  In this case, the call is to PROC STDIZE and the METHOD=EUCLEN option is specified.

 

So the general way to answer the question "What does a task do?" is to

1. Go to the SAS/STAT User's Guide documentation.

2. Scroll down and click on the doc for the relevant procedure.

 

For this question here is a link to the formulas that are applied for each method. For the EUCLEN option, the location is 0 and the scale is the Euclidean length of the variable:

scale = sqrt(ssq(x)) = sqrt( x1**2 + x2**2 + ... + xN**2 ),

where N is the number of observations in the sample.

The new variable is therefore 

X_New[i] = (X[i] - 0) / scale

 

The transformation has the property that the new variable has unit Euclidean length. Geometrically, you can think of the transformation as a projection onto the surface of the unit N-dimensional sphere. This transformation might be useful for spherically symmetric problems in which the angle that the observation makes with the origin is important.
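
To make the formula concrete, here is a small numeric sketch in Python rather than SAS. It is only an illustration of the arithmetic described above, not the code that PROC STDIZE runs; the function name `euclen_standardize` and the three data values are made up for the example and are not taken from sashelp.baseball.

```python
import math

def euclen_standardize(values):
    """Divide each value by the Euclidean length of the whole column.

    Hypothetical helper for illustration: location = 0,
    scale = sqrt(x1**2 + x2**2 + ... + xN**2).
    """
    scale = math.sqrt(sum(v * v for v in values))
    if scale == 0:
        raise ValueError("Euclidean length is zero; cannot standardize")
    return [(v - 0) / scale for v in values]

# Made-up values chosen so the length is a round number: sqrt(9 + 16 + 144) = 13
hits = [3.0, 4.0, 12.0]
scaled = euclen_standardize(hits)

# The rescaled column should have Euclidean length 1 (up to rounding)
new_length = math.sqrt(sum(v * v for v in scaled))
print(scaled)
print(new_length)
```

Each value is simply divided by 13 here, so the relative sizes of the observations are unchanged; only the overall scale of the column is fixed so that its length is 1.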

3 REPLIES
DataScientist
Quartz | Level 8

Hi @Rick_SAS, thanks very much for the detailed explanation. I will read through the documentation for the PROC STDIZE procedure for a better understanding.

 

Am I correct in assuming then that a transformation using the Euclidean Length would only be used for scientific / mathematical data and cannot be used in domains like marketing? If this is incorrect would there be an example from the marketing / business domain that you can point me to in which this transformation is used to analyse data and generate insight?

Rick_SAS
SAS Super FREQ

I am not familiar with marketing, so I can't answer your question. However, I will say that METHOD=EUCLEN is more geeky/scientific than the more intuitive standard deviation.

 

It's not that strange, though. If your data are centered, then the formula for the standard deviation is closely related to the Euclidean length: the Euclidean length of the centered variable is sqrt(N-1) times the sample standard deviation.
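
The sqrt(N-1) relationship can be checked numerically. The following is a minimal Python sketch (again an illustration, not SAS code), assuming the sample standard deviation with the N-1 divisor; the data values are made up.

```python
import math
import statistics

# Made-up sample; any non-constant data would do
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(x)

# Center the data by subtracting the mean
mean = statistics.mean(x)
centered = [v - mean for v in x]

# Euclidean length of the centered variable
euclidean_length = math.sqrt(sum(v * v for v in centered))

# Sample standard deviation (statistics.stdev uses the N-1 divisor)
sample_sd = statistics.stdev(x)

ratio = euclidean_length / sample_sd
print(ratio, math.sqrt(n - 1))   # the two numbers should agree up to rounding
```

So for centered data, scaling by the Euclidean length and scaling by the standard deviation differ only by the constant factor sqrt(N-1).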
