Solved
Contributor
Posts: 62

# Create normalize distances from euclidean distances

Hello,

I've created euclidean distances with proc distance:

proc distance data=have out=want method=euclid;

var interval(w_:);

id bank_id;

by _rate_ year;

run;

I want to transform the Euclidean distances into normalized ones (i.e., distances that vary between 0 and 1).

I've read on the website that I can add the norm option directly to proc distance to normalize. How does it work? My data are already weights, ready to ''enter'' proc distance, and I do not see how to include a norm option in proc distance. Perhaps I have to ''normalize'' my data in a previous table. My previous table has 11 main columns: one for an identification number and 10 columns (for 10 different industries) that are amounts. I sum each amount by industry, identification number and year. I've also created a total, from which I then create ten weights.

Alternatively, I've tried to create plots of ''normal'' variables, to normalize with proc standard, and so on. I have not succeeded this way.

How can I incorporate the norm option to normalize my distances? It seems that this is the simplest way to proceed.

Posts: 1,270

## Re: Create normalize distances from euclidean distances

Hi,

Just add /std=std to the VAR statement:

proc distance data=have out=want method=euclid;

var interval(w_: / std=std);

id bank_id;

by _rate_ year;

run;
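A side note on the 0-to-1 goal: STD=STD rescales the input variables, not the resulting distances, so the distances are still not bounded by 1. If the inputs are nonnegative weights that sum to 1 for each lender, the largest possible Euclidean distance between two lenders is sqrt(2), so dividing by sqrt(2) gives values between 0 and 1. The following is only a sketch under that assumption; the toy data set and its names are made up for illustration.

```sas
/* Hypothetical toy data: three lenders, weights sum to 1 in each row */
data toy;
   input bank_id $ w_ag w_min w_con;
   datalines;
A 1.0 0.0 0.0
B 0.0 1.0 0.0
C 0.5 0.5 0.0
;
run;

/* Raw Euclidean distances; SHAPE=SQUARE stores the full symmetric matrix */
proc distance data=toy out=dist method=euclid shape=square;
   var interval(w_:);
   id bank_id;
run;

/* Divide every distance by sqrt(2), the maximum distance between two weight vectors */
data dist_norm;
   set dist;
   array d {*} _numeric_;   /* here only the distance columns are numeric */
   do i = 1 to dim(d);
      d{i} = d{i} / sqrt(2);
   end;
   drop i;
run;
```

With BY variables such as _rate_ and year in the OUT= data set, the array should list the distance columns explicitly rather than use _numeric_, otherwise the BY variables would be rescaled too.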

Contributor
Posts: 62

## Re: Create normalize distances from euclidean distances

Hi,

I obtain this warning in the log:

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

I compared the two outputs (the first from my initial code and the second from your suggestion) and I obtain exactly the same distances.

Posts: 1,270

## Re: Create normalize distances from euclidean distances

Not sure. What does the W_PUBLIC variable contain? Is it an interval variable?

Contributor
Posts: 62

## Re: Create normalize distances from euclidean distances

I will show you what I've previously done with my data. I think it will be clearer.

1) I've a table that looks like that:

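| _rate_ | year | LenderID | Industrie_AG | Industrie_MIN | Industrie_CON | Industrie_TRAN | Industrie_WHOLE | Industrie_RE | Industrie_FIN | Industrie_SER | Industrie_PUBLIC | Industrie_Other |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0,1 | 1995 | 23 | 5000000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0,1 | 1995 | 45 | 0 | 78689690 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0,1 | 1995 | 3 | 0 | 0 | 0 | 0 | 7600000 | 0 | 0 | 0 | 0 | 0 |
| 0,1 | 1995 | 500 | 0 | 0 | 0 | 890655550 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0,1 | 1995 | 677 | 0 | 0 | 0 | 590000000 | 0 | 0 | 0 | 0 | 0 | 0 |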

So it's a table of loans (bilateral conventional loans: only one lender lends to each borrower). I have ten industries, eighteen years and eight sampling rates (for a subsequent sampling). The variable LenderID identifies each lender.

2) I've applied this code to sum

proc means data=pf15 noprint;

var Industrie_AG Industrie_MIN Industrie_CON Industrie_TRAN Industrie_WHOLE Industrie_RE Industrie_FIN Industrie_SER Industrie_PUBLIC Industrie_Other;

output out=pf15_1(drop=_type_ _freq_)

sum(Industrie_AG Industrie_MIN Industrie_CON Industrie_TRAN Industrie_WHOLE Industrie_RE Industrie_FIN Industrie_SER Industrie_PUBLIC Industrie_Other)=sum_AG sum_MIN sum_CON sum_TRAN sum_WHOLE sum_RE sum_FIN sum_SER sum_PUBLIC sum_Other;

by _rate_ year LenderID;

run;

So, I've added each amount by _rate_, year and LenderID. So, I have aggregate amounts by industry.

3) I've created a total (first code) and weights (second code).

proc sql;

create table pf15_2 as

select *, sum(sum_AG,sum_MIN,sum_CON,sum_TRAN, sum_WHOLE, sum_RE, sum_FIN, sum_SER, sum_PUBLIC, sum_other) as sum_tot

from pf15_1;

quit;

proc sql;

create table pf15_3 as

select distinct _rate_, year, LenderID, (sum_AG/sum_tot) as W_AG, (sum_MIN/sum_tot) as W_MIN, (sum_CON/sum_tot) as W_CON, (sum_TRAN/sum_tot) as W_TRAN,

(sum_WHOLE/sum_tot) as W_WHOLE, (sum_RE/sum_tot) as W_RE, (sum_FIN/sum_tot) as W_FIN, (sum_SER/sum_tot) as W_SER, (sum_PUBLIC/sum_tot) as W_PUBLIC,

(sum_Other/sum_tot) as W_Other

from pf15_2;

quit;

4) This brings us to the code I presented initially. I create a character id from the variable LenderID (first code) and then I use proc distance (second code).

/*step1 create a character id*/

data pf15_5;

set pf15_4;

bank_id="_"||put(LenderID, best8.);

run;

proc distance data=pf15_5 out=pf15_6 method=euclid;

var interval(w_:); /* w_: selects all variables whose names start with w_ */

id bank_id;

by _rate_ year;

run;

I've transformed my data to use the proc distance in this way. I must certainly normalize in a previous table, I guess.

Contributor
Posts: 62

## Re: Create normalize distances from euclidean distances

A little error...

data pf15_4;

set pf15_3;

bank_id="_"||put(LenderID, best8.);

run;

Posts: 1,270

## Re: Create normalize distances from euclidean distances

Thanks for presenting the problem in detail. Let us go back to the main question: feeding standardized variables to proc distance. Information on the W_PUBLIC variable (the one causing the warning) would help to understand why proc distance issues that warning. Proc stdize will generate the same warning message if W_PUBLIC cannot be standardized. I would suggest running proc means on W_PUBLIC to get the following:

proc means data=pf15_3 n nmiss mean std;

var W_PUBLIC;

run;

Contributor
Posts: 62

## Re: Create normalize distances from euclidean distances

These are precisely the warnings I obtain in the log. See the summary below.

proc distance data=pf15_5 out=pf15_6 method=euclid;

var interval (w_:/std=std); /* use w_: to represent the group of all variables starting with w_*/

id bank_id;

by _rate_ year;

run;

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

WARNING: Because some variables can not be standardized, PROC DISTANCE will not compute the distance matrix. Choose other standardization methods for those variables.

NOTE: The above message was for the following BY group: _RATE_=0.1 Year=2005

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

WARNING: Because some variables can not be standardized, PROC DISTANCE will not compute the distance matrix. Choose other standardization methods for those variables.

NOTE: The above message was for the following BY group: _RATE_=0.1 Year=2006

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

WARNING: Because some variables can not be standardized, PROC DISTANCE will not compute the distance matrix. Choose other standardization methods for those variables.

NOTE: The above message was for the following BY group: _RATE_=0.1 Year=2009

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

WARNING: Because some variables can not be standardized, PROC DISTANCE will not compute the distance matrix. Choose other standardization methods for those variables.

NOTE: The above message was for the following BY group: _RATE_=0.1 Year=2012

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

WARNING: Because some variables can not be standardized, PROC DISTANCE will not compute the distance matrix. Choose other standardization methods for those variables.

NOTE: The above message was for the following BY group: _RATE_=0.25 Year=2012

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

WARNING: Because some variables can not be standardized, PROC DISTANCE will not compute the distance matrix. Choose other standardization methods for those variables.

NOTE: The above message was for the following BY group: _RATE_=0.33333 Year=2005

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

WARNING: Because some variables can not be standardized, PROC DISTANCE will not compute the distance matrix. Choose other standardization methods for those variables.

NOTE: The above message was for the following BY group: _RATE_=0.33333 Year=2012

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

WARNING: Because some variables can not be standardized, PROC DISTANCE will not compute the distance matrix. Choose other standardization methods for those variables.

NOTE: The above message was for the following BY group: _RATE_=0.5 Year=2005

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

WARNING: Because some variables can not be standardized, PROC DISTANCE will not compute the distance matrix. Choose other standardization methods for those variables.

NOTE: The above message was for the following BY group: _RATE_=0.5 Year=2006

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

WARNING: Because some variables can not be standardized, PROC DISTANCE will not compute the distance matrix. Choose other standardization methods for those variables.

NOTE: The above message was for the following BY group: _RATE_=0.5 Year=2012

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

WARNING: Because some variables can not be standardized, PROC DISTANCE will not compute the distance matrix. Choose other standardization methods for those variables.

NOTE: The above message was for the following BY group: _RATE_=0.66667 Year=2012

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

WARNING: Because some variables can not be standardized, PROC DISTANCE will not compute the distance matrix. Choose other standardization methods for those variables.

NOTE: The above message was for the following BY group: _RATE_=0.75 Year=2012

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

WARNING: Because some variables can not be standardized, PROC DISTANCE will not compute the distance matrix. Choose other standardization methods for those variables.

NOTE: The above message was for the following BY group: _RATE_=0.9 Year=2012

WARNING: The scale estimator for variable W_PUBLIC is less than or equal to 0.  W_PUBLIC will not be standardized.

WARNING: Because some variables can not be standardized, PROC DISTANCE will not compute the distance matrix. Choose other standardization methods for those variables.

NOTE: The above message was for the following BY group: _RATE_=1 Year=2012

NOTE: OUT= data set is not created.

NOTE: PROCEDURE DISTANCE used (Total process time):

real time  0.69 seconds

cpu time  0.39 seconds

The variable _rate_ takes the values 0.1, 0.25, 0.33333, 0.50, 0.66667, 0.75, 0.9 and 1. It comes from sampling (_rate_ is the sampling rate).

So the problems arise from W_PUBLIC, for every sampling rate (_rate_ 0.1, 0.25, 0.33333, 0.50, 0.66667, 0.75, 0.9 and 1) but in different years:

_rate_=0.1 => 2005, 2006, 2009 and 2012

_rate_=0.25 => 2012

_rate_=0.33333 => 2005 and 2012

_rate_=0.50 => 2005, 2006 and 2012

_rate_=0.66667 => 2012

_rate_=0.75=> 2012

_rate_=0.90=> 2012

_rate_=1  => 2012

Because W_PUBLIC gives warnings for different sampling rates (_rate_) and not always the same years, I execute your code by _rate_  year:

proc means data=pf15_3 n nmiss mean std;

var W_PUBLIC;

by _rate_ year;

run;

The output is huge. See summary above.

The SAS System

The MEANS Procedure

Analysis Variable : W_PUBLIC

| _RATE_ | Year | N | N Miss | Mean | Std Dev |
|---|---|---|---|---|---|
| 0.1 | 1995 | 6 | 0 | 0.0081510 | 0.0167272 |
| 0.1 | 1996 | 6 | 0 | 0.0251548 | 0.0340587 |
| 0.1 | 1997 | 6 | 0 | 0.0197715 | 0.0209811 |
| 0.1 | 1998 | 6 | 0 | 0.0014156 | 0.0027333 |
| 0.1 | 1999 | 6 | 0 | 0.0095375 | 0.0183058 |
| 0.1 | 2000 | 6 | 0 | 0.0125907 | 0.0205180 |
| 0.1 | 2001 | 6 | 0 | 0.0016666 | 0.0031349 |
| 0.1 | 2002 | 6 | 0 | 0.0371616 | 0.0744343 |
| 0.1 | 2003 | 6 | 0 | 0.0173120 | 0.0424055 |
| 0.1 | 2004 | 6 | 0 | 0.0051048 | 0.0125040 |
| 0.1 | 2005 | 6 | 0 | 0 | 0 |
| 0.1 | 2006 | 6 | 0 | 0 | 0 |
| 0.1 | 2007 | 6 | 0 | 0.0151700 | 0.0300848 |
| 0.1 | 2008 | 6 | 0 | 0.0283977 | 0.0429333 |
| 0.1 | 2009 | 6 | 0 | 0 | 0 |
| 0.1 | 2010 | 6 | 0 | 0.0081268 | 0.0126904 |
| 0.1 | 2011 | 6 | 0 | 0.0082858 | 0.0135355 |
| 0.1 | 2012 | 6 | 0 | 0 | 0 |
| 0.25 | 1995 | 6 | 0 | 0.0147012 | 0.0184258 |
| 0.25 | 1996 | 6 | 0 | 0.0163410 | 0.0275292 |
| 0.25 | 1997 | 6 | 0 | 0.0133464 | 0.0154777 |
| 0.25 | 1998 | 6 | 0 | 0.0021127 | 0.0039143 |
| 0.25 | 1999 | 6 | 0 | 0.0049996 | 0.0062379 |
| 0.25 | 2000 | 6 | 0 | 0.0013248 | 0.0020659 |
| 0.25 | 2001 | 6 | 0 | 0.0074248 | 0.0100632 |
| 0.25 | 2002 | 6 | 0 | 0.0067791 | 0.0087785 |
| 0.25 | 2003 | 6 | 0 | 0.0017136 | 0.0039769 |
| 0.25 | 2004 | 6 | 0 | 0.0397329 | 0.0598700 |
| 0.25 | 2005 | 6 | 0 | 0.0219861 | 0.0538548 |
| 0.25 | 2006 | 6 | 0 | 0.0149599 | 0.0366442 |
| 0.25 | 2007 | 6 | 0 | 0.0071786 | 0.0098935 |
| 0.25 | 2008 | 6 | 0 | 0.0094042 | 0.0115324 |
| 0.25 | 2009 | 6 | 0 | 0.0533573 | 0.1149143 |
| 0.25 | 2010 | 6 | 0 | 0.0240404 | 0.0337330 |
| 0.25 | 2011 | 6 | 0 | 0.0027859 | 0.0037937 |
| 0.25 | 2012 | 6 | 0 | 0 | 0 |
| 0.33333 | 1995 | 6 | 0 | 0.0104859 | 0.0125634 |
| 0.33333 | 1996 | 6 | 0 | 0.0035480 | 0.0068107 |
| 0.33333 | 1997 | 6 | 0 | 0.0165512 | 0.0205497 |
| 0.33333 | 1998 | 6 | 0 | 0.0019953 | 0.0037752 |
| 0.33333 | 1999 | 6 | 0 | 0.0011410 | 0.0018540 |
| 0.33333 | 2000 | 6 | 0 | 0.0051117 | 0.0112255 |
| 0.33333 | 2001 | 6 | 0 | 0.0027246 | 0.0057535 |
| 0.33333 | 2002 | 6 | 0 | 0.0066935 | 0.0068624 |
| 0.33333 | 2003 | 6 | 0 | 0.000339390 | 0.000625208 |
| 0.33333 | 2004 | 6 | 0 | 0.0061084 | 0.0094970 |
| 0.33333 | 2005 | 6 | 0 | 0 | 0 |
| 0.33333 | 2006 | 6 | 0 | 0.0128083 | 0.0313739 |
| 0.33333 | 2007 | 6 | 0 | 0.0065704 | 0.0057434 |
| 0.33333 | 2008 | 6 | 0 | 0.0215321 | 0.0189579 |
| 0.33333 | 2009 | 6 | 0 | 0.0013898 | 0.0020662 |
| 0.33333 | 2010 | 6 | 0 | 0.0065392 | 0.0084608 |
| 0.33333 | 2011 | 6 | 0 | 0.0019434 | 0.0017686 |
| 0.33333 | 2012 | 6 | 0 | 0 | 0 |
| 0.5 | 1995 | 6 | 0 | 0.0406940 | 0.0492116 |
| 0.5 | 1996 | 6 | 0 | 0.0131005 | 0.0177264 |
| 0.5 | 1997 | 6 | 0 | 0.0508180 | 0.0715886 |
| 0.5 | 1998 | 6 | 0 | 0.0035499 | 0.0075979 |
| 0.5 | 1999 | 6 | 0 | 0.0048653 | 0.0075474 |
| 0.5 | 2000 | 6 | 0 | 0.0078813 | 0.0091853 |
| 0.5 | 2001 | 6 | 0 | 0.0266625 | 0.0485095 |
| 0.5 | 2002 | 6 | 0 | 0.0200520 | 0.0356918 |
| 0.5 | 2003 | 6 | 0 | 0.0046599 | 0.0088813 |
| 0.5 | 2004 | 6 | 0 | 0.0238068 | 0.0439375 |
| 0.5 | 2005 | 6 | 0 | 0 | 0 |
| 0.5 | 2006 | 6 | 0 | 0 | 0 |
| 0.5 | 2007 | 6 | 0 | 0.0124609 | 0.0092889 |
| 0.5 | 2008 | 6 | 0 | 0.0183643 | 0.0157922 |
| 0.5 | 2009 | 6 | 0 | 0.0235013 | 0.0542259 |
| 0.5 | 2010 | 6 | 0 | 0.0175879 | 0.0126084 |
| 0.5 | 2011 | 6 | 0 | 0.0050698 | 0.0061390 |
| 0.5 | 2012 | 6 | 0 | 0 | 0 |
| 0.66667 | 1995 | 6 | 0 | 0.0337743 | 0.0313972 |
| 0.66667 | 1996 | 6 | 0 | 0.0143465 | 0.0130083 |
| 0.66667 | 1997 | 6 | 0 | 0.0451269 | 0.0536985 |
| 0.66667 | 1998 | 6 | 0 | 0.0034471 | 0.0042790 |
| 0.66667 | 1999 | 6 | 0 | 0.0055626 | 0.0057445 |
| 0.66667 | 2000 | 6 | 0 | 0.0075782 | 0.0088977 |
| 0.66667 | 2001 | 6 | 0 | 0.0059808 | 0.0097095 |
| 0.66667 | 2002 | 6 | 0 | 0.0163921 | 0.0257660 |
| 0.66667 | 2003 | 6 | 0 | 0.0054184 | 0.0070632 |
| 0.66667 | 2004 | 6 | 0 | 0.0069423 | 0.0081281 |
| 0.66667 | 2005 | 6 | 0 | 0.0101080 | 0.0247594 |
| 0.66667 | 2006 | 6 | 0 | 0.0102748 | 0.0126101 |
| 0.66667 | 2007 | 6 | 0 | 0.0120285 | 0.0112328 |
| 0.66667 | 2008 | 6 | 0 | 0.0182268 | 0.0079895 |
| 0.66667 | 2009 | 6 | 0 | 0.0031703 | 0.0021238 |
| 0.66667 | 2010 | 6 | 0 | 0.0183829 | 0.0180565 |
| 0.66667 | 2011 | 6 | 0 | 0.0056804 | 0.0049754 |
| 0.66667 | 2012 | 6 | 0 | 0 | 0 |
| 0.75 | 1995 | 6 | 0 | 0.0350459 | 0.0391202 |
| 0.75 | 1996 | 6 | 0 | 0.0183494 | 0.0124532 |
| 0.75 | 1997 | 6 | 0 | 0.0375901 | 0.0455889 |
| 0.75 | 1998 | 6 | 0 | 0.0030824 | 0.0054692 |
| 0.75 | 1999 | 6 | 0 | 0.0059878 | 0.0058065 |
| 0.75 | 2000 | 6 | 0 | 0.0056177 | 0.0063520 |
| 0.75 | 2001 | 6 | 0 | 0.0243148 | 0.0283375 |
| 0.75 | 2002 | 6 | 0 | 0.0049738 | 0.0062295 |
| 0.75 | 2003 | 6 | 0 | 0.0052701 | 0.0057973 |
| 0.75 | 2004 | 6 | 0 | 0.0186519 | 0.0248487 |
| 0.75 | 2005 | 6 | 0 | 0.0097732 | 0.0239392 |
| 0.75 | 2006 | 6 | 0 | 0.0105775 | 0.0131181 |
| 0.75 | 2007 | 6 | 0 | 0.0135003 | 0.0143191 |
| 0.75 | 2008 | 6 | 0 | 0.0194731 | 0.0191022 |
| 0.75 | 2009 | 6 | 0 | 0.0206636 | 0.0430237 |
| 0.75 | 2010 | 6 | 0 | 0.0166414 | 0.0132600 |
| 0.75 | 2011 | 6 | 0 | 0.0039618 | 0.0050535 |
| 0.75 | 2012 | 6 | 0 | 0 | 0 |
| 0.9 | 1995 | 6 | 0 | 0.0296786 | 0.0270912 |
| 0.9 | 1996 | 6 | 0 | 0.0161963 | 0.0130608 |
| 0.9 | 1997 | 6 | 0 | 0.0356798 | 0.0405470 |
| 0.9 | 1998 | 6 | 0 | 0.0037732 | 0.0052966 |
| 0.9 | 1999 | 6 | 0 | 0.0056002 | 0.0049333 |
| 0.9 | 2000 | 6 | 0 | 0.0065381 | 0.0074156 |
| 0.9 | 2001 | 6 | 0 | 0.0185660 | 0.0195838 |
| 0.9 | 2002 | 6 | 0 | 0.0118602 | 0.0160331 |
| 0.9 | 2003 | 6 | 0 | 0.0083679 | 0.0093528 |
| 0.9 | 2004 | 6 | 0 | 0.0160113 | 0.0187704 |
| 0.9 | 2005 | 6 | 0 | 0.0096588 | 0.0236592 |
| 0.9 | 2006 | 6 | 0 | 0.0089604 | 0.0110970 |
| 0.9 | 2007 | 6 | 0 | 0.0094993 | 0.0037775 |
| 0.9 | 2008 | 6 | 0 | 0.0182513 | 0.0128379 |
| 0.9 | 2009 | 6 | 0 | 0.0186634 | 0.0346475 |
| 0.9 | 2010 | 6 | 0 | 0.0152428 | 0.0102962 |
| 0.9 | 2011 | 6 | 0 | 0.0041323 | 0.0036972 |
| 0.9 | 2012 | 6 | 0 | 0 | 0 |
| 1 | 1995 | 6 | 0 | 0.0298176 | 0.0260602 |
| 1 | 1996 | 6 | 0 | 0.0162599 | 0.0115162 |
| 1 | 1997 | 6 | 0 | 0.0342501 | 0.0369118 |
| 1 | 1998 | 6 | 0 | 0.0036560 | 0.0051358 |
| 1 | 1999 | 6 | 0 | 0.0055724 | 0.0049559 |
| 1 | 2000 | 6 | 0 | 0.0065372 | 0.0075367 |
| 1 | 2001 | 6 | 0 | 0.0178894 | 0.0189897 |
| 1 | 2002 | 6 | 0 | 0.0120096 | 0.0145428 |
| 1 | 2003 | 6 | 0 | 0.0079442 | 0.0087244 |
| 1 | 2004 | 6 | 0 | 0.0147892 | 0.0175250 |
| 1 | 2005 | 6 | 0 | 0.0095688 | 0.0234388 |
| 1 | 2006 | 6 | 0 | 0.0089033 | 0.0109716 |
| 1 | 2007 | 6 | 0 | 0.0128167 | 0.0115824 |
| 1 | 2008 | 6 | 0 | 0.0174303 | 0.0124328 |
| 1 | 2009 | 6 | 0 | 0.0190555 | 0.0356186 |
| 1 | 2010 | 6 | 0 | 0.0150235 | 0.0102565 |
| 1 | 2011 | 6 | 0 | 0.0042406 | 0.0032645 |
| 1 | 2012 | 6 | 0 | 0 | 0 |

Whenever I obtain a warning message in the log, it is because the mean and standard deviation of W_PUBLIC both equal 0 in that BY group. In some years, depending on the sampling rate, I apparently do not have a single loan in the public industry.

It's obvious now why I had multiple warning messages. Can we circumvent this problem?

Solution
‎08-04-2014 01:16 PM
Posts: 1,270

## Re: Create normalize distances from euclidean distances

I would say just exclude those variables from the analysis that have mean and standard deviation equal to 0
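Because W_PUBLIC is constant only in some BY groups, removing it from the VAR list would drop it everywhere. One possible alternative, which is a swapped-in technique rather than a built-in PROC DISTANCE feature, is to standardize by hand before calling PROC DISTANCE and to set a variable to 0 in any BY group where its standard deviation is 0, since a constant variable contributes nothing to a Euclidean distance anyway. A minimal sketch with made-up toy data (the data set and variable names below are hypothetical):

```sas
/* Hypothetical toy data: w_public is constant (all 0) in year 2012 */
data toy;
   input year bank_id $ w_ag w_min w_public;
   datalines;
2011 A 0.5 0.4 0.1
2011 B 0.2 0.4 0.4
2011 C 0.7 0.3 0.0
2012 A 0.5 0.5 0.0
2012 B 0.2 0.8 0.0
2012 C 0.9 0.1 0.0
;
run;

/* Mean and standard deviation of each weight within each BY group */
proc means data=toy noprint;
   by year;
   var w_ag w_min w_public;
   output out=stats(drop=_type_ _freq_)
      mean=m_ag m_min m_public std=s_ag s_min s_public;
run;

/* Standardize manually; a variable with zero spread is set to 0 in that group */
data toy_std;
   merge toy stats;
   by year;
   array w {*} w_ag w_min w_public;
   array m {*} m_ag m_min m_public;
   array s {*} s_ag s_min s_public;
   do i = 1 to dim(w);
      if s{i} > 0 then w{i} = (w{i} - m{i}) / s{i};
      else w{i} = 0;
   end;
   drop i m_: s_:;
run;

/* No STD= option here: the inputs are already standardized */
proc distance data=toy_std out=dist method=euclid;
   by year;
   var interval(w_:);
   id bank_id;
run;
```

In the 2012 group the distances are then based on w_ag and w_min only, which is the same as excluding w_public for that group.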

Contributor
Posts: 62

## Re: Create normalize distances from euclidean distances

Thank you so much for your time.

And lastly, where should I use the UNDEF option in proc distance?

(I ask because it's sometimes surprising how much time it takes me to correctly include a simple option in existing code that I wrote. Ah, the learning curve in coding.)

Posts: 1,270

## Re: Create normalize distances from euclidean distances

In the PROC DISTANCE statement options.
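A minimal sketch of the placement, using made-up toy data so the step runs on its own (the data set and variable names are hypothetical). As far as the documentation describes it, UNDEF= only supplies a replacement value for distances that cannot be computed, so it should not be expected to change how variables are standardized.

```sas
/* Hypothetical toy data, only to make the step runnable on its own */
data toy;
   input bank_id $ w_ag w_min;
   datalines;
A 0.7 0.3
B 0.2 0.8
C 0.5 0.5
;
run;

/* UNDEF= goes on the PROC DISTANCE statement, next to DATA=, OUT= and METHOD= */
proc distance data=toy out=dist method=euclid undef=0;
   var interval(w_:);
   id bank_id;
run;
```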

Contributor
Posts: 62

Thank you!

Contributor
Posts: 62

## Re: Create normalize distances from euclidean distances

Sorry again.

I've added the UNDEF option and I obtain the same warnings in the log. Maybe it needs a specific number in the UNDEF option to work?

Posts: 1,270

## Re: Create normalize distances from euclidean distances

Why are you using the UNDEF option?

Contributor
Posts: 62

## Re: Create normalize distances from euclidean distances

I thought that this option would resolve my problem because the description of the UNDEF option is: ''specifies the numeric constant used to replace undefined distances''.

My data are not missing, so the options NOMISS, REPLACE or REPONLY are not relevant.

SAS/STAT(R) 9.22 User's Guide

I was looking in the proc distance  <options> to find a solution to exclude those variables with mean and variance equal to 0.

🔒 This topic is solved and locked.