Quartz | Level 8

## Detecting outliers

Hello all,

I am trying to apply the code from @Rick_SAS's article "Detecting outliers in SAS: Part 3: Multivariate location and scatter".

My data has 7 numerical independent variables. For my data, the output of `print outIdx;` is a table with 1 row and 21 columns, in which each value is the observation number of an outlier (as I understand it; please correct me if I am wrong).

I do not understand the number 3 inside the brackets in this part of the code:

outIdx = loc(dist[3,]=0); /* RD > cutoff */
print outIdx;

I also do not understand the number 8 in `optn = j(8,1,.); /* default options for MCD */`.

I would appreciate any help in understanding these concepts.

``````
proc iml;
use mydata;
read all var {T1 T2 T3 T4 T5 T6 T7} into x;
close mydata;

/* classical estimates */
labl = {"T1" "T2" "T3" "T4" "T5" "T6" "T7"};
mean = mean(x);
cov = cov(x);
print mean[c=labl format=5.2], cov[r=labl c=labl format=5.2];

N = nrow(x);   /* 60 observations */
p = ncol(x);   /*  7 variables */

optn = j(8,1,.);          /* default options for MCD */
optn[1] = 0;              /* =1 if you want printed output */
optn[4] = floor(0.75*N);  /* h = 75% of obs */

call MCD(sc, est, dist, optn, x);
RobustLoc = est[1, ];     /* robust location */
RobustCov = est[3:2+p, ]; /* robust scatter matrix */
print RobustLoc[c=labl format=5.2], RobustCov[r=labl c=labl format=5.2];

outIdx = loc(dist[3,]=0); /* RD > cutoff */
print outIdx;
``````

1 ACCEPTED SOLUTION

SAS Super FREQ

## Re: Detecting outliers

You might find it easier to use PROC ROBUSTREG as suggested in the Outliers item in the list of Frequently Asked-for Statistics (FASTats) in the Important Links section of the Statistical Procedures Community page. Just add a random response variable. For example:

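``````
/* add a random response so that ROBUSTREG can compute leverage diagnostics */
data mydata;
set mydata;
y=ranuni(3);
run;

/* DATA= must name the data set created above */
proc robustreg data=mydata method=lts;
model y = t1-t7 / diagnostics leverage;
run;
``````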
2 REPLIES
Super User

## Re: Detecting outliers

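"I do not understand the number 3 inside the brackets in this part of the code:
outIdx = loc(dist[3,]=0); /* RD > cutoff */
print outIdx;"

The 3 refers to the third row of dist. The LOC function returns the column indices at which that row equals 0; per the code comment, those are the observations whose robust distance (RD) exceeds the cutoff.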
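
To see the row subscript and LOC in isolation, here is a minimal sketch. The 3 x 5 matrix is made up for illustration and only stands in for the dist matrix that CALL MCD returns; the values are not from any real data.

```sas
proc iml;
/* hypothetical stand-in for dist: 3 rows, one column per observation */
dist = {1.2 0.8 4.1 0.9  5.3,
        1.1 0.7 9.6 1.0 12.4,
        1   1   0   1    0  };

flags  = dist[3, ];        /* [3, ] selects the whole third row */
outIdx = loc(flags = 0);   /* positions in that row that equal 0 */
print flags, outIdx;       /* outIdx contains 3 and 5 */
quit;
```

Because each column corresponds to one observation, the positions returned by LOC are observation numbers, which matches the 1-row table described in the question.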

"I also do not understand the number 8 in optn = j(8,1,.); /* default options for MCD */.
I would appreciate any help in understanding these concepts."

The code creates an 8 x 1 matrix (8 rows and 1 column) whose values are all initially missing.
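
A small sketch of what that statement produces, reusing the two assignments from the original program. N = 60 is taken from the comment in the posted code; the COUNTMISS call is added here only to show how many elements are left missing, which the program's comment describes as the defaults.

```sas
proc iml;
N = 60;                      /* number of observations, per the posted code */

optn = j(8, 1, .);           /* 8 x 1 column vector, every element missing */
optn[1] = 0;                 /* element 1: printed output off */
optn[4] = floor(0.75*N);     /* element 4: h = 45 */

nMissing = countmiss(optn);  /* 6 elements are still missing */
print optn, nMissing;
quit;
```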
From The DO Loop