fatemeh
Quartz | Level 8

Hello all, 

I am trying to apply the code from @Rick_SAS's article "Detecting outliers in SAS: Part 3: Multivariate location and scatter".

My data has 7 numeric independent variables. For my data, the output of "print outIdx;" is a table with 1 row and 21 columns, in which each value is the observation number of an outlier (as I understand it; please correct me if I am wrong).

I do not understand the number 3 inside the brackets in this part of the code:

 

 outIdx = loc(dist[3,]=0); /* RD > cutoff */
print outIdx;

 

and I do not understand the number 8 inside "optn = j(8,1,.); /* default options for MCD */".

I would appreciate any help understanding these concepts.

 

proc iml;
use mydata;
read all var{ T1  T2  T3  T4  T5  T6  T7  } into x ;
/* classical estimates */
labl = {"T1"  "T2"  "T3"  "T4"  "T5"  "T6"  "T7" };
mean = mean(x);
cov = cov(x);
print mean[c=labl format=5.2], cov[r=labl c=labl format=5.2];

N = nrow(x);   /* 60 observations */
p = ncol(x);   /*  7 variables */
 
optn = j(8,1,.); /* default options for MCD */
optn[1] = 0;     /* =1 if you want printed output */
optn[4]= floor(0.75*N); /* h = 75% of obs */
 
call MCD(sc, est, dist, optn, x);
RobustLoc = est[1, ];     /* robust location */
RobustCov = est[3:2+p, ]; /* robust scatter matrix */
print RobustLoc[c=labl format=5.2], RobustCov[r=labl c=labl format=5.2];

outIdx = loc(dist[3,]=0); /* RD > cutoff */
print outIdx;

 

Accepted Solutions
StatDave
SAS Super FREQ

You might find it easier to use PROC ROBUSTREG, as suggested in the Outliers item in the list of Frequently Asked-for Statistics (FASTats) in the Important Links section of the Statistical Procedures Community page. PROC ROBUSTREG requires a response variable in the MODEL statement, so just add a random one. For example:

data mydata;
   set mydata;
   y=ranuni(3);
   run;
proc robustreg data=mydata method=lts;
   model y = t1-t7 / diagnostics leverage;
   run;

Ksharp
Super User
"I do not understand the number 3 inside bracket at this part of the code :
outIdx = loc(dist[3,]=0); /* RD > cutoff */
print outIdx;
"
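The 3 stands for the third row: dist[3,] selects the third row of dist across all columns. The LOC function then returns the positions (column indices) at which that row equals 0. According to the comment in the code, those are the observations whose robust distance (RD) exceeds the cutoff.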
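
Here is a small sketch of that indexing on a toy matrix. The values are made up for illustration and are not output from CALL MCD; the matrix only imitates the shape of dist, with the third row holding 0/1 flags.

```sas
proc iml;
/* toy 3 x 5 matrix standing in for dist (made-up values, not real MCD output) */
dist = {1.2 0.8 4.9 1.1 5.3,
        1.3 0.9 9.7 1.0 8.8,
        1   1   0   1   0  };

w = dist[3, ];          /* third row, all columns */
outIdx = loc(w = 0);    /* positions where the third row equals 0 */
print w, outIdx;        /* outIdx contains 3 and 5 */
quit;
```

In this toy case the third row is 0 in columns 3 and 5, so outIdx is a 1 x 2 row vector. With your data it has 21 columns because 21 observations have a 0 in that row.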


"
and I do not understand the number 8 inside "optn = j(8,1,.); /* default options for MCD */".
Appreciate you all to help me understand these concepts.
"
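The code creates an 8 x 1 matrix (8 rows and 1 column) whose elements are all initialized to missing values. The arguments of the J function are the number of rows, the number of columns, and the fill value.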
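
A minimal sketch of what that statement produces, assuming nothing beyond the J, NROW, and NCOL functions. The name dims is only an illustrative helper, not part of the original program.

```sas
proc iml;
optn = j(8, 1, .);                 /* 8 rows, 1 column, every element missing */
optn[1] = 0;                       /* overwrite one element, as in the original code */
dims = nrow(optn) || ncol(optn);   /* hypothetical helper: row vector {8 1} */
print optn, dims;
quit;
```

As the comment in the original code indicates, the elements left as missing ask CALL MCD to use its default options; only the elements you assign explicitly, such as optn[1] and optn[4], override those defaults.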
