Hello all,
I am trying to apply the code from @Rick_SAS's article "Detecting outliers in SAS: Part 3: Multivariate location and scatter".
My data has 7 numeric independent variables. For my data, the output of "print outIdx;" is a table with 1 row and 21 columns, in which each value is the observation number of an outlier (as I understand it; please correct me if I am wrong!).
I do not understand the number 3 inside the brackets in this part of the code:
outIdx = loc(dist[3,]=0); /* RD > cutoff */
print outIdx;
and I do not understand the number 8 in "optn = j(8,1,.); /* default options for MCD */".
I would appreciate any help in understanding these concepts.
proc iml;
use mydata;
read all var{ T1 T2 T3 T4 T5 T6 T7 } into x ;
/* classical estimates */
labl = {"T1" "T2" "T3" "T4" "T5" "T6" "T7" };
mean = mean(x);
cov = cov(x);
print mean[c=labl format=5.2], cov[r=labl c=labl format=5.2];
N = nrow(x); /* 60 observations */
p = ncol(x); /* 7 variables */
optn = j(8,1,.); /* default options for MCD */
optn[1] = 0; /* =1 if you want printed output */
optn[4]= floor(0.75*N); /* h = 75% of obs */
call MCD(sc, est, dist, optn, x);
RobustLoc = est[1, ]; /* robust location */
RobustCov = est[3:2+p, ]; /* robust scatter matrix */
print RobustLoc[c=labl format=5.2], RobustCov[r=labl c=labl format=5.2];
outIdx = loc(dist[3,]=0); /* RD > cutoff */
print outIdx;
You might find it easier to use PROC ROBUSTREG as suggested in the Outliers item in the list of Frequently Asked-for Statistics (FASTats) in the Important Links section of the Statistical Procedures Community page. Just add a random response variable. For example:
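/* add a random response so that the regression procedure can run */
data mydata;
set mydata;
y=ranuni(3);
run;
/* DIAGNOSTICS and LEVERAGE request outlier and leverage-point diagnostics */
proc robustreg data=mydata method=lts;
model y = t1-t7 / diagnostics leverage;
run;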
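On the original question about the 8 and the 3: both are ordinary SAS/IML matrix syntax rather than anything specific to outlier detection. Here is a minimal sketch, using a made-up 3 x 5 matrix named d as a stand-in for the real dist output (the name and the values are purely illustrative):

```sas
proc iml;
/* j(nrow, ncol, value): an 8 x 1 column vector filled with missing values */
optn = j(8, 1, .);
optn[1] = 0;               /* overwrite element 1; the others stay missing */

/* hypothetical stand-in for dist: 3 rows, one column per observation */
/* the third row holds 0/1 flags */
d = {1.2 0.8 5.1 0.9 6.3,
     1.1 0.7 9.4 1.0 8.8,
     1   1   0   1   0  };

flags  = d[3, ];           /* row 3, all columns */
outIdx = loc(flags = 0);   /* column positions where the flag equals 0 */
print optn, outIdx;        /* outIdx holds the positions 3 and 5 */
quit;
```

So the 8 is just the number of rows in the options vector passed to MCD, where a missing value asks for the default, and dist[3,] selects the third row of dist. Going by the comment in the posted code, that row is 0 for observations whose robust distance exceeds the cutoff, so the 21 values printed for outIdx would be the observation numbers flagged as outliers. The meaning of each element of the options vector is listed in the SAS/IML documentation for the MCD call.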