Dicarlis
Fluorite | Level 6

My question is whether I should consider both factors of my experiment when removing outliers, or only the main factor. This is the model I was thinking of using:

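/* Fit the model and save raw and studentized residuals in data set DOIS */
proc glm;
class BLOC factor1 factor2;
model GMDadap = BLOC factor1 factor1*factor2;
output out=dois residual=x_res student=x_stu;
run;

/* List the output data set, including the residuals */
proc print data=dois;
run;

/* Count and mean of the response for each level of factor1 */
proc means data=dois n mean;
class factor1;
var gmdadap;
run;

/* Normality tests and plots for the studentized residuals */
proc univariate normal plot data=dois;
histogram x_stu / normal;
var x_stu;
run;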

5 REPLIES
ballardw
Super User

Outliers for what variable(s)?

Remove for which step?

What constitutes an "outlier", as in rule(s), for each variable?

Dicarlis
Fluorite | Level 6
Hello, thank you for your response. In our case, the variables relate to the performance of confined cattle (for example, average weight gain over 96 days). We convert the residuals to studentized residuals and treat any observation with a studentized residual greater than 3 or smaller than -3 as an outlier. However, we only apply this procedure when the residuals fail the Shapiro-Wilk normality test, which we also run on the studentized residuals.
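
For readers following along, the rule described above could be sketched roughly as follows. This assumes the data set DOIS and the variable x_stu from the original post. The 0.05 significance level and the names norm_tests, sw_p, dois_flagged and outlier are illustrative assumptions, not part of the thread. The ODS table TestsForNormality and its Test and pValue columns are what PROC UNIVARIATE produces in the SAS releases I am familiar with; "ods trace on" will confirm the names on your installation.

```sas
/* Capture the normality tests for the studentized residuals */
ods output TestsForNormality=norm_tests;
proc univariate data=dois normal;
  var x_stu;
run;

/* Store the Shapiro-Wilk p-value in a macro variable (sw_p is a made-up name) */
data _null_;
  set norm_tests;
  where Test = 'Shapiro-Wilk';
  call symputx('sw_p', pValue);
run;

/* Flag |studentized residual| > 3, but only when normality is rejected */
/* 0.05 is an assumed significance level */
data dois_flagged;
  set dois;
  outlier = (&sw_p < 0.05 and abs(x_stu) > 3);
run;
```

Flagging rather than deleting keeps the original observations available, so the flagged animals can be inspected before anything is dropped.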
PaigeMiller
Diamond | Level 26

This is a reasonable way to approach outlier detection. Of course, there are plenty of other methods, including methods if the data is not normally distributed (such as box plot outliers), and even multivariate outlier detection. And possibly dozens of other methods.

--
Paige Miller
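
As one possible illustration of the box plot alternative mentioned above, the usual 1.5 x IQR fence could be applied to the raw residuals. This is only a sketch: it assumes the data set DOIS and the variable x_res from the original post, and the names res_q, box_flagged, iqr and box_outlier are hypothetical. Applying the fence to residuals rather than to the raw response is also a choice made here, not something stated in the thread.

```sas
/* Quartiles of the raw residuals */
proc means data=dois noprint;
  var x_res;
  output out=res_q q1=q1 q3=q3;
run;

/* Attach the quartiles to every row and apply the 1.5 x IQR rule */
data box_flagged;
  if _n_ = 1 then set res_q(keep=q1 q3);
  set dois;
  iqr = q3 - q1;
  /* Guard against missing residuals, which SAS sorts below any number */
  box_outlier = (not missing(x_res)) and
                (x_res < q1 - 1.5*iqr or x_res > q3 + 1.5*iqr);
run;
```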
Dicarlis
Fluorite | Level 6
Hello, thanks for the answers. This question came up because some colleagues, when working with experiments in a factorial arrangement, consider only the main factor when removing outliers. I believe instead that the interaction between the factors, which defines our treatments, must be considered. Is this line of thinking right?
PaigeMiller
Diamond | Level 26

There's no universally agreed upon method of detecting outliers. I think if you are going to fit a model with two factors, the outliers in Y ought to be detected via residuals, which means to me that all terms in the model should be used.

--
Paige Miller
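
To see how much the choice matters for a given data set, one could compare the observations flagged under the full model with those flagged under a model that has only the main factor. The sketch below assumes an input data set called mydata (a hypothetical name, since the original post does not name it) and reuses the cutoff of 3 from the thread. The names res_full, res_both, stu_full, stu_main, flag_full and flag_main are made up for illustration. The full model is written with both main effects and the interaction; it spans the same treatment cells as the model in the original post, so the residuals should agree.

```sas
/* Full model: block plus both factors and their interaction */
proc glm data=mydata;
  class BLOC factor1 factor2;
  model GMDadap = BLOC factor1 factor2 factor1*factor2;
  output out=res_full student=stu_full;
run;
quit;

/* Reduced model: block plus the main factor only */
proc glm data=res_full;
  class BLOC factor1;
  model GMDadap = BLOC factor1;
  output out=res_both student=stu_main;
run;
quit;

/* Flag each observation under both models */
data compare;
  set res_both;
  flag_full = abs(stu_full) > 3;
  flag_main = abs(stu_main) > 3;
run;

/* Cross-tabulate the two sets of flags */
proc freq data=compare;
  tables flag_full*flag_main;
run;
```

If factor2 or the interaction has a real effect, the reduced model pushes that effect into its residuals, so observations can be flagged simply for belonging to a treatment with a different mean.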
