BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Denote Message was edited by: ljw5122
1 REPLY 1
Rick_SAS
SAS Super FREQ
I had a homework problem like this once...

You asked for an understanding of the contaminated normal distribution. The contaminated normal is often used in testing the robustness of statistics, and I think the most natural way to think about it is to think about sampling from the pdf.

Let's suppose you want to compare the behavior of the standard deviation versus the interquartile range in the presence of outliers. One way to do this is to simulate, say, 1000 univariate values, with roughly 950 being from N(0,1) and 50 being from N(0, 100). This is an example of a contaminated normal with 5% contamination. (It's also an example of a mixture distribution.) The PDF will look somewhat like a normal distribution, except that the tails will be fatter.

How would I generate a sample in IML? Well, I'd generate n from a binomial distribution with 1000 trials and probability p of being an outlier.
Then I'd sample the data from two distibutions:
x1 = j(1000-n, 1);
call randgen(x1, "normal", 0, 1); /** most of the data **/
x2 = j(n, 1);
call randgen(x2, "normal", 0, 100); /** outliers **/

The vector x1//x2 contains data sampled from the contaminated normal pdf.

To geneate the pdf (or cdf) directly is a simple one-liner that uses the PDF (or CDF) function in Base SAS.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

From The DO Loop
Want more? Visit our blog for more articles like these.
Discussion stats
  • 1 reply
  • 897 views
  • 0 likes
  • 2 in conversation