
Outliers in skewed data

Occasional Contributor
Posts: 15


Hi All,

 

Are there ways to find outliers in skewed (left or right) data? Does transforming the data (say, a log transformation that makes the data closer to normal) help in finding outliers?
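To make the question concrete, here is a minimal sketch in Python (not SAS) of the comparison I have in mind: apply the usual 1.5 × IQR rule once on the raw values and once on the log scale. The function name `tukey_outliers` and the sample values are made up for illustration, and the log transform assumes strictly positive data.

```python
import math
import statistics

def tukey_outliers(values, transform=None, k=1.5):
    """Return the original values that fall outside the Tukey fences
    (Q1 - k*IQR, Q3 + k*IQR), computed on the transformed scale if a
    transform is given, otherwise on the raw scale."""
    scored = [transform(v) for v in values] if transform else list(values)
    q1, _, q3 = statistics.quantiles(scored, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v, s in zip(values, scored) if s < lower or s > upper]

# Hypothetical right-skewed sample, invented for this example.
raw = [1, 2, 2, 3, 3, 4, 5, 6, 8, 12, 40, 400]

raw_flags = tukey_outliers(raw)                      # fences on the raw scale
log_flags = tukey_outliers(raw, transform=math.log)  # fences on the log scale
print(raw_flags, log_flags)
```

On this made-up sample the raw-scale rule flags both 40 and 400, while the log-scale rule flags only 400, so the two approaches can disagree about which points count as outliers.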

 

Thanks,

Srihari

Contributor
Posts: 33

Re: Outliers in skewed data

A number of plots, easily produced in SAS Enterprise Guide or Base SAS, will help you spot outliers, in particular box plots and scatter plots.

 

However, to spot outliers in SAS Enterprise Miner, use the Explore option to examine the distribution of each variable. Then, by increasing the number of histogram bins to, say, 100, you will be able to spot outliers.
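The same idea can be sketched outside Enterprise Miner. The Python below is only an illustration under stated assumptions, not how Enterprise Miner works internally: it bins the data into 100 equal-width bins and reports values that sit beyond a long run of empty bins on the right. The helper names, the `min_gap` threshold, and the sample data are all hypothetical.

```python
def bin_index(value, lo, width, bins):
    """Index of the equal-width bin that holds value (top edge goes in the last bin)."""
    return min(int((value - lo) / width), bins - 1)

def histogram(values, bins=100):
    """Return (counts, lo, width) for an equal-width histogram."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; no histogram to build")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        counts[bin_index(v, lo, width, bins)] += 1
    return counts, lo, width

def right_tail_candidates(values, bins=100, min_gap=5):
    """Values in or above the first occupied bin that follows at least
    min_gap consecutive empty bins (assumed threshold, tune by eye)."""
    counts, lo, width = histogram(values, bins)
    empty_run = 0
    for idx, count in enumerate(counts):
        if count == 0:
            empty_run += 1
        elif empty_run >= min_gap:
            return [v for v in values if bin_index(v, lo, width, bins) >= idx]
        else:
            empty_run = 0
    return []

# Hypothetical sample: fifty ordinary values plus one extreme value.
sample = list(range(1, 51)) + [500]
counts, _, _ = histogram(sample, bins=100)
candidates = right_tail_candidates(sample, bins=100)
print(candidates)
```

With more bins, the bulk of the data occupies a few adjacent bins and an extreme value ends up in a bin of its own, far from the rest, which is what makes it visible in the plot.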

 

Hope this helps,

Paul

Occasional Contributor
Posts: 15

Re: Outliers in skewed data

Hi, thanks for the reply. I don't use Enterprise Miner. A box plot works well when the data are normally distributed, but I'm not sure it is the right tool for skewed data. I'm wondering whether transforming the data might help.

 

Thanks,

Srihari

Super User
Posts: 10,516

Re: Outliers in skewed data


Srihari40 wrote:

Hi, thanks for the reply. I don't use Enterprise Miner. A box plot works well when the data are normally distributed, but I'm not sure it is the right tool for skewed data. I'm wondering whether transforming the data might help.

 

Thanks,

Srihari


I actually find box plots MORE useful for skewed data. When the mean and median markers don't align, that already shows one aspect of the skewness. The relative lengths of the whiskers out to the IQR fences also show the direction of the skew, and the points beyond them show how the extreme outliers are distributed.
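As a rough illustration of what to read off the plot, here is a minimal Python sketch (not SAS) that computes the numbers a box plot draws. The function name `boxplot_summary` and the sample are hypothetical, and the quartile definition is Python's default, which may differ slightly from the one your SAS procedure uses.

```python
import statistics

def boxplot_summary(values, k=1.5):
    """Numbers behind a box plot: mean, median, whisker lengths, outliers."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - k * iqr, q3 + k * iqr
    # Whiskers end at the most extreme observations still inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "mean": statistics.fmean(data),
        "median": median,
        "lower_whisker": q1 - inside[0],
        "upper_whisker": inside[-1] - q3,
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }

# Hypothetical right-skewed sample, invented for this example.
sample = [1, 2, 2, 3, 3, 4, 5, 6, 8, 12, 20, 400]
summary = boxplot_summary(sample)
print(summary)
```

For this sample the mean sits well above the median and the upper whisker is much longer than the lower one, both pointing to right skew, while 400 is plotted as a separate point beyond the upper fence.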

Occasional Contributor
Posts: 15

Re: Outliers in skewed data

Hi, Thanks for the reply. 

 

I've attached a screenshot showing the skewed variable and its transformed version. Which one is better for identifying the outliers?

 

Thanks,

Srihari
