Srihari40
Obsidian | Level 7

Hi All,

 

Are there ways to find outliers in skewed (left or right) data? In particular, does transforming the data (say, a log transformation that makes the data closer to normal) help in finding outliers?

 

Thanks,

Srihari

4 REPLIES
frupaul
Quartz | Level 8

A number of plots, easily produced in SAS Enterprise Guide and Base SAS, will help you spot outliers: box plots in particular, and scatter plots.

 

However, to spot outliers in Enterprise Miner, use the Explore option to look at the distribution of the variables. Then, by increasing the number of histogram bins to, say, 100, you will be able to spot outliers as isolated bars far from the bulk of the data.
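
As a rough illustration of the binning idea (a Python sketch, not Enterprise Miner output; the helper names `histogram` and `isolated_bins`, the `min_gap` threshold, and the sample values are all made up for this example): with many narrow bins, an outlier shows up as a non-empty bin separated from the main mass by a run of empty bins. The sketch only scans toward the upper tail, so it assumes right-skewed data.

```python
def histogram(values, bins=100):
    """Count values into equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; nothing to bin")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def isolated_bins(counts, min_gap=5):
    """Indices of non-empty bins that follow at least `min_gap` empty bins.

    `min_gap` is an arbitrary illustrative threshold, not a standard rule.
    """
    isolated = []
    gap = 0
    seen_nonempty = False
    for i, c in enumerate(counts):
        if c == 0:
            gap += 1
            continue
        if seen_nonempty and gap >= min_gap:
            isolated.append(i)
        seen_nonempty = True
        gap = 0
    return isolated


# Made-up sample: 50 values between 1.0 and 5.9, plus one extreme value.
values = [1.0 + 0.1 * i for i in range(50)] + [50.0]
counts = histogram(values, bins=100)
flagged = isolated_bins(counts)
print("isolated bins:", flagged)
```

With only a handful of bins the extreme value would still sit in a bin of its own, but a moderately large value could share a bin with the main mass; narrow bins make the gap visible.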

 

Hope this helps,

Paul

Srihari40
Obsidian | Level 7

Hi, thanks for the reply. I don't use Enterprise Miner. A box plot is perfect if the data are normally distributed, but I'm not sure it is the right tool for skewed data. I'm wondering whether transforming the data might help.

 

Thanks,

Srihari

ballardw
Super User

@Srihari40 wrote:

Hi, thanks for the reply. I don't use Enterprise Miner. A box plot is perfect if the data are normally distributed, but I'm not sure it is the right tool for skewed data. I'm wondering whether transforming the data might help.

 

Thanks,

Srihari


I actually find box plots MORE useful for skewed data. When the mean and median markers don't align, that already shows one aspect of the skewness. The relative lengths of the whiskers, which extend toward the IQR fences, also show the direction of the skew, and the points beyond them show how the extreme outliers are distributed.
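
To make the fence idea concrete and tie it back to the transformation question, here is a minimal Python sketch (not SAS code; the function names and the sample data are invented for illustration). It applies the usual 1.5 x IQR rule once on the raw values and once on their logs. The quartile definition used here (Python's `statistics.quantiles` with `method="inclusive"`) may differ slightly from the one your SAS procedure uses, so the fence values might not match exactly.

```python
import math
import statistics


def tukey_fences(scores, k=1.5):
    """Lower and upper fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, transform=lambda v: v, k=1.5):
    """Return the original values whose transformed score falls outside the fences."""
    scores = [transform(v) for v in values]
    lo, hi = tukey_fences(scores, k)
    return [v for v, s in zip(values, scores) if s < lo or s > hi]


# Made-up right-skewed sample; all values must be positive for the log.
data = [1, 2, 2, 3, 3, 4, 5, 6, 8, 12, 20, 400]

# Mean well above the median is one sign of right skew.
print("mean:", statistics.mean(data), "median:", statistics.median(data))

raw_flags = flag_outliers(data)                      # fences on the raw scale
log_flags = flag_outliers(data, transform=math.log)  # fences on the log scale
print("flagged on raw scale:", raw_flags)
print("flagged on log scale:", log_flags)
```

On this made-up sample the raw scale flags 20 and 400, while the log scale flags only 400: the transformation pulls in the long right tail, so moderately large values stop looking extreme. Which result is more useful depends on whether a value like 20 is genuinely unusual for your data.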

Srihari40
Obsidian | Level 7

Hi, Thanks for the reply. 

 

I've attached a screenshot showing the skewed variable and its transformed version. Which one is better for identifying the outliers?

 

Thanks,

Srihari
