03-03-2017 11:05 AM
Just getting my hands on 9.4M3, and I'm anxious to play with the broken axis feature. This is mainly trolling for advice.
Say I have a stored process that makes a box-and-whisker plot, and the plot looks fine for months. Then one day there is a group with an extreme outlier, the y axis gets scaled to fit the outlier, and all the boxes shrink until they are unreadable. So the users call me and say "the chart broke." This is a setting where I think I'd like to employ a broken axis, so that the boxes look reasonable and the extreme outlier still shows. And I want to determine the location of the break dynamically. Curious if people have approaches they like for choosing when and where to break an axis.
I'm imagining something like:
This is mostly a thought exercise at this point. Since I'm not providing sample data/code, I'm not expecting anybody to code up something for me.
I'm just looking for thoughts on how people have approached the idea of dynamically determining where to break an axis, particularly in the box plot setting.
03-03-2017 11:19 AM
Your ideas sound reasonable. I think max(median +/- k*IQR), taken across the groups, would be an effective range, and I'd try k=10 for starters. My intuition is that a graph should have 0 or 1 breaks. I wouldn't be fond of multiple breaks, although if your data can have upper AND lower outliers, 1 break in the positive direction and 1 break in the negative direction would probably be fine.
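To make the median +/- k*IQR idea concrete, here is a rough sketch of the arithmetic in Python (not SAS, just to show the logic). It assumes the rule means "compute median + k*IQR per group and take the largest one as the top of the normal range", and it places a single upper break in the empty gap between the largest in-range value and the smallest value beyond it. The function names `fence` and `suggest_upper_break` and the gap-based placement are illustrative assumptions, not something proposed in the thread.

```python
from statistics import median, quantiles

def fence(values, k=10.0):
    """Return (lower, upper) = median -/+ k*IQR for one group.

    Assumes at least two observations per group.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    m = median(values)
    iqr = q3 - q1
    return m - k * iqr, m + k * iqr

def suggest_upper_break(groups, k=10.0):
    """Return (break_start, break_end) for one upper break, or None.

    groups: dict mapping group label -> list of numeric values.
    The normal range tops out at the largest per-group upper fence
    (an assumed reading of max(median + k*IQR)).
    """
    upper = max(fence(v, k)[1] for v in groups.values())
    all_values = [x for v in groups.values() for x in v]
    beyond = [x for x in all_values if x > upper]
    if not beyond:
        return None  # nothing extreme enough: leave the axis unbroken
    # Break across the empty gap between the in-range data and the outliers.
    inside_max = max(x for x in all_values if x <= upper)
    return inside_max, min(beyond)

groups = {"A": [10, 11, 12, 13, 14], "B": [9, 10, 11, 12, 500]}
print(suggest_upper_break(groups))
```

A lower break could be found the same way using the smallest lower fence, which keeps the graph to at most one break in each direction. In practice you would probably pad the two endpoints a little so that no point sits right on the break, and handle groups whose IQR is 0 separately.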
03-03-2017 11:25 AM
Sanjay told me he attended a presentation at PharmaSUG in 2016 that dealt with finding optimum locations for axis breaks. We were able to find the paper online:
Hope this helps!