Boxplot (multiple variable) connecting stats using...

01-19-2016 01:48 AM

Following the script given in http://blogs.sas.com/content/graphicallyspeaking/2015/12/06/boxplot-with-connect-using-annotate/, I managed to create a boxplot with multiple connect lines for the Q1 and Q3 statistics running through 10 categories. However, I need to repeat this for several other response variables. Is there an easy way to create multiple boxplots with Annotate? My attempt to create three boxplots with a single PROC SGPLOT step (using the BY statement) failed: it produced only one boxplot, with 6 lines running through the different categories. Can anybody help? Thanks!

proc sgplot data=residrank sganno=sganno noautolegend;
  vbox student / category=rank_linearpred;
  by resp_name;
run;

Accepted Solutions

Solution

01-19-2016
08:11 PM

Posted in reply to Sanjay_SAS

01-19-2016 05:12 PM - edited 01-19-2016 05:12 PM

Maybe there is a simpler solution. No annotation is required. Three graphs are generated, one for each value of Origin.

proc sort data=sashelp.cars out=cars;
  by origin;
run;

proc sgplot data=cars nocycleattrs noautolegend;
  by origin;
  vbox mpg_city / category=type connect=q1;
  vbox mpg_city / category=type nofill connect=q3;
run;
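
Applied to the names from the original question, the same pattern might look like the sketch below. It assumes the installed SAS release supports CONNECT= on VBOX and that residrank has one row per observation for each resp_name. The output name residrank_sorted is made up for illustration.

```sas
/* BY-group processing needs the data sorted by the BY variable. */
/* residrank_sorted is a hypothetical name for the sorted copy.  */
proc sort data=residrank out=residrank_sorted;
  by resp_name;
run;

/* One graph per response. The first VBOX draws the boxes and the Q1  */
/* line; the second, unfilled VBOX overlays the same boxes to add Q3. */
proc sgplot data=residrank_sorted nocycleattrs noautolegend;
  by resp_name;
  vbox student / category=rank_linearpred connect=q1;
  vbox student / category=rank_linearpred nofill connect=q3;
run;
```

Because no annotation data set is involved, each BY group gets its own connect lines instead of all lines landing on a single graph.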

All Replies

Posted in reply to JulieM

01-19-2016 04:45 AM

Well, Sanjay is the best man to answer that one as he's the graph master. From my side, I would split the graph up. I would first do the box plot as you have. Then for each other line you require, do a series plot for that data, overlaying each result:

proc sgplot data=residrank sganno=sganno noautolegend;
  vbox student / category=rank_linearpred;
  series x=... y=...;
  by resp_name;
run;

The values for each series line need to be present in the same data set.

Posted in reply to JulieM

01-19-2016 04:15 PM

Can you share the data sets (main and anno) you have generated for this program? That will save us some time. Also, which release of SAS are you using?

Posted in reply to Sanjay_SAS

01-19-2016 08:14 PM

Great! Thanks for the solution!

Posted in reply to JulieM

01-20-2016 01:55 PM - edited 01-20-2016 01:55 PM

See new article in Graphically Speaking: http://blogs.sas.com/content/graphicallyspeaking/2016/01/20/easy-box-plot-with-connect/

Posted in reply to Sanjay_SAS

01-20-2016 08:48 PM

Hi Sanjay,

I found this graph on the web and this is actually what I was aiming for (smoothed Q1 and Q3 done by lowess as connect lines). In this case, I believe there's no recourse but to use annotate, is that correct? And how to do that for multiple response variables?

Would truly appreciate your help on this one.

Posted in reply to JulieM

01-20-2016 08:49 PM

By the way, the graph was done in R.

Posted in reply to JulieM

01-20-2016 09:30 PM - edited 01-20-2016 09:35 PM

Which release of SAS are you using? That can make a big difference. With 9.4M3 you can use the SPLINE statement. Otherwise you would need to compute the spline curve yourself and draw it using Annotate. It can be done, but it is more work.

Posted in reply to Sanjay_SAS

01-20-2016 09:49 PM

I'm using SAS 9.3 :-(

Posted in reply to JulieM

01-21-2016 12:06 PM - edited 01-21-2016 02:14 PM

GTL SERIESPLOT supports SMOOTHCONNECT. This is a poor cousin of SPLINE, but may work for you. I will write up a blog post on how to do this. My simple use case works without BY variable. Will need work for BY variable support. Also, SAS 9.4 output is nicer since it uses SUBPIXEL. SAS 9.3 does not use SUBPIXEL, so the curve is not as nice.

I changed the data so x axis is numeric. Connect makes more sense with numeric x axis. Note however, that the smooth connect line overshoots the value when the curvature is sharp, as seen at Week 2.

Posted in reply to JulieM

01-21-2016 05:54 AM

@JulieM Just because someone creates a graph does not mean it's a good idea. In the graph you present, it looks like the X axis is actually continuous, but was binned to create categorical values for the box plots. The "connecting curves" look like an attempt to approximate quantile regression curves. You can learn more about the plot you provided and why quantile regression is a better idea in the article "Quantile regression: Better than connecting the sample quantiles of binned data."

The QUANTREG procedure can create these kinds of plots automatically, and the curves that it creates are better estimates for the quantities that it looks like you are interested in.

Instead of telling us the graph you want to make, why not describe the data and the statistical questions you are trying to visualize? If quantile regression is the answer, there are many experts in the Statistical Procedures Support Community who can advise you with syntax. The article I mentioned previously has sample syntax.
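
As a rough illustration of the quantile regression idea, a call along the following lines could estimate smooth Q1 and Q3 curves directly from the unbinned data. This is only a sketch and is not the code from the article. The data set name resid, the predictor name linearpred, the output name quantfit and the knot choice are all assumptions made for the example; only student comes from the question.

```sas
/* Hypothetical input: RESID has one row per observation, with        */
/* LINEARPRED (continuous predictor) and STUDENT (response).          */
proc quantreg data=resid ci=none;
  /* A spline effect lets the fitted quantile curves bend.            */
  /* Five percentile-based knots is an arbitrary illustrative choice. */
  effect spl = spline(linearpred / knotmethod=percentiles(5));

  /* Fit the 25th and 75th percentiles in a single call. */
  model student = spl / quantile=0.25 0.75;

  /* Save the fitted quantiles for later plotting. */
  output out=quantfit predicted=p;
run;
```

To repeat the fit for several responses, the data could be sorted by resp_name and a BY resp_name statement added to the step.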

Posted in reply to Rick_SAS

01-22-2016 02:53 AM - edited 01-22-2016 02:55 AM

Hi @Rick_SAS,

I'm actually exploring ways to visualize the presence/absence of heteroskedasticity of residuals in a linear model. It seemed a good idea to do that using boxplots of bins of equal counts across the range of the linear predictors (with smoothed connecting curves to emphasize the general overall pattern). I'll have a look at quantile regressions.

Thanks for the tip!
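
For reference, bins of roughly equal counts like the ones described here can be produced with PROC RANK. The sketch below assumes an input data set named resid with a continuous variable linearpred; both names are hypothetical. The output names residrank and rank_linearpred match those used in the original question.

```sas
/* GROUPS=10 assigns each observation to one of 10 groups numbered    */
/* 0 to 9. Group sizes are approximately equal; ties in LINEARPRED    */
/* can make them uneven.                                              */
proc rank data=resid groups=10 out=residrank;
  var linearpred;          /* variable to bin (assumed name)          */
  ranks rank_linearpred;   /* new variable holding the bin index      */
run;
```

With several responses stacked in one data set, sorting by resp_name and adding a BY resp_name statement would bin each response separately.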