Ella
Calcite | Level 5
I need to create box plots from five-number summaries only. Is this possible? The full data set is not available.
10 REPLIES
ChrisNZ
Tourmaline | Level 20
One way would be by modifying the great macro available here:

http://www.math.yorku.ca/SCS/sssg/boxplot.html

and replacing the summarisation steps with your own data.
GraphGuy
Meteorite | Level 14
If you only have the summary statistics (rather than the raw data), you might want to use PROC BOXPLOT. I believe it is technically part of SAS/STAT rather than SAS/GRAPH, but I think most users have both products anyway.

With proc boxplot, you can use the outhistory= option to create a data set containing the summary statistics for each bar, or use the history= option to use a data set containing summary statistics (rather than using data= with the raw data).

Creating the 'history' data sets is a little tricky (how to name the variables, and such) - I'd recommend running proc boxplot, and looking at the resulting 'outhistory' data set, and then using that as the starting point for your data set.

Here's a small sample I happened to have, which demonstrates how to use proc boxplot with a history= data set (hopefully the long data lines won't wrap badly):


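/* HISTORY= data set: one row per hub. The suffixes L, 1, X, M, 3, H, N */
/* hold the minimum, Q1, mean, median, Q3, maximum, and group size.     */
data hubout;
   /* hub is read from columns 1-14, so the numbers must start in column 15 */
   input hub $ 1-14 revenueL revenue1 revenueX revenueM revenue3 revenueH revenueN;
   datalines;
Frankfurt     10717.43 45549.06 558199.30 171813.74 433385.93 3460389.08 12
London        16267.52 48802.57 677813.43 257026.85 574650.23 3743156.90 12
New York      33491.96 60955.36 1116398.59 401903.49 762946.80 7449951.09 12
San Francisco 0.00 45214.14 717684.81 215736.05 653667.32 4340557.73 12
Sydney        4353.95 32654.66 518327.92 141814.52 577209.97 3537899.04 12
Tokyo         2392.28 25358.20 398713.78 108370.41 275112.51 2984133.44 12
;
run;

/* Plot from the summary statistics; "revenue" is the common variable prefix */
proc boxplot history=hubout;
   plot revenue*hub;
run;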
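
The outhistory= route described above can be sketched as follows. This is only a minimal illustration: the data set names rawdemo and demohist and all the values are made up, and the point is just to see which variable names PROC BOXPLOT writes so that a hand-built HISTORY= data set can copy them.

```sas
/* Hypothetical raw data, used only to reveal the OUTHISTORY= layout */
data rawdemo;
   input hub $ revenue;
   datalines;
A 10
A 12
A 15
A 19
A 30
B 8
B 11
B 14
B 20
B 41
;
run;

/* Observations must be grouped by the group variable before plotting */
proc sort data=rawdemo;
   by hub;
run;

/* OUTHISTORY= writes one row per group with the summary statistics */
proc boxplot data=rawdemo;
   plot revenue*hub / outhistory=demohist;
run;

/* Inspect the variable names (revenueL, revenue1, ...) to reuse them */
proc print data=demohist;
run;
```

The printed data set can then serve as a template: keep the variable names, swap in your own summary values, and read the result back with history=.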
DougMoore
Calcite | Level 5

Apparently the following options do not work with 'History' (a summary data set):

  .Boxstyle = Schematic

  .Notches

I desperately need both options to work with my summary data.

ballardw
Super User

If your data contains values and counts of values, this is more likely to be possible. Or, if you have the mean, the limits of the IQR, and the limits for the whiskers, it may be possible with annotate data sets. What kind of summarized data do you actually have? What do the 5 values represent? And what should the final plots look like?

DougMoore
Calcite | Level 5

YEARMONTH,SPECIES_NAME,ResultsL,Results1,ResultsX,ResultsM,Results3,ResultsH,ResultsS,ResultsN

2013M03,Canine,-9.611106,99.35926,121.68813148,109.1859,120.70855,6174,74.522753418,354763

2013M04,Canine,-5.481052,99.89804,121.52345233,109.61,120.9135,6174,73.239377704,368549

2013M05,Canine,-5.886485,99.81166,120.93962437,109.36215,120.381525,1936.171,71.911495255,371648

2013M06,Canine,-6.085324,98.0872925,118.65344702,107.7889,118.730175,6860,72.178933135,351186

2013M07,Canine,-7.006166,97.089375,117.47239036,106.87495,117.777525,2690.947,70.147333935,373472

2013M08,Canine,-7.167602,96.82858,117.10706451,106.46835,117.3959,2851.924,68.368883962,364028

2013M09,Canine,-6.98099,96.44617,116.65658762,106.0149,117.0134,2744,68.086254128,344329

2013M10,Canine,-5.830464,96.92248,117.77359767,106.4043,117.4502,2293.712,70.114474393,374365

2013M11,Canine,-5.831106,98.31649,120.40123249,108.0578,119.5164,6860,73.428638245,347121

2013M12,Canine,-5.153101,100.018675,123.60781797,109.91825,121.811225,2744,76.459736732,344548

2014M01,Canine,-5.234818,101.2791,124.0452891,110.984,122.4557,1999.992,74.376290111,390969

2014M02,Canine,-1,101.7938,123.92318985,111.3511,122.5696,7546,74.844513208,385691

DougMoore
Calcite | Level 5

The Box Plot should look like the following:

See "Just Enough SAS: A Quick-Start Guide to SAS for Engineers" by Robert A. Rutledge (Copyright 2009, SAS Institute Inc.), Figure 5.2, page 117.

This is a notched box plot. The only difference is that I need the schematic box plot instead of the skeletal one shown in Figure 5.2.

DougMoore
Calcite | Level 5

I am using the following procedure:

SAS/STAT(R) 9.22 User's Guide

The only difference is that I need a schematic type, and I need it to be notched so I can perform a visual comparison of the medians.

Reeza
Super User

The notches option works fine. The boxstyle option does not, although I can't visually see a difference from the example.

So perhaps post your code that isn't working, IN A NEW THREAD. Preferably linked back to this one. 
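
For reference, a minimal sketch of a notched plot from a history= data set, using the first three rows of the summary data posted earlier in this thread. The data set name canine is made up, and the informat lengths are assumptions based on the values shown.

```sas
/* First three rows of the posted summary data, read as comma-separated values */
data canine;
   infile datalines dsd;
   input YEARMONTH :$7. SPECIES_NAME :$10. ResultsL Results1 ResultsX
         ResultsM Results3 ResultsH ResultsS ResultsN;
   datalines;
2013M03,Canine,-9.611106,99.35926,121.68813148,109.1859,120.70855,6174,74.522753418,354763
2013M04,Canine,-5.481052,99.89804,121.52345233,109.61,120.9135,6174,73.239377704,368549
2013M05,Canine,-5.886485,99.81166,120.93962437,109.36215,120.381525,1936.171,71.911495255,371648
;
run;

/* "Results" is the common prefix of the summary variables; NOTCHES adds notches */
proc boxplot history=canine;
   plot Results*YEARMONTH / notches;
run;
```

With group sizes this large the notches will be very narrow, and the extreme maximum values will stretch the vertical axis, so the boxes may look compressed.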

DougMoore
Calcite | Level 5

I have learned that, with the standard HISTORY= summary data set, one can't create a schematic-style box plot, because the schematic style needs outlier data that is not in the summary.

However, with a BOX= input data set (instead of a HISTORY= input data set) you can get both notched
and schematic plots. The BOX= data set must contain more data than the simpler
HISTORY= data set, though. It is not just a matter of reshaping from HISTORY=
to BOX=. If you only have the summary data that is found in a HISTORY= data set,
then there is nothing you can do. You must have the values of the low and high
whiskers. These are the smallest and largest values in the data set that fall
within the fences. In addition, you must have all the individual data points that
fall beyond the whiskers. These are the additional components of a schematic
box plot, and they are not typical summary statistics.

I think I am all set for now. Thank you for the quick and thorough responses.
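
For anyone who does have the raw data at least once, the BOX= route described above can be sketched roughly as follows. The data set names and values are hypothetical. outbox= and box= are documented PROC BOXPLOT options, but that the saved data set carries the whisker and outlier rows when boxstyle=schematic is requested is my understanding rather than something verified in this thread.

```sas
/* Hypothetical raw data with one obvious outlier in each group */
data rawdemo;
   input grp $ y;
   datalines;
A 10
A 11
A 12
A 13
A 14
A 40
B 20
B 21
B 22
B 23
B 24
B 2
;
run;

proc sort data=rawdemo;
   by grp;
run;

/* Save the box plot features, including outliers, to a BOX=-style data set */
proc boxplot data=rawdemo;
   plot y*grp / boxstyle=schematic outbox=demobox;
run;

/* Look at the layout to see which rows go beyond a plain summary */
proc print data=demobox;
run;

/* Redraw from the saved data set alone, notched and schematic */
proc boxplot box=demobox;
   plot y*grp / boxstyle=schematic notches;
run;
```

Printing demobox shows which rows a hand-built BOX= data set would need beyond the usual summary statistics.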

Reeza
Super User

You're not the original poster; it's better if you start a new thread.

Can you explain your issue in detail, please? For example, you have a seven-number summary, not a five-number one.
