Presently I have so many "groups" when doing Box Plots that the result is SEVEN panels of box plots.
I'd like to have ONE panel, with about 20 box plots (or "groups").
So, that would require cutting out a bunch of groups.
Is there a way to automatically do this?
What I have in mind is: in a DATA step, keep only the TOP 20 groups, using each group's Q3 (third quartile) value as the criterion for keeping or removing.
Any coding assistance greatly appreciated.
Nicholas Kormanik
Calculating Q3 in a DATA step is going to be a lot of work. Running PROC MEANS (or PROC SUMMARY) and merging the result with your existing data is probably a better bet. Why does it need to be in a DATA step?
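A rough sketch of that PROC MEANS-and-merge idea follows. The data set name Have and the variable names Group and Value are placeholders, not names from the original post, so substitute your own. This is one possible arrangement, not the only way to do it.

```sas
/* Assumed names: Have (raw data), Group (grouping variable),
   Value (analysis variable). All three are hypothetical. */

/* One row per group holding that group's third quartile. */
proc means data=Have noprint nway;
   class Group;
   var Value;
   output out=Q3ByGroup(drop=_type_ _freq_) q3=Q3;
run;

/* Put the highest Q3 values first. */
proc sort data=Q3ByGroup;
   by descending Q3;
run;

/* Keep only the identifiers of the first 20 groups. */
data TopGroups;
   set Q3ByGroup(obs=20);
   keep Group;
run;

/* Both inputs to MERGE must be sorted by the BY variable. */
proc sort data=TopGroups;
   by Group;
run;

proc sort data=Have out=HaveSorted;
   by Group;
run;

/* Keep raw observations only for groups that made the top 20. */
data Top20Raw;
   merge HaveSorted TopGroups(in=InTop);
   by Group;
   if InTop;
run;
```

Top20Raw still contains the raw observations, so it can be passed to PROC BOXPLOT through DATA= in the usual way.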
I was thinking that it could be easily done in the data step, is all.
After reading up on the topic further, it now appears that perhaps the best answer is to run PROC BOXPLOT with all groups and include the OUTBOX= or OUTHISTORY= option to create an output data set.
Then, in a subsequent run, use that new data set as the input.
Still not sure, though, of the exact coding for keeping the top 20 groups.
See:
SAS/STAT(R) 9.2 User's Guide, Second Edition
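/* Read group summary statistics (not raw data) from the Summary data set */
proc boxplot history=Summary;
   /* One box plot per process variable, grouped by Batch */
   plot (Weight Yieldstrength) * Batch;
run;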
Since this subsequent run reads a different data set (the history data set), and that data set contains a column holding Q3 for each group (the process variable name with a 3 suffix, e.g. Weight3), one would tailor the history data set before plotting: sort it on the Q3 column in descending order with PROC SORT, then read it in a DATA step with the OBS=20 data set option, so that only the top 20 'groups' are kept and then used. The result should be ONE panel of box plots of the top 20 groups.
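The steps just described might look something like the sketch below. It assumes a raw data set called Have (a made-up name) with the Weight and Batch variables from the documentation example, and it relies on the OUTHISTORY= naming convention in which Q3 is stored under the process variable name plus the suffix 3. Treat it as a starting point rather than tested production code.

```sas
/* Assumption: Have is a hypothetical raw data set containing
   Weight (analysis variable) and Batch (group variable). */

/* PROC BOXPLOT expects raw data sorted by the group variable. */
proc sort data=Have out=HaveSorted;
   by Batch;
run;

/* Pass 1: write per-group summary statistics, skip the plot. */
proc boxplot data=HaveSorted;
   plot Weight*Batch / outhistory=Summary nochart;
run;

/* Rank the groups by third quartile, largest first.
   Weight3 is the Q3 column in the OUTHISTORY= data set. */
proc sort data=Summary out=Ranked;
   by descending Weight3;
run;

/* Keep only the first 20 rows, i.e. the 20 largest Q3 values. */
data Top20;
   set Ranked(obs=20);
run;

/* Put the surviving groups back in Batch order for plotting
   (harmless if the order turns out not to matter). */
proc sort data=Top20;
   by Batch;
run;

/* Pass 2: draw box plots from the trimmed summary data. */
proc boxplot history=Top20;
   plot Weight*Batch;
run;
```

If the 20 groups still spill onto more than one panel, the NPANELPOS= option on the PLOT statement is the setting to look at.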