Data visualization with SAS programming

boxplot help with summary stats and grouping

New Contributor
Posts: 3


Hello,

I need to make a box plot showing species grouped by family. An excerpt of the data is below. I would like to add summary statistics for the entire dataset and also for each species, and I would like to limit my y-axis to 100. I've tried two versions of the code; each gives me part of what I am looking for, but neither does the whole job.

Any suggestions would be greatly appreciated!

Thanks

 

 

 

ods listing close;
ods html style=minimal gpath="I:\xxx\PRG" ;

ods graphics on /
imagename="Boxplot"
imagefmt=png
border=off;

 

*This version was created with the ODS Graphics Designer and uses a legend and color to group species by family, which I like less than the second version. The y-axis limit of 100 works, but there are no summary statistics in this version;

 

proc template;
define statgraph sgdesign;
dynamic _VERTICAL _SPECIES_NAME _FAMILY;
begingraph / designwidth=800 designheight=689;
   layout lattice / rowdatarange=data columndatarange=data rowgutter=10 columngutter=10;
      layout overlay / xaxisopts=( display=(TICKS TICKVALUES LINE ) discreteopts=( tickvaluefitpolicy=splitrotate)) yaxisopts=( label=('Inches') linearopts=( viewmin=0.0 viewmax=100.0)) ;
         boxplot x=_SPECIES_NAME y=_VERTICAL / group=_FAMILY name='box' groupdisplay=Overlay grouporder=ascending;
         discretelegend 'box' / opaque=false border=true halign=center valign=top displayclipped=true down=1 order=columnmajor location=inside ;
      endlayout;
   endlayout;
endgraph;
end;
run;

proc sgrender data=WORK.DI template=sgdesign ;
dynamic _VERTICAL="VERTICAL" _SPECIES_NAME="'SPECIES_NAME'n" _FAMILY="FAMILY";
run;

 

 

*This version groups species by family, but the family labels appear at the top of the output instead of near the species names they group. Statistics are generated, but the by-species statistics are at the bottom and I would prefer them at the top under the overall statistics. The axis is also not limited to 100;

 

PROC boxplot data=di;
label vertical="Inches ";
plot vertical*species_name (family) /  haxis=axis1; *nohlabel;
inset min mean max stddev / header='Overall Statistics' position=tm;
   insetgroup min max mean/header='Statistics by Species' cframe=black cheader=black position=top;
run;

 

 

Species_name            vertical   family
calico                  10         aster
flat-topped             12         aster
rough-stemmed           17         goldenrod
NY                      22         aster
NE                      9          aster
grass-leaved            11         goldenrod
jack-in-th-pulpit       4          orchid
lady's tresses          6          orchid
purple fringed          8          orchid
yellow lady's slipper   9          orchid
black-eyed susan        11         daisy
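
To try either program on this excerpt, the rows can be read into WORK.DI with a DATA step along these lines. This is only a sketch: the pipe delimiter and the variable lengths are choices made here for illustration, not properties of the original data set.

```sas
/* Sketch: load the posted excerpt into WORK.DI.                      */
/* The '|' delimiter and the $25/$10 lengths are assumptions; the     */
/* delimiter lets names with blanks and apostrophes be read with      */
/* plain list input.                                                  */
data di;
   infile datalines dlm='|' truncover;
   length species_name $25 family $10;
   input species_name $ vertical family $;
   datalines;
calico|10|aster
flat-topped|12|aster
rough-stemmed|17|goldenrod
NY|22|aster
NE|9|aster
grass-leaved|11|goldenrod
jack-in-th-pulpit|4|orchid
lady's tresses|6|orchid
purple fringed|8|orchid
yellow lady's slipper|9|orchid
black-eyed susan|11|daisy
;
run;
```

With one row per species, as in this excerpt, each box collapses to a single point, so the full data set is needed to see real boxes.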

Super User
Posts: 11,101

Re: boxplot help with summary stats and grouping

Did you consider Proc Boxplot with Inset and Insetgroup options?

 

The VIEWMAX=100 setting in LINEAROPTS (inside YAXISOPTS) is what restricts the axis to 100.
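
A minimal sketch of the PROC BOXPLOT route, assuming the DI data set and variable names from the question. The sorted copy `di_sorted` is a hypothetical name. VAXIS= and BLOCKPOS= are PLOT statement options as I recall them from the PROC BOXPLOT documentation. I believe the VAXIS= value list has to span the plotted data, so it sets the tick range but would not clip values above 100. BLOCKPOS=3 is intended to move the family legend to the bottom of the plot, near the species names. Treat both as things to verify against your SAS release.

```sas
/* Sketch only: assumes WORK.DI has species_name, vertical, and family. */
/* di_sorted is a hypothetical name for the sorted copy.                */
proc sort data=di out=di_sorted;
   by family species_name;   /* keeps each family's species together */
run;

proc boxplot data=di_sorted;
   label vertical="Inches";
   plot vertical*species_name (family) /
        vaxis=0 to 100 by 10   /* tick values from 0 to 100 */
        blockpos=3;            /* family legend at the bottom (assumed value) */
   inset min mean max stddev / header='Overall Statistics' position=tm;
   insetgroup min max mean / header='Statistics by Species' position=top;
run;
```

Sorting by family first changes the order of the boxes so that species from the same family sit next to each other, which is what lets the block legend span them.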

New Contributor
Posts: 3

Re: boxplot help with summary stats and grouping

Dan,

Thank you for your response. I ran your code but did not see any difference in the results. My log has the following message:

'NOTE: Grouped box plot does not support the DISPLAYSTATS= option. The statistics will not be drawn.'

Thanks again for taking a look!

SAS Super FREQ
Posts: 925

Re: boxplot help with summary stats and grouping


I took your code and made a few changes:

1. I removed the LAYOUT LATTICE, as you do not need it for your case.

2. I moved the sizing outside of the template.

3. I changed GROUPDISPLAY to be CLUSTER instead of OVERLAY. This will give you adjacent group boxes.

4. I added DISPLAYSTATS=STANDARD to the BOXPLOT statement. This will display your stats below the boxes. You can limit the stats displayed by using a parenthesized list of stats (e.g., DISPLAYSTATS=(Mean Median ...)).

 

Give this a try and see if it gives you what you want.

 

Thanks!
Dan

 

proc template;
define statgraph sgdesign;
dynamic _VERTICAL _SPECIES_NAME _FAMILY;
begingraph;
    layout overlay / xaxisopts=( display=(TICKS TICKVALUES LINE ) discreteopts=( tickvaluefitpolicy=splitrotate)) yaxisopts=( label=('Inches') linearopts=( viewmin=0.0 viewmax=100.0)) ;
       boxplot x=_SPECIES_NAME y=_VERTICAL / group=_FAMILY name='box' groupdisplay=cluster grouporder=ascending displaystats=standard;
       discretelegend 'box' / opaque=false border=true halign=center valign=top displayclipped=true down=1 order=columnmajor location=inside;
    endlayout;
endgraph;
end;
run;

ods graphics / width=800px;

proc sgrender data=WORK.DI template=sgdesign ;
dynamic _VERTICAL="VERTICAL" _SPECIES_NAME="'SPECIES_NAME'n" _FAMILY="FAMILY";
run;
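
If your release reports that DISPLAYSTATS= is not drawn for grouped box plots (the log NOTE quoted in this thread), one fallback is to produce the numbers separately rather than on the graph. The sketch below assumes the same WORK.DI data set. The choice of statistics and MAXDEC=1 are illustrative only.

```sas
/* Sketch: overall statistics for VERTICAL (no CLASS statement). */
proc means data=WORK.DI n min mean max stddev maxdec=1;
   var vertical;
run;

/* Sketch: the same statistics for each species within family. */
proc means data=WORK.DI n min mean max stddev maxdec=1;
   class family species_name;
   var vertical;
run;
```

This prints tables next to the graph in the ODS output instead of placing the statistics inside the plot, so it is a workaround, not a replacement for DISPLAYSTATS=.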

 

New Contributor
Posts: 3

Re: boxplot help with summary stats and grouping
