I am using the HISTOGRAM statement in SGPLOT to look at the distribution (SCALE=COUNT) of fish length (TL = total length in mm), grouped by year of release (YEAR). There appears to be a bug when using the GROUP= option, because the total counts for some bins are way off. For example, compare the counts for bin 450. Without grouping, the count is 97. When grouped by year, the bar appears to reach just over 30, and the data labels for the groups in that bin sum to 83. So it looks like the bars are not stacking properly by year within bins, and not all groups are represented in each bin.

Can anyone confirm this is a bug or tell me what I've overlooked? I am running SAS 9.4 (TS1M2) under Windows 7. The code is below and the CSV data file is attached.

data WORK.RECP15 ;
infile 'th_r_2015.csv' delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
informat TAG $10. ;
informat TAG134 $10. ;
informat Date mmddyy10. ;
informat HIST $10. ;
informat STATUS $7. ;
informat LOCATION $50. ;
informat COLLECTOR $2. ;
informat SEX $1. ;
informat TL best32. ;
informat WT best32. ;
informat REARING $50. ;
informat HEALTH $1. ;
informat COMMENTS $80. ;
informat RearC $8.;
informat RelC $8.;
format TAG $10. ;
format TAG134 $10. ;
format Date mmddyy10. ;
format HIST $10. ;
format STATUS $7. ;
format LOCATION $50. ;
format COLLECTOR $2. ;
format SEX $1. ;
format TL best12. ;
format WT best12. ;
format REARING $50. ;
format HEALTH $1. ;
format COMMENTS $80. ;
format RearC $8.;
format RelC $8.;
input
TAG $
TAG134 $
Date
HIST $
STATUS $
LOCATION $
COLLECTOR $
SEX $
TL
WT
REARING $
HEALTH $
COMMENTS $
RearC $
RelC $
;
year=year(date);
yrsout=2015-year;
qtr=qtr(date);
month=month(date);
run;
*No groups, y-axis max is at ~100, count is almost 100 for bin of TL=450;
proc sgplot data=recp15;
histogram tl / scale=count showbins datalabel=count;
label tl="Release TL(MM)";
run;
*Group by year, count for TL=450 is < 100, y-axis not scaled correctly for total of stacked bar counts;
*Not all years show for TL=450;
proc sgplot data=recp15;
histogram tl / scale=count group=year binstart=270 binwidth=30 datalabel=count showbins;
keylegend / location=outside position=bottom sortorder=ascending title="Release Year" ;
xaxis values=(270 to 570 by 30) ;
label tl="Release TL(MM)";
run;
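To check the bin totals independently of SGPLOT, something like the following could be run. It is only a rough sketch: the data set name bincheck and the variable bin_mid are made up for illustration, and it assumes that BINSTART=270 is the midpoint of the first bin, so the bins are [255,285), [285,315), and so on. Which endpoint is inclusive is also an assumption. Note that only the grouped plot sets BINSTART= and BINWIDTH=, so the two plots may not be using the same bins.

```sas
/* Cross-check sketch: count observations per bin and year outside SGPLOT. */
/* Assumption: BINSTART=270 is the midpoint of the first bin, width 30,    */
/* so the 450 bin covers 435 <= TL < 465.                                  */
data bincheck;                       /* hypothetical data set name */
  set recp15;
  where not missing(tl);             /* skip fish with no length recorded */
  bin_mid = 270 + 30*floor((tl - 255)/30);
run;

/* Rows are bin midpoints, columns are release years; the row totals       */
/* give the count per bin summed over all years.                           */
proc freq data=bincheck;
  tables bin_mid*year / norow nocol nopercent missing;
run;
```

The row total for bin_mid=450 can then be compared with both the 97 from the ungrouped plot and the 83 from the grouped data labels.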