03-15-2013 01:01 PM
I am trying to create a distribution histogram that has bins that start on the grid lines (as opposed to the default where the grid lines are the midpoint of each bin). Is there a way to do this with the options? Also, is there a way to have the histogram display the count for each individual bin? This is the code I have written so far:
proc sgplot data = dataset noautolegend;
histogram variable /
binstart = 60 binwidth = 5 scale = count;
refline 140 /
axis = x lineattrs = (color = red thickness = 2);
xaxis label = 'abc' grid values = (60 to 220 by 5);
run;
The final product I am hoping for is a histogram that has an x axis that starts at 60 and has grid lines at every 5 units until it gets to 220. I want the bins to start and end on the grid lines and also show somewhere on the graph the number or count that is represented in each bin.
Any help or information you can provide would be much appreciated!
03-15-2013 01:40 PM
You can use the BINSTART= and BINWIDTH= options to control the bin anchor and width. Be sure to use SCALE=COUNT. Use XAXIS VALUES=(60 to 220 by 5) to get the tick marks to agree.
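Here is a minimal sketch of that setup, using the placeholder names from the question (dataset, variable). One assumption is worth checking: as far as I know, PROC SGPLOT treats BINSTART= as the midpoint of the first bin, so a first bin that spans 60 to 65 would need BINSTART=62.5 rather than 60. If the bars in your output are offset from the grid lines by half a bin, adjust that value.

```sas
/* Sketch only: "dataset" and "variable" are placeholders from the question */
proc sgplot data=dataset noautolegend;
   /* Assumes BINSTART= is the midpoint of the first bin,
      so 62.5 with BINWIDTH=5 gives a first bin over 60 to 65 */
   histogram variable / binstart=62.5 binwidth=5 scale=count;
   /* Vertical reference line at x = 140, as in the question */
   refline 140 / axis=x lineattrs=(color=red thickness=2);
   /* Ticks and grid lines every 5 units, so they fall on the bin edges */
   xaxis label='abc' grid values=(60 to 220 by 5);
run;
```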
Unfortunately, labeling the bar heights is not a built-in option for the HISTOGRAM statement. I know of three options that you can choose from:
1) Use YAXIS GRID to add horizontal grid lines to the graph. That will make it easy to read the heights of the bars. This is what I would do, since grid lines are less distracting than bar labels.
2) Give up on PROC SGPLOT and use the Graph Template Language (GTL) instead. This enables you to overlay the data labels for the bars on top of the histogram.
3) Create the histogram by using PROC UNIVARIATE instead. See the example in "Construct normal data from summary statistics" on The DO Loop blog. For your example, the UNIVARIATE code (on my data) would be
proc univariate data=WalkTimes;
histogram t / endpoints=(12 to 24) vscale=count barlabel=count;
run;
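Adapted to the axis range in the question, the PROC UNIVARIATE approach might look like the sketch below. The data set and variable names are the placeholders from the question, and HREF=140 is my assumption for reproducing the vertical reference line. ENDPOINTS=, VSCALE=, and BARLABEL= are the same options used in the WalkTimes example.

```sas
/* Sketch only: "dataset" and "variable" are placeholders from the question */
proc univariate data=dataset;
   histogram variable / endpoints=(60 to 220 by 5) /* bin edges at 60, 65, ..., 220 */
                        vscale=count               /* vertical axis shows counts */
                        barlabel=count             /* print the count above each bar */
                        href=140;                  /* assumed: vertical reference line at 140 */
run;
```

Because ENDPOINTS= specifies the bin edges directly, there is no midpoint offset to work out with this approach.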