how to set bin boundaries and display count in sgplot histogram

Hi,

I am trying to create a distribution histogram that has bins that start on the grid lines (as opposed to the default where the grid lines are the midpoint of each bin). Is there a way to do this with the options?  Also, is there a way to have the histogram display the count for each individual bin? This is the code I have written so far:

proc sgplot data = dataset noautolegend;
     histogram variable /
            binstart = 60 binwidth = 5 scale = count;
     refline 140 /
            axis = x lineattrs = (color = red thickness = 2);
     xaxis label = 'abc' grid values = (60 to 220 by 5);
run;

The final product I am hoping for is a histogram that has an x axis that starts at 60 and has grid lines at every 5 units until it gets to 220. I want the bins to start and end on the grid lines and also show somewhere on the graph the number or count that is represented in each bin.

Any help or information you can provide would be much appreciated!


Thanks!

Re: how to set bin boundaries and display count in sgplot histogram

You can use the BINSTART= and BINWIDTH= options to control the bin anchor and width. Be sure to use SCALE=COUNT.   Use XAXIS VALUES=(60 to 220 by 5) to get the tick marks to agree.
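One detail worth spelling out: in PROC SGPLOT, BINSTART= gives the midpoint of the first bin rather than its left edge. A minimal sketch of the settings above, assuming the placeholder names dataset and variable from the question and assuming all values fall between 60 and 220, might look like this:

```sas
/* dataset and variable are the placeholder names from the question */
proc sgplot data=dataset noautolegend;
   /* BINSTART= is the midpoint of the first bin, so 62.5 with
      BINWIDTH=5 makes the first bin span 60 to 65 */
   histogram variable / binstart=62.5 binwidth=5 scale=count;
   /* vertical reference line at x = 140 */
   refline 140 / axis=x lineattrs=(color=red thickness=2);
   /* tick marks and grid lines every 5 units, matching the bin edges */
   xaxis label='abc' grid values=(60 to 220 by 5);
   /* horizontal grid lines make the bar heights easier to read */
   yaxis grid;
run;
```

If the data extend outside 60 to 220, the VALUES= list would need to be widened to match.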

Unfortunately, labeling the bar heights is not a built-in option for the HISTOGRAM statement.  I know of three options that you can choose from:

1) Use YAXIS GRID to add horizontal grid lines to the graph. That will make it easy to read the heights of the bars. This is what I would do, since grid lines are less distracting than bar labels.

2) Give up on PROC SGPLOT and use the Graph Template Language (GTL) instead. This enables you to overlay the data labels for the bars on top of the histogram.

3) Create the histogram by using PROC UNIVARIATE instead. See the example in "Construct normal data from summary statistics" on The DO Loop blog. For your example, the UNIVARIATE code (on my data) would be:

proc univariate data=WalkTimes;
   freq Freq;
   var t;
   histogram t / endpoints=(12 to 24) vscale=count barlabel=count;
run;
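
Adapted to the question, the same approach might look like the sketch below. It assumes the placeholder names dataset and variable from the original post, one observation per row (so no FREQ statement), and data that fall within 60 to 220.

```sas
/* dataset and variable are the placeholder names from the question */
proc univariate data=dataset;
   var variable;
   /* ENDPOINTS= places the bin edges at 60, 65, ..., 220;
      VSCALE=COUNT puts counts on the vertical axis;
      BARLABEL=COUNT prints each bin's count above its bar */
   histogram variable / endpoints=(60 to 220 by 5)
                        vscale=count
                        barlabel=count;
run;
```

PROC UNIVARIATE also prints its usual summary tables alongside the histogram.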
