Programming the statistical procedures from SAS

Histograms - I can't get all the options I'd like

New Contributor
Posts: 4

I am trying to generate histograms with three properties, and I can't figure out how to do it.  It seems that PROC UNIVARIATE, PROC SGPLOT, and PROC GCHART will each give me two of the three properties, but none of them will give me all three.

I'd like to generate a histogram such that: 1) I can control the number of bins, 2) I can control which labels show up on the x-axis, 3) I can get a vertical reference line.

For example, I have data that ranges from 70 to 140 inclusive and I'd like:

1) Bins by twos (70, 72, 74, ..., 138, 140)

2) Labels by 10s (70, 80, ..., 130, 140) where it's important that 70 and 140 are displayed

3) A vertical reference line at 100.

SAS seems to insist on one of two behaviors: either it lets me set the number of bins but then chooses the numbers displayed on the x-axis itself, or it lets me choose the labels but then either uses those for the bins or determines its own number of bins.

Suggestions?  Thank you in advance.



All Replies
Super User
Posts: 18,586

Re: Histograms - I can't get all the options I'd like

SGPLOT should allow you to do this, though you may want to pre-calculate your data.

Can you show the SGPLOT code that doesn't work?

New Contributor
Posts: 4

Re: Histograms - I can't get all the options I'd like

Thank you for your response.  For example, this code:

proc sgplot data=Timing;
    histogram Number_of_Items / scale=count;
    xaxis values=(70 72 74 76 78 80 82 84 86 88 90 92 94 96 98
                  100 102 104 106 108 110 112 114 116 118
                  120 122 124 126 128 130 132 134 136 138
                  140);
    yaxis label='Frequency' min=0 max=1350;
    xaxis label='Number of Items' min=70 max=140;
run;

It will give me the bins I specified, but the labels on the x-axis are 80, 100, 120 and 140.  This seems like a minor issue, and it is, but I need to exactly reproduce an existing report and so need the x-axis scale to read 70, 80, ..., 140.

It hadn't occurred to me to pre-calculate data.  If I make my own frequency counts, will SAS cooperate better graphing those?  Is there a term I can google or a link to code?

Thank you again.

Super User
Posts: 18,586

Re: Histograms - I can't get all the options I'd like

I think the problem is simpler: you have two xaxis statements, and the last one overrides the first. Combining them into a single statement should produce the labels that you'd like.

Look at the refline statement to add a reference line.

proc sgplot data=Timing;
    histogram Number_of_Items / scale=count;
    yaxis label='Frequency' min=0 max=1350;
    /* A single xaxis statement carries the label, range, and tick values */
    xaxis label='Number of Items' min=70 max=140
          values=(70 72 74 76 78 80 82 84 86 88 90 92 94 96 98
                  100 102 104 106 108 110 112 114 116 118
                  120 122 124 126 128 130 132 134 136 138
                  140);
run;

New Contributor
Posts: 4

Re: Histograms - I can't get all the options I'd like

That's a good theory, and probably part of the problem, but when I run your code the values shown on the x-axis change from 80, 100, 120, 140 in my old code to the values 70 through 138 inclusive, by 4s (70, 74, 78, ..., 134, 138).  So it still refuses to let me say that I want the x-axis to have tick marks at 70, 80, ..., 130, 140.

It's not clear to me if there's an easy way to do this or if I have to resort to templates, which I'm just learning, or if there is some other method.  It just seems like there would be independent and easy commands to specify one set of numbers for the bins and a different set for the values displayed on the x-axis.

Solution
‎09-22-2014 11:38 AM
New Contributor
Posts: 4

Re: Histograms - I can't get all the options I'd like

Okay, BINSTART and VALUES are the separate options I was looking for.  If one of them is missing, SAS either uses the one that is present for both roles or makes its own guess for the missing one.  In this code:

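proc sgplot data=Timing;
    /* Bins: first bin centered at 70, each bin 2 units wide */
    histogram Number_of_Items / binstart=70 binwidth=2 scale=count;
    yaxis label='Frequency' min=0 max=1350;
    /* Tick labels: every 10 units, independent of the bins */
    xaxis label='Number of Items' min=70 max=140
          values=(70 80 90 100 110 120 130 140);
    /* Vertical reference line at x = 100 */
    refline 100 / axis=x;
run;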

BINSTART and BINWIDTH control the bins SAS uses in its calculations and therefore the number of bars that appear in the histogram: BINSTART gives the midpoint of the first bin and BINWIDTH gives the width of every bin.  VALUES controls which tick labels are printed on the x-axis.  In my other code I thought VALUES controlled the number of bins, and with the long list of numbers I had there, there wasn't room for the 140 to print.  With this shorter list of values, 70 and 140 both print.  BINSTART and VALUES work independently of each other.  Finally, REFLINE gives me my reference line.  Fonts and similar details can then be adjusted with ODS styles or templates (GOPTIONS applies to SAS/GRAPH procedures such as GCHART, not to SGPLOT).
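
For anyone who wants to try this without my data, here is a minimal self-contained sketch.  The data set name Timing_demo, the seed, and the simulated distribution are all made up for illustration and are not my real Timing data; only the three options discussed above (BINSTART/BINWIDTH, VALUES, REFLINE) are the point.  The VALUES list is written in the shorter "start TO end BY increment" form, which should give the same ticks as listing them one by one.

```sas
/* Hypothetical stand-in for the Timing data: 5000 integers clamped to 70-140 */
data Timing_demo;
    call streaminit(2014);               /* arbitrary seed, for repeatability */
    do i = 1 to 5000;
        Number_of_Items = min(140, max(70, round(rand('normal', 105, 12))));
        output;
    end;
    drop i;
run;

proc sgplot data=Timing_demo;
    /* Bins centered at 70, 72, ..., 140 */
    histogram Number_of_Items / binstart=70 binwidth=2 scale=count;
    /* Tick labels at 70, 80, ..., 140, chosen independently of the bins */
    xaxis label='Number of Items' values=(70 to 140 by 10);
    yaxis label='Frequency';
    /* Vertical reference line at x = 100 */
    refline 100 / axis=x;
run;
```

Changing BINWIDTH alone should change the number of bars while leaving the tick labels where they are, which is a quick way to confirm that the two settings are independent.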

Thank you, Reeza, for your help.  You narrowed things down so that my Google searches finally found the correct terms, which then led me to the correct options.

SAS Super FREQ
Posts: 3,548

Re: Histograms - I can't get all the options I'd like

You might find this overview helpful: Choosing bins for histograms in SAS - The DO Loop
