Histograms - I can't get all the options I'd like

09-22-2014 08:44 AM

I am trying to generate histograms with three properties and I can't figure out how to do this. It seems like each of proc univariate, proc sgplot, and proc gchart will give me two of the three properties, but none of them will give me all three.

I'd like to generate a histogram such that: 1) I can control the number of bins, 2) I can control which labels show up on the x-axis, 3) I can get a vertical reference line.

For example, I have data that ranges from 70 to 140 inclusive and I'd like:

1) Bins by twos (70, 72, 74, ..., 138, 140)

2) Labels by 10s (70, 80, ..., 130, 140) where it's important that 70 and 140 are displayed

3) A vertical reference line at 100.

SAS seems to either let me set the number of bins and then choose the x-axis labels itself, or let me choose the labels and then either use those as the bins or pick its own number of bins.

Suggestions? Thank you in advance.

Accepted Solutions

Solution

Posted in reply to DavidDew

09-22-2014 11:38 AM

Okay, BINSTART and VALUES are the separate options I was looking for. If one of them is missing, SAS either uses the one that is present for both roles or makes its own guess for the missing one. In this code:

proc sgplot data=Timing;
   histogram Number_of_Items / binstart=70 binwidth=2 scale=count;
   yaxis label='Frequency' min=0 max=1350;
   xaxis label='Number of Items' min=70 max=140
         values=(70 80 90 100 110 120 130 140);
   refline 100 / axis=x;
run;

BINSTART and BINWIDTH control the bins SAS uses in its calculations and therefore the number of bars that appear in the histogram. VALUES controls which tick labels are printed on the x-axis. In my earlier code I thought VALUES controlled the bins, and with the long list of numbers I had there, there wasn't room for the 140 to print. With this shorter list of values, 70 and 140 both print. BINSTART and VALUES work independently of each other. Finally, refline gives me my reference line. Fonts and similar details can then be adjusted with ODS styles or templates (goptions affects SAS/GRAPH procedures such as gchart, not sgplot).

Thank you Reeza for your help. You narrowed things down so that my google searches finally found the correct terms that then gave me the correct commands.
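For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same idea. The Timing data set is not shown in this thread, so the data step below simulates a stand-in for it; the seed, sample size, and normal parameters are assumptions chosen only so the values fall between 70 and 140. The tick list uses the `70 to 140 by 10` shorthand in place of typing out each value.

```sas
/* Hypothetical stand-in for the Timing data set (not the poster's data):
   simulated integer counts clipped to the range 70-140. */
data Timing;
   call streaminit(2014);
   do i = 1 to 5000;
      Number_of_Items = min(140, max(70, round(rand('normal', 105, 12))));
      output;
   end;
   drop i;
run;

proc sgplot data=Timing;
   /* Bins: width 2, starting at 70 */
   histogram Number_of_Items / binstart=70 binwidth=2 scale=count;
   /* Tick labels: every 10 units, set independently of the bins */
   xaxis label='Number of Items' min=70 max=140 values=(70 to 140 by 10);
   yaxis label='Frequency' min=0;
   /* Vertical reference line at 100 */
   refline 100 / axis=x;
run;
```

Changing BINWIDTH here should change only the bars, and changing the VALUES list should change only the tick labels, which is the independence described above.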

All Replies

Posted in reply to DavidDew

09-22-2014 10:36 AM

SGPLOT should allow you to do this, though you may want to pre-calculate your data.

Can you show your SGPLOT code that doesn't work?

Posted in reply to Reeza

09-22-2014 11:01 AM

Thank you for your response. For example, this code:

proc sgplot data=Timing;
   histogram Number_of_Items / scale=count;
   xaxis values=(70 72 74 76 78 80 82 84 86 88 90 92 94 96 98
                 100 102 104 106 108 110 112 114 116 118
                 120 122 124 126 128 130 132 134 136 138
                 140);
   yaxis label='Frequency' min=0 max=1350;
   xaxis label='Number of Items' min=70 max=140;
run;

It will give me the bins I specified, but the labels on the x-axis are 80, 100, 120 and 140. This seems like a minor issue, and it is, but I need to exactly reproduce an existing report and so need the x-axis scale to read 70, 80, ..., 140.

It hadn't occurred to me to pre-calculate data. If I make my own frequency counts, will SAS cooperate better graphing those? Is there a term I can google or a link to code?

Thank you again.
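On the pre-calculation idea, one possible approach is sketched below; it is not the method the thread ends up using. It assumes Timing has a numeric variable Number_of_Items between 70 and 140. The names Binned, Bin, and BinCounts are made up for illustration, and a needle plot stands in for histogram bars because it draws on an ordinary numeric axis.

```sas
/* Assign each observation to the left edge of its width-2 bin: 70, 72, ..., 140 */
data Binned;
   set Timing;
   Bin = 70 + 2*floor((Number_of_Items - 70)/2);
run;

/* Count the observations per bin; the output data set has Bin, COUNT, and PERCENT */
proc freq data=Binned noprint;
   tables Bin / out=BinCounts;
run;

/* Plot the pre-computed counts; the tick labels are set separately from the bins */
proc sgplot data=BinCounts;
   needle x=Bin y=Count / lineattrs=(thickness=4);
   xaxis label='Number of Items' values=(70 to 140 by 10);
   yaxis label='Frequency' min=0;
   refline 100 / axis=x;
run;
```

Because the counts are computed before plotting, the bin definition lives entirely in the data step and the axis statement only has to describe the tick labels.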

Posted in reply to DavidDew

09-22-2014 11:10 AM

I think the problem is simpler: you have two xaxis statements, and the last one overwrites the first. Combining them into one should produce the labels you'd like.

Look at the refline statement to add a reference line.

proc sgplot data=Timing;
   histogram Number_of_Items / scale=count;
   yaxis label='Frequency' min=0 max=1350;
   xaxis label='Number of Items' min=70 max=140
         values=(70 72 74 76 78 80 82 84 86 88 90 92 94 96 98
                 100 102 104 106 108 110 112 114 116 118
                 120 122 124 126 128 130 132 134 136 138
                 140);
run;

Posted in reply to Reeza

09-22-2014 11:21 AM

That's a good theory, and probably part of the problem, but when I run your code the values shown on the x-axis change from 80, 100, 120, 140 in my old code to the values 70 through 138 inclusive, by 4s (70, 74, 78, ..., 134, 138). So it still refuses to let me say that I want the x-axis to have tick marks at 70, 80, ..., 130, 140.

It's not clear to me if there's an easy way to do this or if I have to resort to templates, which I'm just learning, or if there is some other method. It just seems like there would be independent and easy commands to specify one set of numbers for the bins and a different set for the values displayed on the x-axis.

Posted in reply to DavidDew

09-22-2014 01:46 PM

You might find this overview helpful: Choosing bins for histograms in SAS - The DO Loop