A simple histogram - how hard can it be?

02-22-2017 07:21 AM

I have been tearing my hair out with frustration for the last hour. I just want a histogram of my age distribution, with one bar per age (age in whole years, no decimals in the data), showing the values 0, 5, 10, etc. on the x-axis.

I am working from Enterprise Guide.

I cannot believe how many different procs there are to make histograms, each with different syntax. There is the

proc univariate data=mydata; histogram age; run;

or

proc gchart data=mydata; vbar age /discrete; run;

or

PROC SGPLOT DATA = mydata; HISTOGRAM age/binwidth=1 binstart=0; TITLE "Age"; xaxis values=(0 to 100 by 1); RUN;

etc.

The closest I have come is with SGPLOT, but the first bar is halfway hidden behind the Y axis. Does anyone know how to make the first bar fully visible?

Also, which proc would you use to generate a simple histogram like this? Is there something like a best practice? Are some of these older procs being phased out?

Accepted Solutions

Solution

02-22-2017
08:07 AM

02-22-2017 07:52 AM - edited 02-22-2017 07:53 AM

You're correct that there are many ways to create a histogram -- you touched only on a few of them!

I think the simplest method is the one you tried last: PROC SGPLOT. Since you have 100 bins and the graph is only so wide, the algorithm to make everything fit might make your left-most extreme value seem very tight against the axis. You can use the OFFSETMIN= option to give a little more space. Try this:

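```
/* Simulated data: 100,000 integer ages, skewed toward the low end */
data sample (keep=age);
  do i = 1 to 100000;
    age = abs ( floor ( rand('triangle',0.1) * 100 ) );
    output;
  end;
run;

/* A wide canvas gives each of the ~100 bars room to render */
ods graphics / width=1000px height=400px;

proc sgplot data=sample;
  histogram age / binwidth=1 binstart=0 ;
  title "Age";
  /* OFFSETMIN=/OFFSETMAX= reserve space at each end of the axis */
  xaxis values=(0 to 100 by 1) offsetmin=.01 offsetmax=.01 ;
run;
```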
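The original question also asked for axis labels at 0, 5, 10, and so on. A minimal sketch of that variation, assuming the same simulated data as above (the data set name sample and the seed are illustrative, not the poster's real data): the bins stay one year wide, and only the tick list on the XAXIS statement changes.

```sas
/* Hypothetical sample data: integer ages from 0 up to about 100 */
data sample (keep=age);
  call streaminit(2017);   /* fixed seed so the picture is reproducible */
  do i = 1 to 100000;
    age = floor(rand('triangle', 0.1) * 100);
    output;
  end;
run;

ods graphics / width=1000px height=400px;

title "Age";
proc sgplot data=sample;
  /* one bar per year of age; BINSTART= is the midpoint of the first bin */
  histogram age / binwidth=1 binstart=0;
  /* tick marks every 5 years; the offsets leave room for the outer bars */
  xaxis values=(0 to 100 by 5) offsetmin=.01 offsetmax=.01;
run;
title;
```

Because BINSTART=0 centers the first bin on 0, that bar spans -0.5 to 0.5, which is why it can look cut off when the axis starts exactly at 0. If it still looks clipped, a slightly larger OFFSETMIN= value should help.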

All Replies

02-22-2017 08:14 AM

Thank you! Don't think I would have found that solution on my own.

Is there a simple way to split this chart by say gender? So that I get a stacked histogram?

02-22-2017 09:18 AM

Of course! Again, there are multiple methods, but now it sounds like you're more interested in a VBAR with a grouping variable than a classic histogram of a statistical distribution.

You could try PROC SGPANEL with a HISTOGRAM statement and PANELBY for gender (that would yield two histograms, one per gender). Or you could use PROC FREQ to calculate the percentages into a data set, then use a step like:

```
proc sgplot data=freq_output;
vbar age / response=percent group=gender grouporder=data;
run;
```
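For completeness, here is a rough sketch of how the freq_output data set could be built with PROC FREQ. The sample data, the numeric gender coding (1/2), and the format are assumptions for illustration. The PERCENT variable written by OUT= is the percent of all observations, so the stacked bars add up to 100 across the whole chart rather than within each gender.

```sas
/* Hypothetical labels for an assumed 1/2 gender coding */
proc format;
  value gender 1="Male" 2="Female";
run;

/* Hypothetical sample data: 50,000 integer ages per gender */
data sample (keep=age gender);
  call streaminit(2017);   /* fixed seed for reproducibility */
  do gender = 1 to 2;
    do i = 1 to 50000;
      age = floor(rand('triangle', 0.1) * 100);
      output;
    end;
  end;
run;

/* One output row per gender/age combination, with COUNT and PERCENT */
proc freq data=sample noprint;
  tables gender*age / out=freq_output;
run;

/* Bars are stacked by group by default */
proc sgplot data=freq_output;
  format gender gender.;
  vbar age / response=percent group=gender grouporder=data;
  yaxis label="Percent of all observations";
run;
```

Note that VBAR treats age as a category, so an age with no observations simply gets no bar instead of leaving a gap as a histogram would.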

02-22-2017 09:57 AM

Building on Chris' example, here are three variations you can do with your grouping variable:

```
/* Labels for the grouping variable */
proc format;
  value gender 1="Male"
               2="Female"
  ;
run;

/* Simulated data: 100,000 integer ages for each gender (g = 1, 2) */
data sample (keep=age g);
  do g = 1 to 2;
    do i = 1 to 100000;
      age = abs ( floor ( rand('triangle',0.1) * 100 ) );
      output;
    end;
  end;
run;

ods graphics / width=1000px height=400px;

/* Variation 1: BY group, one separate graph per gender
   (the data must be sorted by g, which it is as generated) */
proc sgplot data=sample;
  format g gender.;
  by g;
  histogram age / binwidth=1 binstart=0 ;
  title "Age";
  xaxis values=(0 to 100 by 1) offsetmin=.01 offsetmax=.01 ;
run;

/* Variation 2: GROUP=, both genders overlaid in one semi-transparent graph */
proc sgplot data=sample;
  format g gender.;
  histogram age / binwidth=1 binstart=0 group=g transparency=0.5;
  title "Age";
  xaxis values=(0 to 100 by 1) offsetmin=.01 offsetmax=.01 ;
run;

/* Variation 3: SGPANEL, one panel per gender stacked in rows */
proc sgpanel data=sample;
  format g gender.;
  panelby g / layout=rowlattice novarname;
  histogram age / binwidth=1 binstart=0 ;
  title "Age";
  colaxis values=(0 to 100 by 1) offsetmin=.01 offsetmax=.01 ;
run;
```