A simple histogram - how hard can it be?

Posted 02-22-2017 07:21 AM
(7752 views)

I have been tearing my hair out with frustration for the last hour. I just want a histogram of my age distribution, with one bar per age (age is in whole years, no decimals in the data), showing values 0, 5, 10, etc. on the x-axis.

I am working from Enterprise Guide.

I cannot believe how many different procs there are to make histograms, each with different syntax. There is the

proc univariate data=mydata; histogram age; run;

or

proc gchart data=mydata; vbar age /discrete; run;

or

PROC SGPLOT DATA = mydata; HISTOGRAM age/binwidth=1 binstart=0; TITLE "Age"; xaxis values=(0 to 100 by 1); RUN;

etc.

The closest I have come is with SGPLOT, but the first bar is halfway hidden behind the Y axis. Does anyone know how to make the first bar visible?

Also, which proc would you use to generate a simple histogram like this? Is there something like a best practice? Are some of these older procs being phased out?

Accepted Solutions

You're correct that there are many ways to create a histogram -- you touched only on a few of them!

I think the simplest method is the one you tried last: PROC SGPLOT. Since you have 100 bins and the graph is only so wide, the algorithm to make everything fit might make your left-most extreme value seem very tight against the axis. You can use the OFFSETMIN= option to give a little more space. Try this:

```
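/* Simulated sample: 100,000 integer ages between 0 and 99 */
data sample (keep=age);
  do i = 1 to 100000;
    age = abs ( floor ( rand('triangle',0.1) * 100 ) );
    output;
  end;
run;

/* A wide graph leaves room for 100 one-year bins */
ods graphics / width=1000px height=400px;

proc sgplot data=sample;
  histogram age / binwidth=1 binstart=0 ;
  title "Age";
  /* OFFSETMIN=/OFFSETMAX= add a little space at each end of the axis */
  xaxis values=(0 to 100 by 1) offsetmin=.01 offsetmax=.01 ;
run;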
```

Thank you! I don't think I would have found that solution on my own.

Is there a simple way to split this chart by say gender? So that I get a stacked histogram?

Of course! Again, there are multiple methods, but now it sounds like you're more interested in a VBAR with a grouping variable than a classic histogram of a statistical distribution.

You could try PROC SGPANEL with a HISTOGRAM statement and PANELBY for gender (that would yield two histograms). Or you could use PROC FREQ to calc the percentages into a data set, then use a step like:

```
proc sgplot data=freq_output;
  vbar age / response=percent group=gender grouporder=data;
run;
```
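
For completeness, here is a minimal sketch of the PROC FREQ step that could feed the chart above. It assumes the input data set is called `sample` and contains `age` and `gender` variables; those names are placeholders, not taken from the original question. The OUT= data set from a two-way table holds one row per gender/age combination with `COUNT` and `PERCENT` variables, where `PERCENT` is the share of all observations, not the share within each gender.

```sas
/* Hypothetical input: data set SAMPLE with variables AGE and GENDER */
proc freq data=sample noprint;
  /* OUT= writes one row per gender*age cell, with COUNT and PERCENT */
  tables gender*age / out=freq_output;
run;
```

The resulting `freq_output` can then be passed to the VBAR step, which should stack the gender groups within each age bar by default.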

Building on Chris' example, here are three variations you can do with your grouping variable:

```
/* Format that labels the grouping variable */
proc format;
  value gender 1="Male"
               2="Female"
  ;
run;

/* Simulated sample: 100,000 integer ages for each of two groups */
data sample (keep=age g);
  do g = 1 to 2;
    do i = 1 to 100000;
      age = abs ( floor ( rand('triangle',0.1) * 100 ) );
      output;
    end;
  end;
run;

ods graphics / width=1000px height=400px;

/* Variation 1: a separate graph per group (BY needs data sorted by g) */
proc sgplot data=sample;
  format g gender.;
  by g;
  histogram age / binwidth=1 binstart=0 ;
  title "Age";
  xaxis values=(0 to 100 by 1) offsetmin=.01 offsetmax=.01 ;
run;

/* Variation 2: both groups overlaid in one graph, semi-transparent */
proc sgplot data=sample;
  format g gender.;
  histogram age / binwidth=1 binstart=0 group=g transparency=0.5;
  title "Age";
  xaxis values=(0 to 100 by 1) offsetmin=.01 offsetmax=.01 ;
run;

/* Variation 3: one panel per group, stacked in rows */
proc sgpanel data=sample;
  format g gender.;
  panelby g / layout=rowlattice novarname;
  histogram age / binwidth=1 binstart=0 ;
  title "Age";
  colaxis values=(0 to 100 by 1) offsetmin=.01 offsetmax=.01 ;
run;
```
