## Customizing the Histogram with SGPLOT

I am trying to produce a histogram like the one in the image (sorry for my artistic abilities). Is it possible? In the following data, I want the count or percentage of 'SEPALLENGTH' on the Y axis, and the X axis should show 'cat' (1 in the image). If that is not possible, then I would like a customized scale (as in the PROC UNIVARIATE code), with the bars filled and labeled with 'CAT' (2 in the image). Thank you for your help and suggestions.

``````
data iris;
set sashelp.iris;

if 		40<=SepalLength<45 then cat = "40<= SL <45";
else if	45<=SepalLength<50 then cat = "45<= SL <50";
else if	50<=SepalLength<55 then cat = "50<= SL <55";
else if	55<=SepalLength<60 then cat = "55<= SL <60";
else if	60<=SepalLength<65 then cat = "60<= SL <65";
else if	65<=SepalLength<70 then cat = "65<= SL <70";
else if	70<=SepalLength<75 then cat = "70<= SL <75";
else if sepallength >= 75 then cat = ">= 75";

run;

** graphs by univariate**;

proc univariate data=iris;
class species;
histogram sepallength / normal(color=blue)
ctext     = blue
midpoints = 40 to 80 by 5;
INSET N = 'Count' MEDIAN (8.2) MEAN (8.2) STD = 'Standard Deviation' (8.3)/ POSITION = ne; ;
run;
**************************;

**graphs by SGPLOT **;

proc means data=iris noprint;
class species;
var SepalLength;
output out=meanval mean=;
ways 1;
run;

data _null_;
set meanval;
if species = 'Setosa' then
call symput("SE_Mean", put(SepalLength, best6.));
if species = 'Versicolor' then
call symput("Ve_MEAN", put(SepalLength, best6.));
if species = 'Virginica' then
call symput("vi_MEAN", put(SepalLength, best6.));
run;

proc sgplot data=iris;
by species;
histogram SepalLength / group = species ;
inset (	"Setosa"="&SE_Mean"
"Versicolor"="&Ve_MEAN"
'Virginica'= "&vi_MEAN") / border title=" Species";
run;
``````

1 ACCEPTED SOLUTION

## Re: Customizing the Histogram with SGPLOT

Since your hand-drawn histogram does not show a "fit curve", are you sure that you want a histogram?

The more control you need over the width of the bars, the less likely it is that a HISTOGRAM plot is really what you want.

Please consider this example, which uses VBAR with a FORMAT to control the bar widths and to label the axis with the categories.

The BARWIDTH=1 option suppresses any space between adjacent categories.

The XAXIS VALUES= option forces the same x-axis values on every graph, so all of your graphs show the same range of values.

The FORMAT statement then uses the format to 1) group the values of the SepalLength variable and 2) label the axis.

```
proc format;
  /* Each range includes its lower bound and excludes its upper bound */
  value sepalcat
    40 -< 45  = "40<= SL <45"
    45 -< 50  = "45<= SL <50"
    50 -< 55  = "50<= SL <55"
    55 -< 60  = "55<= SL <60"
    60 -< 65  = "60<= SL <65"
    65 -< 70  = "65<= SL <70"
    70 -< 75  = "70<= SL <75"
    75 - high = ">= 75"
  ;
run;

data iris;
  set sashelp.iris;

  /* Character category from the original post; the SGPLOT step below
     relies on the SEPALCAT format instead of this variable */
  if      40 <= SepalLength < 45 then cat = "40<= SL <45";
  else if 45 <= SepalLength < 50 then cat = "45<= SL <50";
  else if 50 <= SepalLength < 55 then cat = "50<= SL <55";
  else if 55 <= SepalLength < 60 then cat = "55<= SL <60";
  else if 60 <= SepalLength < 65 then cat = "60<= SL <65";
  else if 65 <= SepalLength < 70 then cat = "65<= SL <70";
  else if 70 <= SepalLength < 75 then cat = "70<= SL <75";
  else if SepalLength >= 75      then cat = ">= 75";
run;

/* Graphs by PROC UNIVARIATE (unchanged from the question) */
proc univariate data=iris;
  class species;
  /* Midpoints 40, 45, ... do not line up with the categories above,
     whose centers are 42.5, 47.5, ... */
  histogram sepallength / normal(color=blue)
                          ctext     = blue
                          midpoints = 40 to 80 by 5;
  inset n = 'Count' median (8.2) mean (8.2) std = 'Standard Deviation' (8.3) / position = ne;
run;

/* Bar chart of the formatted categories, one graph per species */
proc sgplot data=iris;
  by species;
  vbar sepallength / group=species stat=percent barwidth=1;
  format sepallength sepalcat.;
  xaxis values=(42.5 to 77.5 by 5);
run;
```

You will learn that creating character-valued variables can cause problems with graphing and reporting. The default displays will often be in formatted value order, which may not match the order of the underlying numeric values. That causes some confusion, or occasionally requires convoluted code to get the natural intended order to display.
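As a small illustration of that point, here is a minimal sketch that groups the numeric variable with a format rather than a character variable. It assumes the same `sepalcat` ranges as above and uses PROC FREQ only as a convenient way to see the order; with the default ORDER=INTERNAL the rows should follow the underlying numeric values, not the text of the labels.

```sas
/* Sketch: the format supplies both the grouping and the labels,
   while the variable itself stays numeric. */
proc format;
  value sepalcat
    40 -< 45  = "40<= SL <45"
    45 -< 50  = "45<= SL <50"
    50 -< 55  = "50<= SL <55"
    55 -< 60  = "55<= SL <60"
    60 -< 65  = "60<= SL <65"
    65 -< 70  = "65<= SL <70"
    70 -< 75  = "70<= SL <75"
    75 - high = ">= 75"
  ;
run;

/* One row per formatted category, ordered by the internal numeric value */
proc freq data=sashelp.iris;
  tables SepalLength;
  format SepalLength sepalcat.;
run;
```

No extra character variable or DATA step is needed, and changing the ranges means editing the format in one place.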

BTW, you do realize that your univariate scale of midpoints 40 to 80 by 5 does not match the categories you created, don't you? Your categories use 42.5, 47.5, etc. as midpoints, not 40, 45, 50, ...
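If the goal is for the PROC UNIVARIATE bins to line up with those categories, one option is to shift the midpoints by half a bin width. This is a sketch under the assumption that the default behavior applies, where each interval includes its left endpoint (so the first bin covers 40 up to, but not including, 45):

```sas
/* Sketch: bins centered on 42.5, 47.5, ..., 77.5 with width 5,
   i.e. 40-<45, 45-<50, ..., 75-<80, matching the category ranges. */
proc univariate data=sashelp.iris;
  class Species;
  histogram SepalLength / normal(color=blue)
                          midpoints = 42.5 to 77.5 by 5;
run;
```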

## Re: Customizing the Histogram with SGPLOT

Thank you @ballardw

1. I noticed the problem with the univariate scale once I posted the question. I saw the note in the log.

2. My requirement was to display the categories on the x-axis as the ranges of values we defined. For now, I think it will be OK. Thank you for the code.

3. Can a fit curve be drawn over the bars, like the fit curve we get with PROC UNIVARIATE?

4. I tried the 'fitpolicy' option, but the labels come out at an angle of about 120 degrees. I want the opposite direction, an angle of about 240 degrees. I tried the rotate option, but it did not work. Is it possible?

Ksharp (Super User)
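For points 3 and 4, here is a rough sketch rather than a confirmed solution from this thread. It assumes a SAS 9.4 release in which the XAXIS statement supports VALUESROTATE=, and it assumes that BINSTART= gives the midpoint of the first bin; both are worth checking against the SGPLOT documentation for your release. The first step overlays a normal curve on a true histogram with the DENSITY statement. The second step asks for diagonal tick labels in the other direction with DIAGONAL2; whether that corresponds to the angle you have in mind is something to verify on your own output.

```sas
/* Point 3 (sketch): histogram with a fitted normal curve, per species.
   binstart=42.5 with binwidth=5 is intended to give bins 40-45, 45-50, ... */
proc sgplot data=sashelp.iris;
  by Species;
  histogram SepalLength / binstart=42.5 binwidth=5 scale=percent;
  density SepalLength / type=normal;
run;

/* Point 4 (sketch): same format-based bar chart as in the accepted answer,
   with the rotation direction of the tick labels switched. */
proc format;
  value sepalcat
    40 -< 45  = "40<= SL <45"
    45 -< 50  = "45<= SL <50"
    50 -< 55  = "50<= SL <55"
    55 -< 60  = "55<= SL <60"
    60 -< 65  = "60<= SL <65"
    65 -< 70  = "65<= SL <70"
    70 -< 75  = "70<= SL <75"
    75 - high = ">= 75"
  ;
run;

proc sgplot data=sashelp.iris;
  vbar SepalLength / stat=percent barwidth=1;
  format SepalLength sepalcat.;
  /* Labels are rotated only when they would otherwise collide */
  xaxis fitpolicy=rotate valuesrotate=diagonal2;
run;
```

Note that the density curve belongs with the HISTOGRAM statement; the VBAR chart treats the x-axis as discrete categories, so a fitted curve does not carry over to it directly.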

## Re: Customizing the Histogram with SGPLOT

Thank you, @Ksharp. The articles helped me achieve what I was looking for. Thank you, @ballardw. This post is now answered.
