SASuserlot
Barite | Level 11

I am trying to produce a histogram like the one in the image (apologies for my artistic abilities). Is it possible? In the following data, I want the count or percentage of 'SEPALLENGTH' on the Y axis, and the X axis should show the 'cat' values (1 in the image). If that is not possible, then I would like a customized scale (as in the 'Univariate' code), with the bars filled and labeled with 'CAT' (2 in the image). Thank you for your help and suggestions.

SASuserlot_0-1687639858575.png

data iris;
set sashelp.iris;

if      40<=SepalLength<45 then cat = "40<= SL <45";
else if 45<=SepalLength<50 then cat = "45<= SL <50";
else if 50<=SepalLength<55 then cat = "50<= SL <55";
else if 55<=SepalLength<60 then cat = "55<= SL <60";
else if 60<=SepalLength<65 then cat = "60<= SL <65";
else if 65<=SepalLength<70 then cat = "65<= SL <70";
else if 70<=SepalLength<75 then cat = "70<= SL <75";
else if SepalLength >= 75  then cat = ">= 75";
run;

** graphs by univariate**;

proc univariate data=iris;
class species;
   histogram sepallength / normal(color=blue)
                           ctext     = blue
                           midpoints = 40 to 80 by 5;
   inset n = 'Count' median (8.2) mean (8.2) std = 'Standard Deviation' (8.3) / position = ne;
run;
**************************;

**graphs by SGPLOT **;

proc means data=iris noprint;
class species;
var SepalLength;
output out=meanval mean=;
ways 1;
run;

data _null_;
set meanval;
if species = 'Setosa' then
call symput("SE_Mean", put(SepalLength, best6.));
if species = 'Versicolor' then
call symput("Ve_MEAN", put(SepalLength, best6.));
if species = 'Virginica' then
call symput("vi_MEAN", put(SepalLength, best6.));
run;

proc sgplot data=iris;
by species;
histogram SepalLength / group = species ;
inset (	"Setosa"="&SE_Mean" 
		"Versicolor"="&Ve_MEAN"
		'Virginica'= "&vi_MEAN") / border title=" Species";
run;

 

 

Accepted Solution
ballardw
Super User

Since your hand-drawn histogram does not show a fit curve, are you sure that you want a histogram?

The more control you need over the width of the bars, the less likely it is that a HISTOGRAM plot is really what you want.

Please consider this example, which uses VBAR and a FORMAT to control the bar widths and to label the axis with the categories.

The option BARWIDTH=1 suppresses any space between adjacent categories.

The VALUES= option on the XAXIS statement forces every graph to use the same x-axis values, so your graphs show the same range.

The FORMAT statement then uses the format to 1) group the values of the SepalLength variable and 2) label the axis.

proc format ;
value sepalcat
40-<45 = "40<= SL <45"
45-<50 = "45<= SL <50"
50-<55 = "50<= SL <55"
55-<60 = "55<= SL <60"
60-<65 = "60<= SL <65"
65-<70 = "65<= SL <70"
70-<75 = "70<= SL <75"
75-high = ">= 75"
;
run;


data iris;
set sashelp.iris;

if      40<=SepalLength<45 then cat = "40<= SL <45";
else if 45<=SepalLength<50 then cat = "45<= SL <50";
else if 50<=SepalLength<55 then cat = "50<= SL <55";
else if 55<=SepalLength<60 then cat = "55<= SL <60";
else if 60<=SepalLength<65 then cat = "60<= SL <65";
else if 65<=SepalLength<70 then cat = "65<= SL <70";
else if 70<=SepalLength<75 then cat = "70<= SL <75";
else if SepalLength >= 75  then cat = ">= 75";
run;

** graphs by univariate**;

proc univariate data=iris;
class species;
   histogram sepallength / normal(color=blue)
                           ctext     = blue
                           midpoints = 40 to 80 by 5;
   inset n = 'Count' median (8.2) mean (8.2) std = 'Standard Deviation' (8.3) / position = ne;
run;


proc sgplot data=iris;
   by species;
   vbar sepallength /group=species stat=percent barwidth=1;
   format sepallength sepalcat.;
   xaxis  values=(42.5 to 77.5 by 5);
run;

You will find that creating character-valued variables can cause problems in graphing and reporting. The default displays are often in formatted value order, which may not match the order of the underlying numeric values. That causes confusion, or occasionally requires convoluted code to get the natural intended order to display.

 

BTW, you do realize that your UNIVARIATE scale of midpoints 40 to 80 by 5 does not match the categories you created, don't you? Your categories use 42.5, 47.5, etc. as midpoints, not 40, 45, 50, ...
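
To make the mismatch concrete, here is a minimal sketch of a UNIVARIATE call whose bins line up with the categories. It assumes the same 5-unit bins starting at 40, and it reads sashelp.iris directly rather than the derived iris data set. With midpoints of 42.5, 47.5, and so on, each bin spans 40 to 45, 45 to 50, etc. As far as I know, the left endpoint of each interval is included by default, which agrees with conditions such as 45<=SepalLength<50.

```sas
/* Sketch: align the UNIVARIATE bins with the 5-unit categories.        */
/* Midpoints 42.5, 47.5, ..., 77.5 give bins 40-45, 45-50, ..., 75-80.  */
proc univariate data=sashelp.iris;
   class species;
   histogram SepalLength / normal(color=blue)
                           midpoints = 42.5 to 77.5 by 5;
   inset n = 'Count' mean (8.2) / position = ne;
run;
```

With the original midpoints of 40 to 80 by 5, the bins would instead run 37.5 to 42.5, 42.5 to 47.5, and so on, so a value such as 44 lands in a bin that straddles two of the categories.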

 

 

SASuserlot
Barite | Level 11

Thank you, @ballardw.

1. I realized the UNIVARIATE scale mismatch once I posted the question; I saw the note in the log.

2. My requirement was to display the categories as the ranges of values we gave, on the x-axis labels. For now, I think this will be OK. Thank you for the code.

3. Can a fit curve be drawn over the bars, like the fit curve we get with UNIVARIATE?

4. I tried the 'fitpolicy' option, but the labels come out at an angle of about 120 degrees. I want the opposite direction, about 240 degrees. I tried the rotate option, but it did not work. Is it possible?
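
For points 3 and 4, one possible approach is sketched below. It is not taken from the replies in this thread. It assumes a SAS 9.4 release in which the XAXIS statement supports VALUESROTATE=, and it redefines the sepalcat format from the accepted solution so that the code runs on its own. The first step overlays a normal curve on an SGPLOT histogram using the DENSITY statement; the second tilts the tick labels of the bar chart in the opposite diagonal direction.

```sas
/* Format from the accepted solution, repeated so this sketch is self-contained */
proc format;
   value sepalcat
      40-<45  = "40<= SL <45"
      45-<50  = "45<= SL <50"
      50-<55  = "50<= SL <55"
      55-<60  = "55<= SL <60"
      60-<65  = "60<= SL <65"
      65-<70  = "65<= SL <70"
      70-<75  = "70<= SL <75"
      75-high = ">= 75";
run;

/* Point 3: histogram with a normal fit curve drawn over the bars.   */
/* BINSTART= is the midpoint of the first bin, so bins are 40-45 etc. */
proc sgplot data=sashelp.iris;
   by species;
   histogram SepalLength / binstart=42.5 binwidth=5;
   density SepalLength / type=normal;
run;

/* Point 4: bar chart with tick labels tilted the other way.          */
/* VALUESROTATE= only takes effect with a rotating fit policy, and    */
/* FITPOLICY=ROTATE rotates labels only when they would collide.      */
proc sgplot data=sashelp.iris;
   by species;
   format SepalLength sepalcat.;
   vbar SepalLength / stat=percent barwidth=1;
   xaxis fitpolicy=rotate valuesrotate=diagonal2;
run;
```

The DENSITY statement pairs with HISTOGRAM rather than VBAR, which is why the fit curve is shown on a true histogram here instead of on the formatted bar chart.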

SASuserlot
Barite | Level 11

Thank you, @Ksharp. The articles helped me achieve what I was looking for. Thank you, @ballardw. This post is now answered.
