Rajaram
Obsidian | Level 7

I want to write SAS code that is equivalent to the R code below. I am not an expert in R. In particular, I am trying to understand how the y-axis value is calculated. How can I derive it in SAS, or is there a SAS function that returns this value?

 

mpg %>% ggplot(aes(displ,fill = factor(cyl), color = factor(cyl))) +
geom_histogram(aes(y = 100*after_stat(width*density)),
binwidth = 2.5,
position = position_dodge())
Accepted Solutions
ballardw
Super User

Caveat: I don't speak R, so I have no real understanding of that code.

 

I have to assume that you have a SAS data set. So I would suggest first running Proc SGPLOT with all the defaults for a histogram:

 

 

proc sgplot data=yourdatasetname;
histogram variable;
run;

And see how well that works for what you want.

 

When I see something like "y = 100*after_stat" in code, I suspect that percentages are involved and that the multiplication converts a decimal such as .3 to 30 percent. If that is the case, then the option SCALE=PERCENT may be appropriate:

proc sgplot data=yourdatasetname;
histogram variable  /   scale=percent;
run;
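
For what it's worth, here is a sketch of the arithmetic behind that guess. It assumes ggplot2's documented definition of density (the bin count scaled so the bars integrate to 1) and that the statistic is computed separately for each group, here each value of cyl. The symbols are mine, not from the R code.

```latex
% Assumption: ggplot2 computes `density` separately for each group g.
% n_{g,b}: observations of group g that fall in bin b
% N_g:     all observations of group g
% w:       bin width (2.5 in the R code)
\[
\text{density}_{g,b} = \frac{n_{g,b}}{N_g\, w}
\qquad\Longrightarrow\qquad
y_{g,b} = 100 \cdot w \cdot \text{density}_{g,b} = 100 \cdot \frac{n_{g,b}}{N_g}
\]
```

Under those assumptions the y value is the percent of each group's observations that fall in the bin. A plain HISTOGRAM statement with SCALE=PERCENT and no grouping would divide by all observations instead, so the two are not necessarily the same number.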

SAS has two basic options for setting the bin widths: NBINS= and BINWIDTH=. The first sets the number of bins; the second sets the width of each bin in the units of the variable. If you want 100 bins:

proc sgplot data=yourdatasetname;
histogram variable  /   scale=percent
                        nbins=100
;
run;

If you want each bin to span 2.5 units (of whatever is measured):

proc sgplot data=yourdatasetname;
histogram variable  /   scale=percent
                        binwidth=2.5
;
run;

Look in the online help for other options and experiment.

 

If you need help with options then be prepared to provide example data as a working data step and example code that got "close" to what you want. Then describe what is needed to modify the plot.

DanH_sas
SAS Super FREQ

Can you run this and post a picture?

Rajaram
Obsidian | Level 7

Rajaram_0-1724928059658.png

Attached is the output from R. I have also attached data from R, my manual calculation, and the Excel output.

PaigeMiller
Diamond | Level 26

Most of us will not download Excel files, as they are a security threat. We need to have data provided as working SAS data step code (examples and instructions).

 

Also, what is the Y axis? Is it really "percent"? Or is it something else?

--
Paige Miller
Rajaram
Obsidian | Level 7

I have attached SAS dataset and R output in one of the replies.

PaigeMiller
Diamond | Level 26

I am unable to download your SAS data set because of firewall restrictions here at work. So, repeating: We need to have data provided as working SAS data step code (examples and instructions).

 

Please answer my question about the Y-axis.

--
Paige Miller
ballardw
Super User

@Rajaram wrote:

I have attached SAS dataset and R output in one of the replies.


Your data set has two variables, dspl and trt. Dspl has 35 different values, so I think you have skipped the step that explains how you reduce 35 levels to what appear to be 3 for the plot (and, to tell the truth, the x axis on that plot looks pretty odd given the way the values are displayed).

 

If you are grouping Dspl by ranges of values, that is typically done with a format in SAS.

Your description didn't say how to calculate what appears to be a percentage, so I went with what I think you wanted and used Proc Freq, one of the basic tools for counts and percentages.

proc format library=work;
value dspl
low - 3 = '[1.6, 3]'
3 <-  5 = '(3, 5]'
5 <-high= '>5'
;
run;
proc freq data= tmp1.tmpg noprint;
   tables trt*dspl/outpct out=work.summary;
   format dspl dspl. ;
run;


proc sgplot data=work.summary;
   vbar dspl  / group=trt  groupdisplay=cluster
                response=pct_row
   ;
   label dspl='Displacement'
         trt='Treatment'
         pct_row='Percent of Treatment'
   ;
run;

Hopefully the ranges of the Dspl format in the Proc Format step are easy enough to adjust to your needs, since your plot from R leaves out a great deal of detail about what was done. I label each range with its actual boundaries. For those who don't remember their math courses: [ or ] means the endpoint is included in the range, while ( or ) means the endpoint is not included but values arbitrarily close to it are. For example, (3, 5] means greater than 3 and less than or equal to 5.

 

Note the use of the LABEL statement to place meaningful text on the axes and in the legend.

 

Unless everyone seeing this graph will immediately understand what value(s) TRT1 indicates I would also create a format to display meaningful text instead of trt1, trt2 etc. in the legend.
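
A minimal sketch of such a format, assuming trt is a character variable whose values are exactly trt1, trt2 and trt3 (character formats are case-sensitive). The format name $trtlbl and the label text are placeholders, not taken from your data.

```sas
/* Assumption: trt is character with values 'trt1', 'trt2', 'trt3'.       */
/* $trtlbl and the label text are hypothetical; substitute the real names. */
proc format library=work;
   value $trtlbl
      'trt1' = 'First treatment'
      'trt2' = 'Second treatment'
      'trt3' = 'Third treatment'
   ;
run;

proc sgplot data=work.summary;
   vbar dspl / group=trt groupdisplay=cluster
               response=pct_row
   ;
   format trt $trtlbl.;   /* legend shows the formatted text instead of trt1, trt2, ... */
run;
```

If trt turns out to be numeric, the same idea applies with a numeric format (no $ prefix and unquoted values on the left).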

Rajaram
Obsidian | Level 7

Thank you. I changed the format, and the result matched my Excel output.

 

proc format library=work;
value dspl
low - 3.25 = '2.5'
3.25 <-  6.25 = '5'
6.25 <-high= '7.5'
;
run;

proc freq data= data.tmpg noprint;
   tables trt*dspl/outpct out=work.summary;
   format dspl dspl. ;
run;


proc sgplot data=work.summary;
   vbar dspl  / group=trt  groupdisplay=cluster
                response=pct_row
   ;
   label dspl='Displacement'
         trt='Treatment'
         pct_row='Percent of Treatment'
   ;
run;
