I want to write SAS code equivalent to the R code below. I am not an expert in R. In particular, I am trying to understand how the y-axis value is calculated. How do we derive it in SAS, or is there a SAS function that returns this value?
mpg %>% ggplot(aes(displ,fill = factor(cyl), color = factor(cyl))) + geom_histogram(aes(y = 100*after_stat(width*density)), binwidth = 2.5, position = position_dodge())
Caveat: I don't speak R, so I have no real understanding of that code.
I have to assume that you have a SAS data set. So I would suggest first running Proc SGPLOT with all the defaults for a histogram:
proc sgplot data=yourdatasetname; histogram variable; run;
And see how well that works for what you want.
When I see something like "y = 100*after_stat" in code, I suspect that percentages are involved and that this converts a decimal such as .3 to 30 percent. If that is the case, then the option SCALE=PERCENT may be appropriate:
proc sgplot data=yourdatasetname; histogram variable / scale=percent; run;
SAS has two basic options for setting the bins: NBINS= or BINWIDTH=. The first sets the number of bins; the other sets the width of each bin, in units of the variable. If you want 100 bins:
proc sgplot data=yourdatasetname; histogram variable / scale=percent nbins=100 ; run;
If you want each bin to span 2.5 units (of whatever is measured):
proc sgplot data=yourdatasetname; histogram variable / scale=percent binwidth=2.5 ; run;
Look in the online help for other options and experiment.
If you need help with options then be prepared to provide example data as a working data step and example code that got "close" to what you want. Then describe what is needed to modify the plot.
Can you run this and post a picture?
Attached is the output from R. I have also attached data from R, my manual calculation, and the Excel output.
Most of us will not download Excel files, as they are a security threat. We need to have data provided as working SAS data step code (examples and instructions).
Also, what is the Y axis? Is it really "percent"? Or is it something else?
I have attached SAS dataset and R output in one of the replies.
I am unable to download your SAS data set because of firewall restrictions here at work. So, repeating: We need to have data provided as working SAS data step code (examples and instructions).
Please answer my question about the Y-axis.
@Rajaram wrote:
I have attached SAS dataset and R output in one of the replies.
Your data set has two variables, dspl and trt. Dspl has 35 different values, so I think a step has been skipped: you have not said how those 35 levels are reduced to the apparent 3 on the plot (and to tell the truth, the x axis on that plot looks pretty odd given the way the values are displayed).
If you are grouping Dspl by ranges of values, that is typically done with a format in SAS.
Your description didn't include how to calculate what appears to be a percentage, so I went with what I think you wanted and used Proc Freq, one of the basic tools for counting and percentage calculations.
proc format library=work;
value dspl
low - 3 = '[1.6, 3]'
3 <- 5 = '(3, 5]'
5 <-high= '>5'
;
run;
proc freq data= tmp1.tmpg noprint;
tables trt*dspl/outpct out=work.summary;
format dspl dspl. ;
run;
proc sgplot data=work.summary;
vbar dspl / group=trt groupdisplay=cluster
response=pct_row
;
label dspl='Displacement'
trt='Treatment'
pct_row='Percent of Treatment'
;
run;
Hopefully the ranges of the Dspl format in the Proc Format are easy enough to adjust to your needs, as your plot from R leaves out a great many details about what was done. I display the ranges as labels to delineate the actual ranges of the values. For those who don't remember their math courses: [ or ] indicates the value is included in the range, while ( or ) indicates the value is not included in the range but can be approached as closely as you want. For example, (3, 5] means greater than 3 and less than or equal to 5.
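To confirm which label a given value receives, a quick check along the following lines may help. This is a minimal sketch that redefines the same DSPL format so it can run on its own. The test values and the variable names x and bin_label are made up for illustration and are not taken from the posted data.

```sas
/* Same ranges as the DSPL format discussed above */
proc format library=work;
  value dspl
    low -  3   = '[1.6, 3]'
    3  <-  5   = '(3, 5]'
    5  <- high = '>5'
  ;
run;

/* Illustrative values only: write each value and its label to the log */
data _null_;
  do x = 1.6, 3, 3.1, 5, 5.4;
    bin_label = put(x, dspl.);
    put x= bin_label=;
  end;
run;
```

The boundary values 3 and 5 should land in '[1.6, 3]' and '(3, 5]' respectively, because the <- operator excludes the lower endpoint of a range.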
Note the use of LABEL statement to place meaningful text in the Axis and legend.
Unless everyone seeing this graph will immediately understand what value(s) TRT1 indicates I would also create a format to display meaningful text instead of trt1, trt2 etc. in the legend.
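A sketch of such a legend format is shown here. It assumes that TRT is a character variable holding values such as 'trt1' and 'trt2' in exactly that case, which has not been confirmed in this thread. The format name $TRTLBL and the label text are placeholders to be replaced with the real treatment descriptions.

```sas
/* Hypothetical labels: replace with what trt1, trt2 actually mean */
proc format library=work;
  value $trtlbl
    'trt1' = 'Treatment A (placeholder)'
    'trt2' = 'Treatment B (placeholder)'
  ;
run;

/* Same plot as before, with the format applied to the group variable */
proc sgplot data=work.summary;
  vbar dspl / group=trt groupdisplay=cluster response=pct_row;
  format trt $trtlbl.;
  label dspl='Displacement'
        trt='Treatment'
        pct_row='Percent of Treatment';
run;
```

Values of TRT that are not listed in the format are displayed unchanged, so a mismatch in spelling or case shows up as an unformatted legend entry.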
Thank you. I changed the format, and the result now matches my Excel output.
proc format library=work;
value dspl
low - 3.25 = '2.5'
3.25 <- 6.25 = '5'
6.25 <-high= '7.5'
;
run;
proc freq data= data.tmpg noprint;
tables trt*dspl/outpct out=work.summary;
format dspl dspl. ;
run;
proc sgplot data=work.summary;
vbar dspl / group=trt groupdisplay=cluster
response=pct_row
;
label dspl='Displacement'
trt='Treatment'
pct_row='Percent of Treatment'
;
run;
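On the original y-axis question: in ggplot2, the computed density for a bin is the bin count divided by the group total times the bin width, so 100*width*density reduces to 100 * count / group total, the percent of each group that falls in the bin. That appears to be the same quantity as PCT_ROW from the TABLES trt*dspl / OUTPCT request. A quick check, assuming WORK.SUMMARY was created by the Proc Freq step above and has no missing values, is to sum PCT_ROW within each treatment. The column name total_pct is just an illustrative alias.

```sas
/* Each treatment's row percentages should add up to 100 */
proc sql;
  select trt,
         sum(pct_row) as total_pct format=8.2
  from work.summary
  group by trt;
quit;
```

If every treatment totals 100, the bar heights are percentages within treatment rather than percentages of the whole data set.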