Rajaram
Obsidian | Level 7

I want to write SAS code that is equivalent to the R code below. I am not an expert in R. In particular, I am trying to understand how the y-axis value is calculated. How do we derive it in SAS, or is there a function in SAS that returns this value?

 

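library(ggplot2)   # provides the mpg data set, ggplot(), and after_stat()
library(magrittr)  # provides the %>% pipe

mpg %>% ggplot(aes(displ, fill = factor(cyl), color = factor(cyl))) +
  geom_histogram(aes(y = 100 * after_stat(width * density)),
                 binwidth = 2.5,
                 position = position_dodge())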

Accepted Solutions
ballardw
Super User

Caveat: I don't speak R, so I have no real understanding of that code.

 

I have to assume that you have a SAS data set. So I would suggest first running Proc SGPLOT with all the defaults for a histogram:

 

 

proc sgplot data=yourdatasetname;
histogram variable;
run;

And see how well that works for what you want.

 

When I see something like "y = 100*after_stat" in code, I suspect that percentages are involved and that this is converting a decimal such as .3 to 30 percent. If that is the case, then perhaps the option SCALE=PERCENT would be appropriate:

proc sgplot data=yourdatasetname;
histogram variable  /   scale=percent;
run;
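
A note on the arithmetic behind that guess: in ggplot2's binning statistic, density for a bin is count / (n * width), where n is the number of observations in the group, so width * density reduces to count / n, and the factor of 100 turns it into a percent within the group. A minimal sketch of that calculation, with made-up numbers (count, n, and width below are hypothetical, not taken from the poster's data):

```sas
data _null_;
   count = 30;    /* hypothetical: observations falling in one bin  */
   n     = 120;   /* hypothetical: observations in the whole group  */
   width = 2.5;   /* bin width, as in binwidth = 2.5                */
   density = count / (n * width);    /* density as ggplot2 defines it */
   pct     = 100 * width * density;  /* same as 100 * count / n       */
   put pct=;      /* writes pct=25 to the log */
run;
```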

SAS has two basic options for setting the bin widths: NBINS= and BINWIDTH=. The first sets the number of bins; the other sets the width of each bin, in units of the variable. If you want 100 bins:

proc sgplot data=yourdatasetname;
histogram variable  /   scale=percent
                        nbins=100
;
run;

If you want each bin to span 2.5 units (of whatever is measured):

proc sgplot data=yourdatasetname;
histogram variable  /   scale=percent
                        binwidth=2.5
;
run;
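
If the goal is one histogram per group, as the fill = factor(cyl) in the R code suggests, a possible starting point is the GROUP= option of the HISTOGRAM statement. This is only a sketch: it assumes a SAS 9.4 maintenance release in which HISTOGRAM accepts GROUP=, it uses SASHELP.CARS with EngineSize and Origin as stand-ins for the real variables, and it overlays the groups rather than placing them side by side the way position_dodge() does.

```sas
/* Sketch only: SASHELP.CARS, EngineSize, and Origin are stand-ins */
proc sgplot data=sashelp.cars;
   histogram enginesize / group=origin      /* one histogram per group     */
                          scale=percent     /* y axis in percent           */
                          binwidth=2.5      /* each bin spans 2.5 units    */
                          transparency=0.5  /* overlaid bars stay visible  */
   ;
run;
```

It would be worth comparing the bar heights against PROC FREQ output to confirm that the percentages are computed within each group.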

Look in the online help for other options and experiment.

 

If you need help with options then be prepared to provide example data as a working data step and example code that got "close" to what you want. Then describe what is needed to modify the plot.

8 REPLIES 8
DanH_sas
SAS Super FREQ

Can you run this and post a picture?

Rajaram
Obsidian | Level 7

[Attached image: histogram output from R]

Attached is the output from R. I have also attached data from R, my manual calculation, and the Excel output.

PaigeMiller
Diamond | Level 26

Most of us will not download Excel files, as they are a security threat. We need to have data provided as working SAS data step code (examples and instructions).

 

Also, what is the Y axis? Is it really "percent"? Or is it something else?
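
As a rough illustration of what "data step code" means here, a sketch like the following lets anyone recreate the data by pasting it into a SAS session. The data set name work.have, the variable names dspl and trt, and every value shown are placeholders, not the real data.

```sas
/* Placeholder example: names and values are hypothetical */
data work.have;
   input dspl trt $;   /* numeric dspl, character trt */
   datalines;
1.8 trt1
2.0 trt1
3.5 trt2
5.7 trt2
;
run;
```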

--
Paige Miller

Rajaram
Obsidian | Level 7

I have attached SAS dataset and R output in one of the replies.

PaigeMiller
Diamond | Level 26

I am unable to download your SAS data set because of firewall restrictions here at work. So, repeating: We need to have data provided as working SAS data step code (examples and instructions).

 

Please answer my question about the Y-axis.

--
Paige Miller
ballardw
Super User

@Rajaram wrote:

I have attached SAS dataset and R output in one of the replies.


Your data set has two variables, dspl and trt. Dspl has 35 different values. So I think something has been skipped in explaining how you reduce 35 levels to apparently 3 for the plot (and, to tell the truth, the x axis on the plot looks pretty odd given the way the values are displayed).

 

If you are grouping Dspl by ranges of values that is typically done with a format in SAS.

Your description didn't say how to calculate what appears to be a percentage, so I guessed at what you wanted and used Proc Freq, one of the basic tools for count and percentage calculations.

proc format library=work;
value dspl
low - 3 = '[1.6, 3]'
3 <-  5 = '(3, 5]'
5 <-high= '>5'
;
run;
proc freq data= tmp1.tmpg noprint;
   tables trt*dspl/outpct out=work.summary;
   format dspl dspl. ;
run;


proc sgplot data=work.summary;
   vbar dspl  / group=trt  groupdisplay=cluster
                response=pct_row
   ;
   label dspl='Displacement'
         trt='Treatment'
         pct_row='Percent of Treatment'
   ;
run;
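
To see what the plot is drawing, it can help to print the summary data set. With the OUTPCT option, Proc Freq writes COUNT, PERCENT, PCT_ROW and PCT_COL to the OUT= data set. Since TRT is the row variable in the TABLES request, PCT_ROW should be 100 * COUNT divided by the total count for that TRT level. A quick check, assuming work.summary was created by the Proc Freq step above:

```sas
/* Inspect the counts and within-treatment percentages behind the bars */
proc print data=work.summary noobs;
   var trt dspl count pct_row;
run;
```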

Hopefully the ranges of the Dspl format in the Proc Format step are easy enough to adjust to your needs, as your plot from R leaves a great deal of detail missing as to what was done. I display the ranges in the labels to delineate the actual ranges of the values. For those who don't remember their math courses: [ or ] indicates that the endpoint is included in the range; ( or ) indicates that the endpoint is not included, though values can come as close to it as you want. For example, (3, 5] means greater than 3 and less than or equal to 5.

 

Note the use of the LABEL statement to place meaningful text on the axis and in the legend.

 

Unless everyone seeing this graph will immediately understand what value(s) TRT1 indicates I would also create a format to display meaningful text instead of trt1, trt2 etc. in the legend.
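
A minimal sketch of such a format, assuming trt is a character variable whose values are spelled exactly 'trt1' and 'trt2'. The format name $trtlbl and the label text are hypothetical, so substitute whatever the treatments really are:

```sas
/* Hypothetical labels: adjust the values and text to match the data */
proc format library=work;
   value $trtlbl
      'trt1' = 'Treatment 1 (describe here)'
      'trt2' = 'Treatment 2 (describe here)'
   ;
run;

proc sgplot data=work.summary;
   vbar dspl / group=trt groupdisplay=cluster
               response=pct_row
   ;
   format trt $trtlbl.;   /* legend shows the labels instead of trt1, trt2 */
run;
```

Values of trt that the format does not list are displayed unchanged.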

Rajaram
Obsidian | Level 7

Thank you. I changed the format and it matched my Excel output.

 

proc format library=work;
value dspl
low - 3.25 = '2.5'
3.25 <-  6.25 = '5'
6.25 <-high= '7.5'
;
run;

proc freq data= data.tmpg noprint;
   tables trt*dspl/outpct out=work.summary;
   format dspl dspl. ;
run;


proc sgplot data=work.summary;
   vbar dspl  / group=trt  groupdisplay=cluster
                response=pct_row
   ;
   label dspl='Displacement'
         trt='Treatment'
         pct_row='Percent of Treatment'
   ;
run;
