If you just want a HISTOGRAM, why are you running PROC UNIVARIATE instead of the appropriate graphics procedure, such as PROC SGPLOT with the HISTOGRAM statement?
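For reference, a minimal sketch of the PROC SGPLOT route, assuming the SASHELP.HEART sample data set and its WEIGHT variable (the same data used later in this thread). The NBINS= value is an arbitrary illustration, not a recommendation:

```sas
/* Draw a histogram directly with the graphics procedure.        */
/* NBINS= requests a bin count; SGPLOT derives the bin width.    */
proc sgplot data=sashelp.heart;
  histogram weight / nbins=20;
run;
```

Omitting NBINS= lets SGPLOT choose the binning on its own.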
What do you mean by "table"?
It sounds like you want to use ODS OUTPUT to convert this TABLE (tabular report) in the output of PROC UNIVARIATE into a DATASET?
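If that is the goal, a sketch of the ODS OUTPUT approach might look like the following. It assumes the table in question is the one named HistogramBins, which PROC UNIVARIATE produces when the MIDPERCENTS option is given on the HISTOGRAM statement; the output data set name WORK.BINS is made up for this example:

```sas
/* Capture the HistogramBins ODS table as a data set (work.bins is a hypothetical name). */
ods output HistogramBins=work.bins;

proc univariate data=sashelp.heart;
  var weight;
  histogram weight / midpercents;   /* MIDPERCENTS makes the bin table appear */
run;

/* Inspect the captured bins. */
proc print data=work.bins;
run;
```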
And what is your question or your goal?
Do you want to know if there is a way to change PROC UNIVARIATE so that it produces a different number of bins?
Do you want to understand how PROC UNIVARIATE decides how many bins are required?
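If the aim is simply to change the number of bins, one option is to set the bin midpoints explicitly with MIDPOINTS= on the HISTOGRAM statement. The sketch below assumes SASHELP.HEART; the range 60 to 300 by 20 is an illustrative choice meant to span the WEIGHT values, not something taken from the original question:

```sas
/* Override the automatic binning by listing the bin midpoints.   */
/* Bins are centered at 60, 80, ..., 300, so each is 20 wide.     */
proc univariate data=sashelp.heart;
  var weight;
  histogram weight / midpoints=60 to 300 by 20;
run;
```

ENDPOINTS= works the same way but specifies the bin edges instead of the centers.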
@michal_1407 wrote:
Hi,
I want to understand how PROC UNIVARIATE decides how many bins are required.
Only this
PROC UNIVARIATE has been around for a very long time. The first time I used it was 1987, and it wasn't new then, so there are a great many options available.
If you check the online references you will likely see repeated references to
the procedure computes the midpoints by using an algorithm (Terrell and Scott 1985)
which in the references listed becomes:
Terrell, G. R., and Scott, D. W. (1985). “Oversmoothed Nonparametric Density Estimates.” Journal of the American Statistical Association 80:209–214.
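As a rough check against that paper, the Terrell and Scott result is usually summarized as a lower bound on the number of bins of about the cube root of 2n, where n is the number of nonmissing values. The sketch below computes that bound for SASHELP.HEART; the macro variable N and the variable K_MIN are names invented here, and the result is only a starting point, since the bins PROC UNIVARIATE finally draws can differ from it:

```sas
/* Count the nonmissing WEIGHT values. */
proc sql noprint;
  select count(weight) into :n trimmed
  from sashelp.heart;
quit;

/* Terrell-Scott style lower bound on the number of bins: ceil((2n)^(1/3)). */
data _null_;
  n = &n;
  k_min = ceil((2*n)**(1/3));
  put n= k_min=;   /* written to the SAS log */
run;
```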
@michal_1407 wrote:
Thanks, I saw this paper, but still I have different number of bins in SAS than in paper and I want to understand how SAS do it.
I don't have access to the paper. Can you show your work? How did you pick the KEY cell? (Or, for that matter, how does PROC UNIVARIATE pick the KEY cell?) Or did you ask it to just use UNIFORM bins?
It looks like PROC UNIVARIATE can output a number of statistics that from their names might be related to that paper. Perhaps you could see if using those in the formula shows how it determined the number of bins.
Also note that the particular ODS output table you selected does not include empty bins, at least it did not include empty bins at front or back in the examples I tried. Is that confusing your calculations?
If you want to get the HistogramBins table, try the OUTHISTOGRAM= option:
proc univariate data=sashelp.heart;
  var weight;
  histogram weight / outhistogram=histogram;
run;
And @Rick_SAS might give you a hand.
https://blogs.sas.com/content/iml/2023/05/01/overlay-curve-histogram-sas.html
The histogram bin widths (and therefore the number of bins) are determined not only by n, the number of nonmissing values, but also by choosing bin widths that are "convenient", as described in Lewart (Algorithm 463 of the Collected Algorithms of the ACM, 1973). You can get the bin locations from Lewart's algorithm by using the GSCALE subroutine in SAS IML. For details, examples, and a discussion, see "The location of ticks in statistical graphics" on The DO Loop blog.
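A sketch of calling GSCALE on the WEIGHT variable is shown below. Feeding it the cube-root-of-2n count as the requested number of intervals is an assumption made for illustration; it is not confirmed here that this reproduces the PROC UNIVARIATE bins exactly:

```sas
proc iml;
  use sashelp.heart;
  read all var {"Weight"} into x;
  close sashelp.heart;

  x = x[loc(x ^= .)];              /* drop missing values */
  n = nrow(x);
  nincr = ceil((2*n)##(1/3));      /* assumed starting count of intervals */

  /* GSCALE returns the "nice" minimum, maximum, and increment in SCALE. */
  call gscale(scale, x, nincr);
  print scale;
quit;
```

Comparing the returned increment with the bin width in the HistogramBins table is one way to see how the "convenient" rounding changes the bin count.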