I am trying to create a histogram of a continuous variable and also see the data behind the histogram. If I understand correctly, I need to use PROC UNIVARIATE with the OUTHISTOGRAM= option. I have seen SAS documentation where this produces the percent of observations within each bin as well as a count; however, I seem to be unable to obtain the count. My code is as follows:
proc univariate data=exposure1988;
histogram experience / endpoints = 1 to 50. by 1
Well, did you do a PROC PRINT on the WORK.PRINT dataset, or look in the SAS Explorer for the dataset?
When I do this (note that I changed the name of the output dataset to avoid confusion with PROC PRINT being used to print the file):
proc univariate data=sashelp.class;
histogram height / outhistogram=work.outhist ;
run;
proc print data=work.outhist;
title 'Histogram Data Output';
run;
Then, when I look in the LISTING window, I see the following results from the PROC PRINT:
Histogram Data Output
And if you look at the description of how ENDPOINTS works, it may be that what you're specifying is incompatible with your data: (below quoted from doc)
ENDPOINTS <=values | KEY | UNIFORM>
uses histogram bin endpoints as the tick mark values for the horizontal axis and determines how to compute the bin width of the histogram bars. The values specify both the left and right endpoint of each histogram interval. The width of the histogram bars is the difference between consecutive endpoints. The procedure uses the same values for all variables.
The range of endpoints must cover the range of the data. For example, if you specify
endpoints=2 to 10 by 2
then all of the observations must fall in the intervals [2,4) [4,6) [6,8) [8,10]. You also must use evenly spaced endpoints which you list in increasing order.
KEY determines the endpoints for the data in the key cell. The initial number of endpoints is based on the number of observations in the key cell by using the method of Terrell and Scott (1985). The procedure extends the endpoint list for the key cell in either direction as necessary until it spans the data in the remaining cells.
UNIFORM determines the endpoints by using all the observations as if there were no cells. In other words, the number of endpoints is based on the total sample size by using the method of Terrell and Scott (1985).
Neither KEY nor UNIFORM apply unless you use the CLASS statement.
If you omit ENDPOINTS, the procedure uses the histogram midpoints as horizontal axis tick values. If you specify ENDPOINTS, the procedure computes the endpoints by using an algorithm (Terrell and Scott; 1985) that is primarily applicable to continuous data that are approximately normally distributed.
If you specify both MIDPOINTS= and ENDPOINTS, the procedure issues a warning message and uses the endpoints.
If you specify RTINCLUDE, the procedure includes the right endpoint of each histogram interval in that interval instead of including the left endpoint.
If you use a CLASS statement and specify ENDPOINTS, the procedure uses ENDPOINTS=KEY as the default. However if the key cell is empty, then the procedure uses ENDPOINTS=UNIFORM.
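To make the "range of endpoints must cover the range of the data" requirement concrete, here is a minimal sketch using SASHELP.CLASS. The endpoint list 50 to 75 by 5 and the dataset name WORK.OUTHIST_EP are illustrative choices of mine, not from the documentation; check the MIN and MAX from the first step against whatever endpoints you pick for your own variable.

```sas
/* Step 1: find the range of the analysis variable */
proc means data=sashelp.class min max;
  var height;
run;

/* Step 2: request evenly spaced, increasing endpoints that span that range.
   50 to 75 by 5 is an assumed list chosen for HEIGHT in SASHELP.CLASS;
   confirm it against the MIN and MAX reported by Step 1. */
proc univariate data=sashelp.class;
  histogram height / endpoints = 50 to 75 by 5
                     outhistogram = work.outhist_ep;
run;

/* Step 3: look at what was actually written to the output dataset */
proc print data=work.outhist_ep;
  title 'Histogram Data Output (explicit endpoints)';
run;
title;
```

If the MIN or MAX falls outside the list, widen the endpoints before trying anything else.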
Are there any messages in the LOG? If I understand you correctly, you DO get an output dataset (WORK.EXP), but the _COUNT_ variable is NOT in the dataset?
I'm stumped. I am running the following version of SAS:
SAS Version is: 9.02.02M2P09012009
What version of SAS are you running? If you run -exactly- my code, using SASHELP.CLASS, do you get the same results as I posted? To find out your version of SAS, submit the following statement:
%put SAS Version is: &sysvlong4;
And then look in the SAS log for the results. If you run my code for SASHELP.CLASS and do NOT get the _COUNT_ variable; or, if you run my code and do get the _COUNT_ variable, but then switch to your data and do NOT get the _COUNT_ variable, your best resource is to open a track with Tech Support.
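While waiting on Tech Support, one way to check which variables are in the output dataset, and to approximate a count if _COUNT_ is not there, might look like the sketch below. This is only a workaround under assumptions: the names WORK.OUTHIST_COUNTS, N_NONMISS, and APPROX_COUNT are made up for illustration, and it relies on _OBSPCT_ being the percent of nonmissing values of the variable that fall in each bin.

```sas
/* Create the histogram output dataset (same step as in the earlier post) */
proc univariate data=sashelp.class;
  histogram height / outhistogram=work.outhist;
run;

/* List the variables that are actually in the output dataset */
proc contents data=work.outhist varnum;
run;

/* Count the nonmissing values of the analysis variable */
proc sql noprint;
  select count(height) into :n_nonmiss
  from sashelp.class;
quit;

/* Assumed workaround: turn the bin percent back into a count.
   ROUND removes floating-point noise from the percent arithmetic. */
data work.outhist_counts;
  set work.outhist;
  approx_count = round(_obspct_ * &n_nonmiss / 100);
run;

proc print data=work.outhist_counts;
  title 'Histogram Data Output with Derived Counts';
run;
title;
```

The derived counts should add up to the number of nonmissing values; if they do not, the percent is not being computed the way this sketch assumes.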